Modern society is awash in data. By one estimate, as much information is created today in 48 hours as was produced in the previous 30,000 years. The challenge now is making all those megabytes public.
The open data movement has sprung up to disseminate data gathered by satellites and sensors, in traditional labs and elsewhere, to anyone who wants it. Open data allows scientists to build on experiments more quickly and easily, and to root out mistakes. It gives the public added confidence in the results. While the 2009 “Climategate” controversy failed to change the basic facts about global warming, it did expose the lengths to which some scientists, besieged by climate change deniers, went to keep their data secret.
This month, Elsevier, publisher of The Lancet and Cell, announced the start of reciprocal linking between its geochemistry journals and a data library managed by Columbia’s Lamont-Doherty Earth Observatory, called Integrated Earth Data Applications (IEDA) and funded by the National Science Foundation. Eventually, studies in 32 Elsevier geochemistry journals, including Earth-Science Reviews and Earth and Planetary Science Letters, will link to data sets managed by IEDA.
“Through this collaboration, more researchers will be able to access the data and analyze it,” said Kerstin Lehnert, a Lamont-Doherty scientist who heads IEDA. “We’ve built the software tools that let you pull out the data you need instantly.”
In a similar project, Elsevier started reciprocal linking earlier this year with some of its other earth science journals and the PANGAEA data library managed by Germany’s Alfred Wegener Institute for Polar and Marine Research. Though some scientists make their data available in supplementary material, these new collaborations make the data easier to use, said IJsbrand Jan Aalbersberg, vice president of content innovation at Elsevier. “Previously, data was often available but much more hidden,” he said. “This visibility allows the science to be better validated, and re-used, ultimately improving the quality of science at lower cost.”
The geochemistry data now available to Elsevier readers comes through IEDA’s EarthChem portal, co-managed by the University of Kansas. One benefit of open data, say scientists, is the ability to make comparisons across disciplines. Lamont-Doherty geochemist Terry Plank and colleagues are currently analyzing a dataset describing the amount of iron and lead in volcanic rocks collected from California. They recently compared this information with another dataset describing magmas formed in the lab, to see whether they could work out the depths and temperatures at which magmas are forming under the western U.S., where the continent is slowly tearing apart.
In a separate investigation, Plank is looking at volcanism off the coast of South America. She recently scrolled online through records of tens of thousands of rock samples from that region, but found only a few dozen that included the rocks’ lithium concentrations. It turns out that the same metal used in disposable batteries can also record the movement of sediments during subduction. That data gap has helped Plank identify a new area for fieldwork. “It has totally changed the way we discover things,” she said. “I have always felt that there are secrets buried in data; now we can find them.”
In seismology, scientists credit the shift to open data in the 1980s for major advances in understanding how earthquakes work. More than 100 universities worldwide share earthquake data to monitor faults and verify nuclear test explosions in a collaboration called Incorporated Research Institutions for Seismology (IRIS).
The hidden peaks and valleys of Earth’s seafloor have also come into sharper focus through data made public in a partnership between Google Earth and IEDA, which synthesized measurements of the seafloor gathered on hundreds of scientific cruises.
Computers have changed the way science is done. Advocates of open data say sharing can multiply the results. “We’re no longer scientists at a bench studying a rock through a microscope,” said seismologist Art Lerner-Lam, interim director of Lamont. “Modern science requires broad access to information, allowing you to examine things from different perspectives. Removing barriers to data makes innovation possible.”