Feature Story

Opening up about open access data

By Laura Naranjo

The famous detective, Sherlock Holmes, once said, “It is a capital mistake to theorize before you have all the evidence.” Although Holmes is a fictional character, his sentiment resonates with researchers in the real world. Only now, instead of having to hunt and peck for clues to decipher the world around us, scientists can rely on streams of data that pour in from field missions, instruments, and satellites.

But where do they find these data? All the data in the world do little good if scientists cannot access them. For more than twenty-five years, NSIDC has provided open access data from various remote sensing and field missions, meaning data that are freely and publicly distributed from its Web site. In contrast, some Earth science data around the world are not publicly distributed, or may only be available for a fee. By making these data available to the research community, NSIDC has promoted cryospheric research and given scientists a growing time series that helps us better understand current environmental conditions.

Understanding the pros and cons

MODIS satellite image showing sea ice off the coast of Greenland
The NASA Moderate Resolution Imaging Spectroradiometer (MODIS) sensor captured this view of sea ice off the coast of Greenland in August 2014. NSIDC distributes a variety of MODIS data sets free of charge, which help scientists study sea ice and snow cover across the globe. (NASA image courtesy Jeff Schmaltz, LANCE/EOSDIS MODIS Rapid Response Team at NASA GSFC. Thanks to Claire Parkinson, Kelly Brunt, and Walt Meier for image-interpretation help.)

Allen Pope, a postdoctoral research associate at NSIDC, featured open access polar data in a recently published article. Following a presentation at a workshop for early career scientists, he and his colleagues published a comprehensive list of open access data sources, realizing it could be a valuable resource for the cryospheric research community as a whole. Open access data like those provided by NSIDC are widely used by researchers in many fields, and can be critical for emerging and early career scientists who likely do not have the resources to purchase data, or may not even know which data their research will require.

“If you’re working on a master’s or doctoral project,” Pope said, “you might not have funding for data, especially if you didn’t know ahead of time that you’re going to need it.” At the same time, Pope cautioned against one of the pitfalls of open access data. “It’s so easy just to grab all this open access data. ‘Oh, there’s tons of it. Let’s just grab it all and get something,’” Pope said. He encouraged new users to become familiar with open access data before trying to extract meaningful information from them. “Depending on which data source you’re using, it can be quite a struggle just to get the data into a format that you can really use and interact with. Every sort of data has a different format, has a different structure,” Pope said.

Getting to know the data

Fortunately, many open access data products are offered with various amounts of pre-processing, or through tools that make it easy to derive a specific subset. These make the data easier to use and apply, especially for researchers new to that data stream. “With many of the larger missions, a lot of the pre-processing is already done for you,” Pope said. For instance, NSIDC offers tools to help researchers find and subset data from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS) and Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E) sensors, both of which provide open access data.

Scientists can use these tools to download custom grids or map projections in a variety of spatial and temporal resolutions, bypassing the need to process the raw data themselves. Not every open access product comes with the benefit of pre-processing, however. Smaller, investigator-led efforts also often make their data freely available, but are not always funded to process those data in any way. Yet these data still fill a need, and are valuable to other researchers, said Pope. These data may just require more effort on the part of the researcher—or on the part of organizations such as NSIDC, which often ensure data meet certain requirements before archiving or distributing them.

Tools for the future

Global Land Ice Measurements from Space Initiative (GLIMS) image showing glaciers
The Global Land Ice Measurements from Space Initiative (GLIMS) database, hosted at NSIDC, helps researchers map glacier extent and change over time. Although the database requires free registration, registered users have access to data from a variety of remote sensing missions. (NASA Earth Observatory image by Jesse Allen and Robert Simmon, using Landsat data from the USGS Earth Explorer. Map by Robert Simmon, using data from the Randolph Glacier Inventory and Natural Earth.)

Pope and his colleagues see a bright future for open access data. Many organizations like NSIDC are not only helping rescue historic data, but also making these data available free of charge. Likewise, NASA and other agencies plan to make the data from upcoming satellite missions open access. And as the streams of data continue to flow, Pope stressed the increased need for metadata, or data about the data. Metadata describe the data parameters and levels of processing, and make the data more easily searchable. “It is really critical to know what happened in the processing before you got ahold of it so you know what you’re seeing in the imagery,” Pope said.

“Obviously we at NSIDC are really familiar with that,” Pope said. “And we realize the amount of startup time it can take to get some really interesting information out of a scene.” Today’s Earth scientists may not be solving crimes like Sherlock Holmes did, but they engage in a different sort of detective work, using open access data to solve the mysteries of the natural world. NSIDC has remained at the forefront of archiving and distributing open access data, offering comprehensive metadata as well as tools that help researchers understand the data they have chosen to work with, as Pope recommends. “It’s important just to play around with the data, get a sense of it,” he said. “Make sure you understand them at least from a slightly more intuitive standpoint what’s happening. And that should help get better results from the science as well.”

Reference

Pope, A., W. G. Rees, A. J. Fox, and A. Fleming. 2014. Open access data in polar and cryospheric remote sensing. Remote Sensing 6: 6183-6220, doi:10.3390/rs6076183.