Spotlight

Harmonizing data in the Earthdata Cloud

By Audrey Payne

In 2022, the NASA National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC) began migrating its data to the NASA Earthdata Cloud, starting with data sets from the Ice, Cloud, and land Elevation Satellite (ICESat) and Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) collections. NASA Earthdata Cloud is a commercial cloud environment hosted by Amazon Web Services (AWS) that houses NASA’s archive of Earth observations alongside related tools and services from the 12 NASA DAACs, including the NSIDC DAAC. The NASA DAACs are responsible for processing, archiving, and distributing data from NASA’s Earth-observing satellites and missions.

The NASA Earthdata Cloud is a scalable solution to the challenges data users face in working with ever-larger volumes of data as NASA missions continue to expand. Since it began migrating data to the cloud, the NSIDC DAAC has developed resources and tools to ensure an easy transition for data users. The newest of those tools is NASA Earthdata Harmony data transformation services, or Harmony for short, which the NSIDC DAAC has adopted for ICESat-2 data.

“At the NSIDC DAAC, we really work to understand user needs and what is needed to make it easier to access and work with data,” said Amy Steiker, the tools and services portfolio manager for the NSIDC DAAC. “We understand the value that Harmony provides for the ICESat-2 community, so we are doing the work to make these value-added services available.”

Harmony is a framework that provides data transformation services for many NASA Earthdata data sets across the NASA DAACs. These services include spatial, temporal, and variable subsetting; reformatting; combining multiple files into one; reprojection; and more.

NASA Earthdata Harmony data transformation services, or Harmony for short, allow data users to define an area of interest to filter and crop data. In this example, data from the Soil Moisture Active Passive (SMAP) radiometer are filtered and subset to the borders of the state of Colorado. Credit: Agnieszka Gautier, NASA National Snow and Ice Data Center Distributed Active Archive Center

Moving data services over to the NASA Earthdata Cloud

While similar services currently run on servers on-premises, or on-site, at the NSIDC DAAC, adopting Harmony is the first step in transitioning users toward the NASA Earthdata Cloud option. “Harmony is the solution for our users to continue to request these services in the cloud,” said Steiker.

For at least three months, services will be available both on-premises and via Harmony. On-premises services will eventually be retired, however, so users are encouraged to move to Harmony as soon as possible.

For this first release of Harmony services for data managed by the NSIDC DAAC, the DAAC prioritized spatial and temporal subsetting for ICESat-2 Level 2 and 3A data. A metrics-based assessment found these to be the most valuable of the on-premises services at the NSIDC DAAC. The NSIDC DAAC is also actively testing and validating services for other NASA missions and will add Harmony services for those missions in the future.

Meeting the needs of data users

But what can Harmony be used for?

“Let’s say the data is not in the format or projection the user is used to working with, or they want to harmonize the data to match the structure, format, or area of interest with another data set,” said Steiker. “This tool is really useful for those types of harmonizations.”

For example, imagine a researcher is studying changes in the height of a glacier over time. Because they are looking at a specific glacier, they would have a small study area and might be comparing ICESat-2 altimetry data with data they collect in the field. This researcher could use Harmony to crop the altimetry data so that they download or access only the data within their area of interest rather than the entire data file. They could also request data for only the timeframe they are studying.

“ICESat-2 data are big and cumbersome, and the file sizes can be quite large,” said Steiker. “So this really reduces the data volume that is required for data users to work with, and that is a huge benefit. And the workflow is very similar to what people are used to if they are using the on-premises services, so it should be an easy transition.”

Using Harmony for ICESat-2 data

To take advantage of Harmony services, data users have two main options. The first is to access Harmony through NASA Earthdata Search. Within this tool, users can request data by drawing a rectangular bounding box, uploading a shapefile that outlines a spatial area of interest, or selecting a timeframe of interest. The data will then be processed on the backend and made available as a downloadable file or directly in the cloud for 30 days after the order is submitted. Users can also have data sent directly to their own Amazon Simple Storage Service (S3) bucket, the location where they are working within AWS.

The other main option for accessing Harmony is Harmony-py, a Python library developed by NASA for requesting Harmony services. This library is intended for scientists who wish to use Harmony within a Python-based workflow and for machine-to-machine communication within larger Python applications.
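As a rough illustration of what a Harmony-py request for a spatial and temporal subset might look like, here is a minimal sketch. The helper names `build_subset_args` and `submit_subset` are hypothetical and written only for this example, and the collection identifier is left for the user to supply. The sketch assumes the `harmony-py` package is installed and that Earthdata Login credentials are stored in a `.netrc` file; the tutorial mentioned below remains the authoritative walkthrough.

```python
import datetime as dt


def build_subset_args(west, south, east, north, start, stop):
    """Validate a bounding box and time range; return them as a dict.

    Hypothetical helper for this sketch, not part of harmony-py.
    """
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south < north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    if start >= stop:
        raise ValueError("start must be earlier than stop")
    return {
        "bbox": (west, south, east, north),
        "temporal": {"start": start, "stop": stop},
    }


def submit_subset(collection_id, args, download_dir="."):
    """Submit a subset request to Harmony and download the results.

    collection_id is an assumption: the identifier of the ICESat-2
    collection you want, which you must look up yourself.
    """
    # Imported here so the helper above works without harmony-py installed.
    from harmony import BBox, Client, Collection, Request

    client = Client()  # assumes Earthdata Login credentials in ~/.netrc
    request = Request(
        collection=Collection(id=collection_id),
        spatial=BBox(*args["bbox"]),  # order: west, south, east, north
        temporal=args["temporal"],
    )
    if not request.is_valid():
        raise ValueError("Harmony request failed local validation")

    job_id = client.submit(request)
    client.wait_for_processing(job_id)  # blocks until the job finishes
    futures = client.download_all(job_id, directory=download_dir)
    return [future.result() for future in futures]  # local file paths


# Arbitrary example box and dates, chosen only for illustration.
args = build_subset_args(
    -50.0, 69.0, -49.0, 69.5,
    dt.datetime(2021, 1, 1), dt.datetime(2021, 4, 1),
)
```

Calling `submit_subset("<your collection identifier>", args)` would then send the request; that step needs network access and a valid Earthdata Login account, so it is left out of the snippet above.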

To learn more about using Harmony with ICESat-2 data, users should check out the new Python Jupyter tutorial, “Subsetting ICESat-2 Data Using NASA Harmony,” led by Andy Barrett of the NSIDC DAAC, available in the NASA Earthdata Cloud Cookbook.

Data users are also welcome to reach out to the NSIDC User Services Office with any questions.