GRA²PES Greenhouse Gas and Air Quality Species

Monthly, 0.036 degree resolution emissions of carbon dioxide (CO₂), carbon monoxide (CO), nitrogen oxides (NOₓ), sulfur dioxide (SO₂), and particulate matter (PM2.5) for the year 2021 over the Contiguous United States from the Greenhouse gas And Air Pollutants Emissions System (GRA²PES)
Author

Siddharth Chaudhary, Paridhi Parajuli

Published

August 30, 2024

Access this Notebook

You can launch this notebook in the US GHG Center JupyterHub by clicking the link below. If you are a new user, you should first sign up for the hub by filling out this request form and providing the required information.

Access the GRA²PES Greenhouse Gas and Air Quality Species notebook in the US GHG Center JupyterHub.

Data Summary and Application

  • Spatial coverage: Contiguous United States
  • Spatial resolution: 0.036° x 0.036°
  • Temporal extent: January 2021 - December 2021
  • Temporal resolution: Monthly
  • Unit: Metric tons per kilometer squared per month (tonne/km²/month) for carbon dioxide (CO₂), carbon monoxide (CO), nitrogen oxides (NOₓ), sulfur dioxide (SO₂), and particulate matter (PM2.5)
  • Utility: Climate Research

For more, visit the GRA²PES Greenhouse Gas and Air Quality Species data overview page.

Approach

  1. Identify available dates and temporal frequency of observations for the given collection using the GHGC API /stac endpoint. The collection processed in this notebook is the GRA²PES Greenhouse Gas and Air Quality Species data product.
  2. Pass the STAC item into the raster API /collections/{collection_id}/items/{item_id}/tilejson.json endpoint.
  3. Using folium.plugins.DualMap, we will visualize two tiles (side-by-side), allowing us to compare time points.
  4. After the visualization, we will perform zonal statistics for a given polygon.

About the Data

GRA2PES Greenhouse Gas and Air Quality Species

The Greenhouse gas And Air Pollutants Emissions System (GRA2PES) dataset at the GHG Center is an aggregated, regridded, monthly high-resolution (0.036° x 0.036°) data product with emissions of both greenhouse gases and air pollutants developed in a consistent framework. The dataset contains emissions over the contiguous United States covering major anthropogenic sectors, including energy, industrial fuel combustion and processes, commercial and residential combustion, oil and gas production, on-road and off-road transportation, etc. (see Table 1 in the Scientific Details section below for a full sector list). Fossil fuel CO2 (ffCO2) emissions are developed along with those of air pollutants including CO, NOx, SOx, and PM2.5 with consistency in spatial and temporal distributions. Emissions by sectors are grouped into point and area sources, reported as column totals in units of metric tons per km2 per month. Spatial-temporal surrogates are developed to distribute CO2 emissions to grid cells to keep consistency between greenhouse gases and air quality species. The current version of GRA2PES is for 2021. Long-term emissions and more greenhouse gas species (e.g., methane) are under development and will be added in the future.

For more information regarding this dataset, please visit the GRA2PES Greenhouse Gas and Air Quality Species, Version 1 data overview page.

Terminology

Navigating data via the GHGC API, you will encounter terminology that is different from browsing in a typical filesystem. We’ll define some terms here which are used throughout this notebook.

  • catalog: All datasets available at the /stac endpoint
  • collection: A specific dataset, e.g. GRA2PES
  • item: One granule in the dataset, e.g. one monthly file of greenhouse gas emissions
  • asset: A variable available within the granule, e.g. CO, CO2, or NOx emissions
  • STAC API: SpatioTemporal Asset Catalog API; the endpoint for fetching metadata about available datasets
  • Raster API: The endpoint for fetching the data itself, for imagery and statistics
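
To make the nesting of these terms concrete, here is a minimal sketch that models the hierarchy with plain Python dictionaries. It is a hand-written stand-in, not data fetched from the API: the collection ID, item ID, and the "co2" asset key appear elsewhere in this notebook, while the "co" and "nox" keys and the helper name list_assets are assumptions made for illustration.

```python
# A hand-written stand-in for the STAC hierarchy (nothing here is fetched from the API).
# catalog -> collection -> item -> assets
catalog = {
    "gra2pes-ghg-monthgrid-v1": {                      # collection: one dataset
        "gra2pes-ghg-monthgrid-v1-202101": {           # item: one monthly granule
            # "co2" is used later in this notebook; "co" and "nox" are assumed keys
            "assets": ["co2", "co", "nox"],
        },
    },
}


def list_assets(catalog, collection_id, item_id):
    """Return the sorted asset keys for one item in one collection."""
    return sorted(catalog[collection_id][item_id]["assets"])


print(list_assets(catalog, "gra2pes-ghg-monthgrid-v1", "gra2pes-ghg-monthgrid-v1-202101"))
```

With the real API, the same walk is done with pystac_client objects rather than dictionaries, as shown in the cells that follow.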

Install the Required Libraries

Required libraries are pre-installed on the GHG Center Hub. If you need to run this notebook elsewhere, please install them with this line in a code cell:

%pip install requests folium rasterstats pystac_client pandas matplotlib --quiet

# Import the following libraries
# For fetching from the Raster API
import requests
# For making maps
import folium
import folium.plugins
from folium import Map, TileLayer
# For talking to the STAC API
from pystac_client import Client
# For working with data
import pandas as pd
# For making time series
import matplotlib.pyplot as plt
# For formatting date/time data
import datetime
# Custom functions for working with GHGC data via the API
import ghgc_utils

Query the STAC API

STAC API Collection Names

To fetch the dataset from the STAC API, first define its STAC collection ID as a variable. The collection ID, also known as the collection name, for the GRA2PES Greenhouse Gas and Air Quality Species dataset is gra2pes-ghg-monthgrid-v1.*

*You can find the collection name of any dataset on the GHGC data portal by navigating to the dataset landing page within the data catalog. The collection name is the last portion of the dataset landing page’s URL, and is also listed in the pop-up box after clicking “ACCESS DATA.”
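
As a small illustration of "the last portion of the URL", the sketch below pulls the final path segment out of a landing page address. The URL used here is a made-up placeholder (the real landing page address is not given in this notebook), and collection_name_from_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def collection_name_from_url(landing_page_url):
    """Return the last non-empty path segment of a URL."""
    path = urlsplit(landing_page_url).path
    return path.rstrip("/").split("/")[-1]


# Placeholder URL for illustration only; substitute the real landing page address.
example_url = "https://example.org/data-catalog/gra2pes-ghg-monthgrid-v1/"
print(collection_name_from_url(example_url))
```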

# Provide STAC and RASTER API endpoints
STAC_API_URL = "https://earth.gov/ghgcenter/api/stac"
RASTER_API_URL = "https://earth.gov/ghgcenter/api/raster"

# Use the collection name exactly as it appears in the STAC catalog.
# Name of the collection for GRA2PES Greenhouse Gas and Air Quality Species, Version 1.
collection_name = "gra2pes-ghg-monthgrid-v1"
# Using PySTAC client
# Fetch the collection from the STAC API using the appropriate endpoint
# The 'pystac_client' library makes the HTTP request for us
catalog = Client.open(STAC_API_URL)
collection = catalog.get_collection(collection_name)

# Print the properties of the collection to the console
collection

Examining the contents of our collection under the temporal variable, we see that the data is available from January 2021 to December 2021. By looking at the dashboard:time density, we observe that the data is periodic with monthly time density.
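
If you prefer to read these two properties programmatically rather than from the rendered collection, a sketch along the following lines may help. It operates on a plain dictionary such as the one returned by collection.to_dict(). The key name "dashboard:time_density" and the sample values below are assumptions for illustration, and summarize_temporal is a hypothetical helper.

```python
def summarize_temporal(collection_dict):
    """Return (start, end, time_density) from a STAC collection dictionary."""
    # STAC collections store their temporal extent as a list of [start, end] intervals
    start, end = collection_dict["extent"]["temporal"]["interval"][0]
    # Assumed key name for the time density field; returns None if absent
    density = collection_dict.get("dashboard:time_density")
    return start, end, density


# Illustrative stand-in for collection.to_dict(); values are not fetched from the API.
sample_collection = {
    "id": "gra2pes-ghg-monthgrid-v1",
    "extent": {
        "temporal": {"interval": [["2021-01-01T00:00:00Z", "2021-12-31T00:00:00Z"]]}
    },
    "dashboard:time_density": "month",
}

print(summarize_temporal(sample_collection))
```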

items = list(collection.get_items())  # Convert the iterator to a list
print(f"Found {len(items)} items")
Found 12 items
# Examine the first item in the collection
# Keep in mind that a list starts from 0, 1, 2... therefore items[0] is referring to the first item in the list/collection
items[0]
# Restructure our items into a dictionary where keys are the datetime items
# Then we can query more easily by date/time, e.g. "2021-02"
items_dict = {item.properties["start_datetime"][:7]: item for item in collection.get_items()}
# Before we go further, let's pick which asset to focus on for the remainder of the notebook.
# For now, we'll look at:
asset_name = "co2"
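
The dictionary keys above are simply the first seven characters ("YYYY-MM") of each item's start_datetime. The sketch below shows that slicing on its own, plus a lookup helper that reports the available months when a key is missing. The timestamps and the placeholder item values are invented for illustration, and month_key and get_item_for_month are hypothetical helper names.

```python
def month_key(start_datetime):
    """Return the 'YYYY-MM' prefix of an ISO 8601 timestamp string."""
    return start_datetime[:7]


def get_item_for_month(items_by_month, month):
    """Look up an item by 'YYYY-MM', with a readable error if it is missing."""
    if month not in items_by_month:
        available = ", ".join(sorted(items_by_month))
        raise KeyError(f"No item for {month}; available months: {available}")
    return items_by_month[month]


# Invented timestamps and placeholder values standing in for STAC items
sample_items = {
    month_key("2021-01-01T00:00:00+00:00"): "january-item",
    month_key("2021-07-01T00:00:00+00:00"): "july-item",
}

print(sorted(sample_items))
print(get_item_for_month(sample_items, "2021-07"))
```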

Creating Maps Using Folium

You will now explore changes in CO2 emissions for two different dates/times. You will visualize the outputs on a map using folium.

Fetch Imagery Using Raster API

Here we get information from the Raster API which we will add to our map in the next section.

# Specify two date/times that you would like to visualize, using the format of items_dict.keys()
dates = ["2021-01","2021-07"]

Below, we use some statistics of the raster data to set upper and lower limits for our color bar. These are saved as the rescale_values, and will be passed to the Raster API in the following step(s).

# Extract collection name and item ID for the first date
first_date = items_dict[dates[0]]
collection_id = first_date.collection_id
item_id = first_date.id
# Select relevant asset (CO2 emissions)
# Named 'asset' rather than 'object' to avoid shadowing the Python built-in
asset = first_date.assets[asset_name]
raster_bands = asset.extra_fields.get("raster:bands", [{}])
# Print raster bands' information
print(raster_bands)
[{'scale': 1.0, 'nodata': -9999.0, 'offset': 0.0, 'sampling': 'area', 'data_type': 'float32', 'histogram': {'max': 27284.111328125, 'min': 0.0, 'count': 11, 'buckets': [334564, 276, 81, 37, 17, 7, 2, 6, 3, 1]}, 'statistics': {'mean': 27.047150098210714, 'stddev': 271.37551127034317, 'maximum': 27284.111328125, 'minimum': 0.0, 'valid_percent': 73.84708309819413}}]
# Use statistics to generate appropriate colorbar range
rescale_values = {
    "max": raster_bands[0]["statistics"]["mean"] + 2.5*raster_bands[0]["statistics"]["stddev"],
    "min": raster_bands[0]["statistics"]["minimum"],
}

print(rescale_values)
{'max': 705.4859282740687, 'min': 0.0}
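
The upper limit printed above is the mean plus 2.5 standard deviations, which clips the long tail of very high values (the reported maximum is about 27284) so that typical values remain distinguishable on the map. A minimal sketch of the same arithmetic, using the statistics printed earlier, is shown below. The function name compute_rescale and its n_std parameter are illustrative and not part of ghgc_utils.

```python
def compute_rescale(statistics, n_std=2.5):
    """Return colorbar limits: minimum, and mean + n_std standard deviations."""
    return {
        "max": statistics["mean"] + n_std * statistics["stddev"],
        "min": statistics["minimum"],
    }


# Statistics copied from the raster band information printed above
stats = {
    "mean": 27.047150098210714,
    "stddev": 271.37551127034317,
    "minimum": 0.0,
}

limits = compute_rescale(stats)
print(limits)
```

Raising or lowering n_std is one way to trade detail at low values against saturation at high values.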

Now, you will pass the item ID, collection name, asset name, and the rescale values to the Raster API endpoint, along with a colormap. This step is done twice, once for each date/time you will visualize, and tells the Raster API which collection, item, and asset you want to view, specifying the colormap and colorbar ranges to use for visualization. The API returns a JSON with information about the requested image. Each image will be referred to as a tile.

# Choose a colormap for displaying the data
# Make sure to capitalize per Matplotlib conventions
# For more information on Colormaps in Matplotlib, please visit https://matplotlib.org/stable/users/explain/colors/colormaps.html
color_map = "Spectral_r" 
# Make a GET request to retrieve information for the date mentioned below
date_1_tile = requests.get(
    f"{RASTER_API_URL}/collections/{collection_id}/items/{item_id}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map.lower()}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}"
).json()

# Print the properties of the retrieved granule to the console
date_1_tile
{'tilejson': '2.2.0',
 'version': '1.0.0',
 'scheme': 'xyz',
 'tiles': ['https://earth.gov/ghgcenter/api/raster/collections/gra2pes-ghg-monthgrid-v1/items/gra2pes-ghg-monthgrid-v1-202101/tiles/WebMercatorQuad/{z}/{x}/{y}@1x?assets=co2&color_formula=gamma+r+1.05&colormap_name=spectral_r&rescale=0.0%2C705.4859282740687'],
 'minzoom': 0,
 'maxzoom': 24,
 'bounds': [-137.3143, 18.173376, -58.58229999999702, 52.229376000001295],
 'center': [-97.94829999999851, 35.20137600000065, 0]}
# Repeat the above for your second date/time
# Note: this cell recomputes rescale_values from the second item's histogram
# min/max, so the two tiles are rendered with different color ranges.
# To compare the two dates on one scale, skip the recomputation below and
# reuse the rescale_values calculated for the first date.
second_date = items_dict[dates[1]]

# Extract collection name and item ID
collection_id = second_date.collection_id
item_id = second_date.id

# Named 'asset' rather than 'object' to avoid shadowing the Python built-in
asset = second_date.assets[asset_name]
raster_bands = asset.extra_fields.get("raster:bands", [{}])
rescale_values = {
    "max": raster_bands[0].get("histogram", {}).get("max"),
    "min": raster_bands[0].get("histogram", {}).get("min"),
}

print(rescale_values)

date_2_tile = requests.get(
    f"{RASTER_API_URL}/collections/{collection_id}/items/{item_id}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map.lower()}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}"
).json()

# Print the properties of the retrieved granule to the console
date_2_tile
{'max': 31301.15625, 'min': 0.0}
{'tilejson': '2.2.0',
 'version': '1.0.0',
 'scheme': 'xyz',
 'tiles': ['https://earth.gov/ghgcenter/api/raster/collections/gra2pes-ghg-monthgrid-v1/items/gra2pes-ghg-monthgrid-v1-202107/tiles/WebMercatorQuad/{z}/{x}/{y}@1x?assets=co2&color_formula=gamma+r+1.05&colormap_name=spectral_r&rescale=0.0%2C31301.15625'],
 'minzoom': 0,
 'maxzoom': 24,
 'bounds': [-137.3143, 18.173376, -58.58229999999702, 52.229376000001295],
 'center': [-97.94829999999851, 35.20137600000065, 0]}
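
The requests above assemble the query string by hand inside f-strings. As an alternative sketch, the same URL can be built from a dictionary of parameters with the standard library, which takes care of encoding spaces and commas. The function name build_tilejson_url is hypothetical; the endpoint path and parameter names are the ones used in the cells above.

```python
from urllib.parse import urlencode

RASTER_API_URL = "https://earth.gov/ghgcenter/api/raster"


def build_tilejson_url(collection_id, item_id, asset_name, colormap_name,
                       vmin, vmax, raster_api_url=RASTER_API_URL):
    """Build a tilejson.json request URL for one asset of one item."""
    params = {
        "assets": asset_name,
        # urlencode turns the spaces into '+', matching 'gamma+r+1.05' above
        "color_formula": "gamma r 1.05",
        "colormap_name": colormap_name.lower(),
        # urlencode turns the comma into '%2C'
        "rescale": f"{vmin},{vmax}",
    }
    base = f"{raster_api_url}/collections/{collection_id}/items/{item_id}/tilejson.json"
    return f"{base}?{urlencode(params)}"


url = build_tilejson_url(
    "gra2pes-ghg-monthgrid-v1",
    "gra2pes-ghg-monthgrid-v1-202101",
    "co2",
    "Spectral_r",
    0.0,
    705.4859282740687,
)
print(url)
```

Passing the same dictionary as requests.get(base_url, params=params) should produce an equivalent request; the explicit version here just makes the encoded URL visible.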

Generate Map

# Initialize the map, specifying the center of the map and the starting zoom level.
# 'folium.plugins' allows mapping side-by-side via 'DualMap'
# Map is centered on the position specified by "location=(lat,lon)"
map_ = folium.plugins.DualMap(location=(34, -118), zoom_start=6)

# Define the first map layer (January 2021)
map_layer_1 = TileLayer(
    tiles=date_1_tile["tiles"][0], # Path to retrieve the tile
    attr="GHG", # Set the attribution
    opacity=0.8, # Adjust the transparency of the layer
    name=f"GRA2PES, {dates[0]}",
    overlay=True
)

# Add the first layer to the Dual Map
map_layer_1.add_to(map_.m1)

# Define the second map layer (July 2021)
map_layer_2 = TileLayer(
    tiles=date_2_tile["tiles"][0], # Path to retrieve the tile
    attr="GHG", # Set the attribution
    opacity=0.8, # Adjust the transparency of the layer
    name=f"GRA2PES, {dates[1]}",
    overlay=True
)

# Add the second layer to the Dual Map
map_layer_2.add_to(map_.m2)

# Add a layer control to switch between map layers
folium.LayerControl(collapsed=False).add_to(map_)

# Add colorbar
# First we'll rescale our data to make nicer labels
re_rescale_values = {
    "min": rescale_values["min"]/1e4,
    "max": rescale_values["max"]/1e4
}
# We can use 'generate_html_colorbar' from the 'ghgc_utils' module 
# to create an HTML colorbar representation.
legend_html = ghgc_utils.generate_html_colorbar(
    color_map,
    re_rescale_values,
    label=f'{items[0].assets[asset_name].title} (10^4 tonne/km2/month)'
)

# Add colorbar to the map
map_.get_root().html.add_child(folium.Element(legend_html))

# Visualize the Dual Map
map_

Observe higher CO2 emissions in January than in July in urban areas. Which sectors do you think might contribute to this seasonal difference?
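
Before reading too much into a visual difference between the two panels, it can help to confirm that both tiles were rendered with the same color range. The sketch below parses the rescale parameter out of the two tile URLs printed earlier in this notebook. In those printed outputs the ranges differ (an upper limit of about 705 for January versus about 31301 for July), so the same color does not represent the same emission value in both panels. The helper name rescale_range is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def rescale_range(tile_url):
    """Return the (low, high) rescale range encoded in a tile URL."""
    query = parse_qs(urlsplit(tile_url).query)
    low, high = query["rescale"][0].split(",")
    return float(low), float(high)


# Tile URLs copied from the tilejson outputs printed earlier in this notebook
january_url = (
    "https://earth.gov/ghgcenter/api/raster/collections/gra2pes-ghg-monthgrid-v1"
    "/items/gra2pes-ghg-monthgrid-v1-202101/tiles/WebMercatorQuad/{z}/{x}/{y}@1x"
    "?assets=co2&color_formula=gamma+r+1.05&colormap_name=spectral_r"
    "&rescale=0.0%2C705.4859282740687"
)
july_url = (
    "https://earth.gov/ghgcenter/api/raster/collections/gra2pes-ghg-monthgrid-v1"
    "/items/gra2pes-ghg-monthgrid-v1-202107/tiles/WebMercatorQuad/{z}/{x}/{y}@1x"
    "?assets=co2&color_formula=gamma+r+1.05&colormap_name=spectral_r"
    "&rescale=0.0%2C31301.15625"
)

same_scale = rescale_range(january_url) == rescale_range(july_url)
print(rescale_range(january_url), rescale_range(july_url), same_scale)
```

If same_scale is False, requesting both tiles with one shared set of rescale values gives a like-for-like comparison.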

Summary

In this notebook we have successfully explored, analyzed, and visualized the STAC collection for the GRA2PES Greenhouse Gas and Air Quality Species, Version 1 dataset.

  1. Install and import the necessary libraries
  2. Fetch the collection from STAC collections using the appropriate endpoints
  3. Count the number of existing granules within the collection
  4. Map and compare the total CO₂ emissions for two different months

If you have any questions regarding this user notebook, please contact us using the feedback form.
