Interactive Session: Human Anthropogenic Emissions (EPA)

About the Data

The gridded EPA U.S. anthropogenic methane greenhouse gas inventory (gridded GHGI) includes spatially disaggregated (0.1 deg x 0.1 deg or approximately 10 x 10 km resolution) maps of annual anthropogenic methane emissions (for the contiguous United States (CONUS), consistent with national annual U.S. anthropogenic methane emissions reported in the U.S. EPA Inventory of U.S. Greenhouse Gas Emissions and Sinks (U.S. GHGI). This V2 Express Extension dataset contains methane emissions provided as fluxes, in units of molecules of methane per square cm per second, for over 25 individual emission source categories, including those from agriculture, petroleum and natural gas systems, coal mining, and waste. The data have been converted from their original NetCDF format to Cloud-Optimized GeoTIFF (COG) for use in the US GHG Center, thereby enabling user exploration of spatial anthropogenic methane emissions and their trends.

Requirements

  • Set up Python Environment - See setup_instructions.md in the /setup/ folder

Learning Objectives

  • How to use U.S. GHG Center STAC Catalog to access U.S. Gridded Anthropogenic Methane Emissions Inventory data
  • How to visualize a dataset using folium and perform zonal statistics

Approach

  1. Identify available dates and temporal frequency of observations for the given collection using the GHGC API /stac endpoint. The collection processed in this notebook is the gridded methane emissions data product.
  2. Pass the STAC item into the raster API /stac/tilejson.jsonendpoint.
  3. Using folium.plugins.DualMap, we will visualize two tiles (side-by-side), allowing us to compare time points.
  4. After the visualization, we will perform zonal statistics for a given polygon.

Installing the Required Libraries

Required libraries are

# Import required libraries

# import earthaccess
import os
import warnings
import requests
import pandas as pd
os.environ['USE_PYGEOS'] = '0'
import geopandas
import folium
import folium.plugins
import seaborn as sns
import glob
import numpy as np
import matplotlib.pyplot as plt
import branca.colormap as cm

from folium import Map, TileLayer 
from branca.element import Figure
from pystac_client import Client 
from pyproj import Geod
from shapely import wkt
from datetime import datetime

Define the Spatial Region of Interest

For this example, our spatial region of interest (ROI) will be the Permian Basin, USA. The Permian Basin is a regional depression expanding from the eastern side of New Mexico to West Texas, covering roughly 75 thousand square miles of land. This region is known to be “the largest hydrocarbon-producing basins in the United States and the world” according to the U.S. Energy Information Administration. Consequently, the Permian Basin is prone to excessive air pollution and a suitable site for monitoring anthropogenic methane emissions.

# Loading Permian Basin shape file as a ROI
# User can pass any json or shape file here
# The "geopandas" library makes working with geospatial data easier by facilitating spatial operations on geometric types  
data_dir = f"{os.getenv('HOME')}/shared/data"
permian_basin = geopandas.read_file(f'{data_dir}/use-case-3/Permian.zip')

Querying the STAC API

Search for U.S. Gridded Anthropogenic Methane Emissions Inventory Data

# Provide STAC and RASTER API endpoints
STAC_API_URL = "http://ghg.center/api/stac"
RASTER_API_URL = "https://ghg.center/api/raster"

# Please use the collection name similar to the one used in STAC collection.
# Name of the collection for gridded methane dataset. 
collection_name = "epa-ch4emission-yeargrid-v2express"
# Fetching the collection from STAC collections using appropriate endpoint.
collection = requests.get(f"{STAC_API_URL}/collections/{collection_name}").json()
collection

Examining the contents of our collection under the temporal variable, we see that the data is available from January 2012 to December 2020. By looking at the dashboard:time density, we observe that the periodic frequency of these observations is yearly.

# Create a function that would search for the number of items in above data collection in the STAC API
def get_item_count(collection_id):
    count = 0
    items_url = f"{STAC_API_URL}/collections/{collection_id}/items"

    while True:
        response = requests.get(items_url)

        if not response.ok:
            print("error getting items")
            exit()

        stac = response.json()
        count += int(stac["context"].get("returned", 0))
        next = [link for link in stac["links"] if link["rel"] == "next"]

        if not next:
            break
        items_url = next[0]["href"]

    return count
# Check total number of items available
number_of_items = get_item_count(collection_name)
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]
print(f"Found {len(items)} items")
# Examining the first item in the collection
# Keep in mind that a list starts from 0, 1, 2,... therefore ‘[0]’ is referring to the first item in the list/collection 
items[0]

Opening and Exploring Methane Emissions Using the Raster API

In this notebook, we will explore the temporal impacts of methane emissions. We will visualize the outputs on a map using folium.

# To access the year value from each item more easily, this will let us query more explicity by year and month (e.g., 2020-02)
items = {item["properties"]["datetime"][:7]: item for item in items} 

# Set the asset values to the appropriate parameters
asset_name_1 = "production-ngs"
asset_name_2 = 'production-ps'
asset_name_3 = 'total-methane'
# Below, we are entering the minimum and maximum values to provide our upper and lower bounds
rescale_values = {'max': 20,'min': 0}
# Examining the items in the collection
items

Now, we will pass the item id, collection name, and rescaling_factor to the Raster API endpoint. We will do this twice, once for January 2018 and again for January 2012, so that we can visualize each event independently.

# Set a colormap for the granule
# Please refer to matplotlib library if you'd prefer choosing a different color ramp (https://matplotlib.org/stable/users/explain/colors/colormaps.html)
color_map = "rainbow" 

# Petroleum-Production Methane Emissions

# Extract and display the January 2018 tile using the appropriate ID, colormap, rescale values, and datetime (items['2018-01']) 

january_2018_tile_ps = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2018-01']['collection']}&item={items['2018-01']['id']}"
    f"&assets={asset_name_1}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()

# Print the information associated with this tile
january_2018_tile_ps

january_2012_tile_ps = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2012-01']['collection']}&item={items['2012-01']['id']}"
    f"&assets={asset_name_1}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
january_2012_tile_ps
# Natural Gas-Production Methane Emissions

color_map = "rainbow" # please select the color ramp from matplotlib library.
january_2018_tile_ngs = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2018-01']['collection']}&item={items['2018-01']['id']}"
    f"&assets={asset_name_2}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
january_2018_tile_ngs

january_2012_tile_ngs = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2012-01']['collection']}&item={items['2012-01']['id']}"
    f"&assets={asset_name_2}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
january_2012_tile_ngs
# Total Methane Emissions

color_map = "rainbow" # please select the color ramp from matplotlib library.
january_2018_tile_tm = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2018-01']['collection']}&item={items['2018-01']['id']}"
    f"&assets={asset_name_3}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
january_2018_tile_tm

january_2012_tile_tm = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2012-01']['collection']}&item={items['2012-01']['id']}"
    f"&assets={asset_name_3}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
january_2012_tile_tm

Visualizing Petroleum- Production (annual) CH₄ emissions

# Set initial zoom and center of map for CH₄ Layer
# Centre of map [latitude,longitude]
map_ = folium.plugins.DualMap(location=(32, -102), zoom_start=6)

# January 2018
map_layer_2018 = TileLayer(
    tiles=january_2018_tile_ps["tiles"][0],
    name = 'Petroleum production for 2018',
    attr="GHG",
    opacity=0.7,
)
map_layer_2018.add_to(map_.m1)

# January 2012
map_layer_2012 = TileLayer(
    tiles=january_2012_tile_ps["tiles"][0],
    name = 'Petroleum production for 2012',
    attr="GHG",
    opacity=0.7,
)
map_layer_2012.add_to(map_.m2)

# Load the GeoJSON file for the Permian Basin
map_layer_permian = folium.GeoJson(permian_basin, name= 'Permian Shape')
map_layer_permian.add_to(map_)

# Adjust map elements
folium.LayerControl(collapsed=True, position='bottomleft').add_to(map_)
# visualising the map
map_

Visualizing Natural Gas - Production (annual) CH₄ emissions

map_ = folium.plugins.DualMap(location=(32, -102), zoom_start=6)

# January 2018
map_layer_2018 = TileLayer(
    tiles=january_2018_tile_ngs["tiles"][0],
    name = 'Natural Gas production for 2018',
    attr="GHG",
    opacity=0.7,
)
map_layer_2018.add_to(map_.m1)

# January 2012
map_layer_2012 = TileLayer(
    tiles=january_2012_tile_ngs["tiles"][0],
    name = 'Natural Gas Production for 2012',
    attr="GHG",
    opacity=0.7,
)
map_layer_2012.add_to(map_.m2)

map_layer_permian = folium.GeoJson(permian_basin, name= 'Permian Shape')
map_layer_permian.add_to(map_)
folium.LayerControl(collapsed=True, position='bottomleft').add_to(map_)

# visualising the map
map_

Visualizing Total CH₄ emissions

# We will import folium to map and folium.plugins to allow side-by-side mapping
import folium
import folium.plugins

# Set initial zoom and center of map for CH₄ Layer
# Centre of map [latitude,longitude]
map_ = folium.plugins.DualMap(location=(32, -102), zoom_start=6)

# January 2018
map_layer_2018 = TileLayer(
    tiles=january_2018_tile_tm["tiles"][0],
    attr="GHG",
    name = 'Total methane emissions for 2018',
    opacity=0.7,
)
map_layer_2018.add_to(map_.m1)

# January 2012
map_layer_2012 = TileLayer(
    tiles=january_2012_tile_tm["tiles"][0],
    name= 'Total methane emissions for 2012',
    attr="GHG",
    opacity=0.7,
)
map_layer_2012.add_to(map_.m2)

map_layer_permian = folium.GeoJson(permian_basin, name= 'Permian Shape')
map_layer_permian.add_to(map_)
folium.LayerControl(collapsed=True, position='bottomleft').add_to(map_)
# visualising the map
map_

Calculating Zonal Statistics

# Check total number of items available
items = requests.get(
    f"{STAC_API_URL}/collections/{collection_name}/items?limit=300"
).json()["features"]
print(f"Found {len(items)} items")
# Explore the first item
items[0]
# The bounding box should be passed to the geojson param as a geojson Feature or FeatureCollection
# The following function generates statistics for the assigned parameter (total methane emissions)

def generate_stats(item, geojson):
    result = requests.post(
        f"{RASTER_API_URL}/cog/statistics",
        params={"url": item["assets"][asset_name_3]["href"]},
        json=geojson,
    ).json()
    return {
        **result["features"][0]["properties"],
        "datetime": item["properties"]["datetime"],
    }

With the function above we can generate the statistics for the AOI.

# We create a area of interest (polygon) on which will be used later 
# https://geojson.io/#map=5.23/31.925/-104.738
permian_basin = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "coordinates": [
          [
            [
              -104.6235111371881,
              33.886802812506744
            ],
            [
              -104.6235111371881,
              30.500923965578963
            ],
            [
              -101.35251243241865,
              30.500923965578963
            ],
            [
              -101.35251243241865,
              33.886802812506744
            ],
            [
              -104.6235111371881,
              33.886802812506744
            ]
          ]
        ],
        "type": "Polygon"
      }
    }
  ]
}
%%time
# Now we apply the above function to the area of interest 
stats = [generate_stats(item, permian_basin) for item in items]
stats[0]
# Manipulating and cleaning the stats output from the previous cell

def clean_stats(stats_json) -> pd.DataFrame:
    df = pd.json_normalize(stats_json)
    df.columns = [col.replace("statistics.b1.", "") for col in df.columns]
    df["date"] = pd.to_datetime(df["datetime"])
    return df


df = clean_stats(stats)
df.head(5)

Visualizing the Data as a Time Series

We can now explore the total gridded methane emission time series (January 2000 -December 2021) available for the Permian Basin area of the U.S. We can plot the data set using the code below:

fig = plt.figure(figsize=(20, 10))

# Set the plot elements
plt.plot(
    df["date"],
    df["max"],
    color="red",
    linestyle="-",
    linewidth=0.5,
    label="Max monthly CO₂ emissions",
)

# Set the labels for the X and Y axis and assign a title for the plot 
plt.legend()
plt.xlabel("Years")
plt.ylabel("CH4 emissions Molecules CH₄/cm²/s")
plt.title("Total CH4 gridded methane emission for Permian Basin (2012-2018)")