Utilizing NASA’s EMIT Instrument to Monitor Methane Plumes from Point Source Emitters

Daily aggregated, global point source methane emission plume estimates from the EMIT instrument on the International Space Station (ISS)
Author

Siddharth Chaudhary, Vishal Gaur, Farnaz Bayat

Published

September 20, 2024

Access this Notebook

You can launch this notebook in the US GHG Center JupyterHub by clicking the link below. If you are a new user, you should first sign up for the hub by filling out this request form and providing the required information.

Access the EMIT Methane Point Source Plume Complexes notebook in the US GHG Center JupyterHub.

Table of Contents

Data Summary and Application

  • Spatial coverage: 52°N to 52°S latitude within target mask
  • Spatial resolution: 60 m
  • Temporal extent: August 1, 2022 - Ongoing
  • Temporal resolution: Variable
  • Unit: Parts per million meter (ppm-m)
  • Utility: Methane Emissions, Plume Detection, Climate Monitoring

For more, visit the EMIT Methane Point Source Plume Complexes data overview page.
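As a rough, hypothetical illustration of the ppm-m unit (the numbers below are made up, not taken from the dataset): a reported value is the excess methane mixing ratio integrated along the path through the plume, so a 10 ppm enhancement over a 150 m path corresponds to 1500 ppm-m.

```python
# Hypothetical illustration of the ppm-m (parts per million meter) unit.
# The reported value is a column enhancement: excess CH4 mixing ratio (ppm)
# integrated over the path length (m) through the plume.
def column_enhancement_ppm_m(excess_ppm: float, path_length_m: float) -> float:
    """Excess mixing ratio (ppm) x path length (m) -> column enhancement (ppm-m)."""
    return excess_ppm * path_length_m

# Made-up example: a 10 ppm enhancement over a 150 m path
print(column_enhancement_ppm_m(10.0, 150.0))  # 1500.0
```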

Approach

  1. Identify available dates and temporal frequency of observations for the given collection using the GHGC API /stac endpoint. The collection processed in this notebook is the Earth Surface Mineral Dust Source Investigation (EMIT) methane emission plumes data product.
  2. Pass the STAC item into the raster API /collections/{collection_id}/items/{item_id}/tilejson.json endpoint.
  3. Using folium.Map, visualize the plumes.
  4. After the visualization, perform zonal statistics for a given polygon.
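As a preview of steps 1 and 2, the endpoint paths the notebook will request can be sketched with plain string formatting. This is only a sketch of how the URLs are assembled; no request is sent here, and the item ID shown is the one selected later in the notebook.

```python
# Sketch of the STAC and Raster API endpoint URLs used in this notebook.
# No request is sent here; this only shows how the paths are assembled.
STAC_API_URL = "https://earth.gov/ghgcenter/api/stac"
RASTER_API_URL = "https://earth.gov/ghgcenter/api/raster"

collection_id = "emit-ch4plume-v1"
item_id = "EMIT_L2B_CH4PLM_001_20230418T200118_000829"

# Step 1: list the items (granules) in the collection
items_url = f"{STAC_API_URL}/collections/{collection_id}/items"

# Step 2: request tile metadata for one item from the Raster API
tilejson_url = f"{RASTER_API_URL}/collections/{collection_id}/items/{item_id}/tilejson.json"

print(items_url)
print(tilejson_url)
```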

About the Data

The Earth Surface Mineral Dust Source Investigation (EMIT) instrument builds upon NASA’s long history of developing advanced imaging spectrometers for new science and applications. EMIT launched to the International Space Station (ISS) on July 14, 2022. The data shows high-confidence research grade methane plumes from point source emitters - updated as they are identified - in keeping with Jet Propulsion Laboratory (JPL) Open Science and Open Data policy.

Large methane emissions, typically referred to as point source emissions, represent a significant proportion of total methane emissions from the production, transport, and processing of oil and natural gas, landfills, and other sources. By measuring the spectral fingerprint of methane, EMIT can map areas of high methane concentration over background levels in the atmosphere, identify plume complexes, and estimate methane enhancements.

For more information regarding this dataset, please visit the EMIT Methane Point Source Plume Complexes data overview page.

Install the Required Libraries

Required libraries are pre-installed on the GHG Center Hub, except for the tabulate and seaborn libraries. If you need to run this notebook elsewhere, install the libraries by running the following command:

%pip install requests folium rasterstats pystac_client pandas matplotlib tabulate seaborn --quiet

# Import the following libraries
import requests
import folium
import folium.plugins
from folium import Map, TileLayer 
from pystac_client import Client 
import branca 
import pandas as pd
import matplotlib.pyplot as plt
from tabulate import tabulate
import branca.colormap as cm
import seaborn as sns

Query the STAC API

First, you need to import the required libraries. Once imported, they allow you to execute queries against the GHG Center Spatio Temporal Asset Catalog (STAC) Application Programming Interface (API), where the granules for this collection are stored. You will learn the functionality of each library throughout the notebook.

# Provide the STAC and RASTER API endpoints
# The endpoint is referring to a location within the API that executes a request on a data collection nesting on the server.

# The STAC API is a catalog of all the existing data collections that are stored in the GHG Center.
STAC_API_URL = "https://earth.gov/ghgcenter/api/stac"

# The RASTER API is used to fetch collections for visualization
RASTER_API_URL = "https://earth.gov/ghgcenter/api/raster"

STAC API Collection Names

Now, you must fetch the dataset from the STAC API by defining its associated STAC API collection ID as a variable. The collection ID, also known as the collection name, for the EMIT Methane Point Source Plume Complexes dataset is emit-ch4plume-v1.

# The collection name is used to fetch the dataset from the STAC API. First, we define the collection name as a variable
# Name of the collection for methane emission plumes 
collection_name = "emit-ch4plume-v1"


# Fetch the collection from the STAC API using the appropriate endpoint
# The 'requests' library is used to make the HTTP request
collection = requests.get(f"{STAC_API_URL}/collections/{collection_name}").json()

# Print the properties of the collection in a table
# Adjust display settings
pd.set_option('display.max_colwidth', None)  # Set maximum column width to "None" to prevent cutting off text

# Extract the relevant information about the collection
collection_info = {
    "Title": collection.get("title", "N/A"), # Extract the title of the collection 
    "Description": collection.get("description", "N/A"), # Extract the dataset description
    "Temporal Extent": collection.get("extent", {}).get("temporal", {}).get("interval", "N/A"), # Extract the temporal coverage of the collection
    "Spatial Extent": collection.get("extent", {}).get("spatial", {}).get("bbox", "N/A"), # Extract the spatial coverage of the collection
}

# Convert the derived information into a DataFrame format
properties_table = pd.DataFrame(list(collection_info.items()), columns=["Collection Summary", ""])

# Display the properties in a table
collection_summary = properties_table.style.set_properties(**{'text-align': 'left'}) \
                                           .set_table_styles([
    {
        'selector': 'th.col0, td.col0',    # Select the first column
        'props': [('min-width', '200px'),  # Set a minimum width
                  ('text-align', 'left')]  # Align text to the left
    },
    {
        'selector': 'td.col1',             # Select the second column
        'props': [('text-align', 'left')]  # Align text to the left
    }
])

# Print the collection summary table
collection_summary

Next, you will examine the contents of the collection under the temporal variable. You’ll see that the data is available since August 2022. Looking at the dashboard: time density, you can see that observations are conducted daily and non-periodically (i.e., there are plume emissions at multiple places on the same dates).

# Create a function that would search for data collection in the US GHG Center STAC API

# First, we need to define the function
# The name of the function is "get_item_count" 
# The argument that will be passed to the defined function is "collection_id"
def get_item_count(collection_id):

    # Set a counter for the number of items existing in the collection 
    count = 0 

    # Define the path to retrieve the granules (items) of the collection of interest in the STAC API
    items_url = f"{STAC_API_URL}/collections/{collection_id}/items" 

    # Run a while loop to make HTTP requests until there are no more URLs associated with the collection in the STAC API
    while True:

        # Retrieve information about the granules by sending a "get" request to the STAC API using the defined collection path 
        response = requests.get(items_url) 

        # If the items do not exist, print an error message and quit the loop
        if not response.ok:
            print("error getting items")
            exit()

        # Return the results of the HTTP response as JSON
        stac = response.json()

        # Increase the "count" by the number of items (granules) returned in the response
        count += int(stac["context"].get("returned", 0))

        # Retrieve information about the next URL associated with the collection in the STAC API (if applicable)
        next = [link for link in stac["links"] if link["rel"] == "next"]

        # Exit the loop if there are no other URLs
        if not next:
            break
        
        # Ensure the information gathered by other STAC API links associated with the collection are added to the original path
        # "href" is the identifier for each of the tiles stored in the STAC API
        items_url = next[0]["href"]

    # Return the information about the total number of granules found associated with the collection
    return count
# Apply the function created above "get_item_count" to the collection
number_of_items = get_item_count(collection_name)

# Get the information about the number of granules found in the collection
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]

# Print the total number of items (granules) found
print(f"Found {len(items)} observations")

# Sort the items based on their date-time attribute
items_sorted = sorted(items, key=lambda x: x["properties"]["datetime"])

# Create an empty list
table_data = []
# Extract the ID and date-time information for each granule and add them to the list
# By default, only the first 5 items in the collection are extracted to be displayed in the table. 
# To see the date-time of all existing granules in this collection, remove "5" from "items_sorted[:5]" in the line below. 
for item in items_sorted[:5]:
    table_data.append([item['id'], item['properties']['datetime']])

# Define the table headers
headers = ["Item ID", "Date-Time"]

print("Below you see the first 5 items in the collection, along with their item IDs and corresponding Start Date-Time.")

# Print the table using tabulate
print(tabulate(table_data, headers=headers, tablefmt="fancy_grid"))
# Examine the first item in the collection
# Keep in mind that a list starts from 0, 1, 2... therefore items[0] refers to the first item (granule) in the list/collection
items_sorted[0]

Map Out Selected Tiles

You will now explore global methane emission plumes from point sources and visualize the results on a map using folium.

# Once again, apply the function created above "get_item_count" to the EMIT methane plume collection
# This step retrieves the number of granules (observations) in the collection.
number_of_items = get_item_count(collection_name)
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]


# Next, create a dictionary in which each item in the collection can be looked up by its "id" field
plume_complexes = {item["id"]: item for item in items} 


# Next, you need to specify the asset name for this collection.
# The asset name refers to the raster band containing the pixel values for the parameter of interest.
# For the case of the EMIT Methane Point Source collection, the parameter of interest is “ch4-plume-emissions”.
asset_name = "ch4-plume-emissions"

Below, you will fetch the minimum and maximum values of the first granule to provide the upper and lower bounds in rescale_values.

# Fetch the min and max values from the histogram of the first item in the collection
first_item = plume_complexes[list(plume_complexes.keys())[0]]
histogram = first_item["assets"][asset_name]["raster:bands"][0]["histogram"]
rescale_values = {"max": histogram["max"], "min": histogram["min"]}
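The nested lookup above can be hard to read; here is the same access pattern on a mock item with hypothetical histogram values (the real values come from the STAC item metadata):

```python
# Mock STAC item with hypothetical histogram values, mirroring the real structure
mock_item = {
    "assets": {
        "ch4-plume-emissions": {
            "raster:bands": [{"histogram": {"min": 0.0, "max": 1500.0}}]
        }
    }
}

# Walk the same path used above: assets -> asset name -> first raster band -> histogram
mock_histogram = mock_item["assets"]["ch4-plume-emissions"]["raster:bands"][0]["histogram"]
mock_rescale_values = {"min": mock_histogram["min"], "max": mock_histogram["max"]}
print(mock_rescale_values)  # {'min': 0.0, 'max': 1500.0}
```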

Now, you will pass the item id, collection name, asset name, and the rescaling factor to the Raster API endpoint.

# Select the item ID which you want to visualize. Item ID is in the format yyyymmdd followed by the timestamp. This ID can be extracted from the COG name as well.
# To browse and select other tiles in the collection, please visit https://search.earthdata.nasa.gov/search/granules?p=C2748088093-LPCLOUD&pg[0][v]=f&pg[0][gsk]=-start_date&q=emit%20plume&tl=1694622854.77!3!!

# You need to copy the entire granule nomenclature 
item_id = "EMIT_L2B_CH4PLM_001_20230418T200118_000829"

# Choose a color map for displaying the first observation (event)
# Please refer to matplotlib library if you'd prefer to choose a different color ramp.
# For more information on Colormaps in Matplotlib, please visit https://matplotlib.org/stable/users/explain/colors/colormaps.html
color_map = "magma"

# Make a GET request to retrieve information for the selected tile defined in "item_id"
methane_plume_tile = requests.get(
    f"{RASTER_API_URL}/collections/{plume_complexes[item_id]['collection']}/items/{plume_complexes[item_id]['id']}/tilejson.json?"
    f"&assets={asset_name}"
    
    # Pass the color formula and colormap for custom visualization
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    
    # Pass the minimum and maximum values for rescaling 
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
    
# Return the response in JSON format
).json()

# Print the properties of the retrieved granule to the console
methane_plume_tile
# Set a colormap for the granule
# Please refer to matplotlib library if you'd prefer choosing a different color ramp (https://matplotlib.org/stable/users/explain/colors/colormaps.html)
colormap = "magma" 


# Defining the breaks in the colormap 
color_map = cm.LinearColormap(colors = ['#310597', '#4C02A1', '#6600A7', '#7E03A8', '#9511A1', '#AA2395', '#BC3587', '#CC4778', '#DA5A6A', '#E66C5C', '#F0804E', '#F89540','#FDAC33', '#FDC527', '#F8DF25'], vmin = 0, vmax = 1500 )


# Add an appropriate caption, in this case it would be Parts per million meter
color_map.caption = 'ppm-m'

# Set the initial zoom and center of the map for the plume layer
map_ = folium.Map(location=(methane_plume_tile["center"][1], methane_plume_tile["center"][0]), zoom_start=14, tiles=None)
folium.TileLayer(tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}.png', name='ESRI World Imagery', attr='Tiles © Esri — Source: Esri, i-cubed, USDA, USGS, AEX, GeoEye, Getmapping, Aerogrid, IGN, IGP, UPR-EGP, and the GIS User Community', overlay=True).add_to(map_)


# Use the 'TileLayer' library to display the raster layer, add an appropriate caption, and adjust the transparency of the layer on the map
map_layer = TileLayer(
    tiles=methane_plume_tile["tiles"][0], # Path to retrieve the tile
    name='Plume Complex Landfill',
    overlay=True, # The layer can be overlaid on the map
    attr="GHG", # Set the attribution 
    opacity=1, # Adjust the transparency of the layer
)
map_layer.add_to(map_)


# Adjust map elements 
folium.LayerControl(collapsed=False, position='bottomleft').add_to(map_)
map_.add_child(color_map)
svg_style = '<style>svg#legend {font-size: 14px; background-color: white;}</style>'
map_.get_root().header.add_child(folium.Element(svg_style))


# Visualizing the map
map_

Calculate Zonal Statistics

To perform zonal statistics, you first need to create a polygon. In this use case, you are creating a polygon using the plume’s retrieved coordinates.

# Access the coordinates of the plume feature (using the 'item_id')
plumes_coordinates = plume_complexes[item_id]["geometry"]["coordinates"]

# Create a polygon for the area of interest (aoi)
methane_plume_aoi = {
    "type": "Feature", # Create a feature object
    "properties": {},
    "geometry": {      # The geometry of the feature
        "coordinates":
            plumes_coordinates,  # Use the plume's coordinates retrieved earlier
        "type": "Polygon",
    },
}
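If you later need a simple bounding box for the AOI (for example, to frame a map view), it can be derived from the polygon ring. The helper and coordinates below are hypothetical; in the notebook you would pass plumes_coordinates[0] instead of the sample ring.

```python
# Derive a [min_lon, min_lat, max_lon, max_lat] bounding box from a polygon ring.
# The ring is a list of [lon, lat] pairs, as in the GeoJSON geometry above.
def ring_bbox(ring):
    lons = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    return [min(lons), min(lats), max(lons), max(lats)]

# Hypothetical ring near Denver (closed: first point repeated at the end)
sample_ring = [[-104.9, 39.7], [-104.8, 39.7], [-104.8, 39.8], [-104.9, 39.8], [-104.9, 39.7]]
print(ring_bbox(sample_ring))  # [-104.9, 39.7, -104.8, 39.8]
```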
# Enter the name of the place you are trying to visualize. 
# The granule selected by default shows plumes near Denver, United States. 
region_name = "Place_Holder"


# Create a new map to display the generated polygon
aoi_map = Map(
    
    # Base map is set to OpenStreetMap
    tiles="OpenStreetMap",
    
    # Define the spatial properties for the map
    location=[
        plumes_coordinates[0][0][1],
        plumes_coordinates[0][0][0]
    ],
    
    # Set the center of the map
    zoom_start=12,
)

# Insert the polygon to the map
folium.GeoJson(methane_plume_aoi, name=region_name).add_to(aoi_map)


# Visualize the map
aoi_map
# Check the total number of items available within the collection
items = requests.get(
    f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}"
).json()["features"]

# Print the total number of items (granules) found
print(f"Found {len(items)} items")
# Examine the first item in the collection
# Keep in mind that a list starts from 0, 1, 2... therefore items[0] refers to the first item (granule) in the list/collection
items[0]
# The bounding box should be passed to the geojson param as a geojson Feature or FeatureCollection
# Create a function that retrieves information regarding a specific granule using its asset name and raster identifier and generates the statistics for it

# The function takes an item (granule) and a JSON (polygon) as input parameters
def generate_stats(item, geojson):
    
    # A POST request is made to submit the data associated with the item of interest (specific observation) within the boundaries of the polygon to compute its statistics
    result = requests.post(
        
        # Raster API Endpoint for computing statistics
        f"{RASTER_API_URL}/cog/statistics",
        
        # Pass the URL to the item's asset as a parameter
        params={"url": item["assets"][asset_name]["href"]},
        
        # Send the GeoJSON object (polygon) along with the request
        json=geojson,
        
    # Return the response in JSON format
    ).json()
    
    # Print the result
    print(result)
    
    # Return a dictionary containing the computed statistics along with the item's datetime information.
    return {
        **result["properties"],
        "item_id": item["id"][20:],
    }
# Generate a for loop that iterates over all the existing items in the collection 
for item in items:
    
    # The loop will then retrieve the information for the start datetime of each item in the list
    #print(item["id"])
    print(item["properties"]["datetime"])
    
    # Exit the loop after printing the start datetime for the first item in the collection
    break

With the function above, we can generate the statistics for the area of interest.

%%time
# %%time = Wall time (execution time) for running the code below

# Generate statistics using the created function "generate_stats" within the bounding box defined by the aoi polygon
stats = [generate_stats(item, methane_plume_aoi) for item in items]
stats = [stat for stat in stats if stat["statistics"]["b1"]["mean"] is not None]
stats[0]
def clean_stats(stats_json) -> pd.DataFrame:
    df = pd.json_normalize(stats_json)
    df.columns = [col.replace("statistics.b1.", "") for col in df.columns]
    # df["date"] = pd.to_datetime(df["datetime"])
    return df


df = clean_stats(stats)
df
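The renaming step inside clean_stats simply strips the statistics.b1. prefix that pd.json_normalize produces when it flattens the nested statistics JSON. A stdlib-only sketch of that rule, on hypothetical column names:

```python
# Hypothetical flattened column names, as pd.json_normalize would produce them
columns = ["item_id", "statistics.b1.min", "statistics.b1.max", "statistics.b1.mean"]

# The same renaming rule used in clean_stats
renamed = [col.replace("statistics.b1.", "") for col in columns]
print(renamed)  # ['item_id', 'min', 'max', 'mean']
```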
plume_tile_2 = requests.get(
    f"{RASTER_API_URL}/collections/{items[0]['collection']}/items/{items[0]['id']}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={colormap}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
).json()
plume_tile_2
# Use the item's bounding box to set the initial zoom and center of the map
plume_tile_2_coordinates = items[0]["geometry"]["coordinates"]
aoi_map_bbox = Map(
    tiles="OpenStreetMap",
    location=[
        plume_tile_2_coordinates[0][0][1],
        plume_tile_2_coordinates[0][0][0]
    ],
    zoom_start=10,
)

map_layer = TileLayer(
    tiles=plume_tile_2["tiles"][0],
    attr="GHG", opacity = 1
)

map_layer.add_to(aoi_map_bbox)

aoi_map_bbox

Summary

In this notebook we have successfully completed the following steps for the STAC collection for the EMIT Methane Point Source Plume Complexes dataset:

  1. Install and import the necessary libraries
  2. Fetch the collection from STAC collections using the appropriate endpoints
  3. Count the number of existing granules within the collection
  4. Map the methane emission plumes
  5. Generate statistics for the area of interest (AOI)

If you have any questions regarding this user notebook, please contact us using the feedback form.

Back to top