Utilizing NASA’s EMIT Instrument to Monitor Methane Plumes from Point Source Emitters
Access this Notebook
You can launch this notebook in the US GHG Center JupyterHub by clicking the link below. If you are a new user, you should first sign up for the hub by filling out this request form and providing the required information.
Access the EMIT Methane Point Source Plume Complexes notebook in the US GHG Center JupyterHub.
Data Summary and Application
- Spatial coverage: 52°N to 52°S latitude within target mask
- Spatial resolution: 60 m
- Temporal extent: August 1, 2022 - Ongoing
- Temporal resolution: Variable
- Unit: Parts per million meter (ppm-m)
- Utility: Methane Emissions, Plume Detection, Climate Monitoring
For more, visit the EMIT Methane Point Source Plume Complexes data overview page.
Approach
- Identify available dates and temporal frequency of observations for the given collection using the GHGC API /stac endpoint. The collection processed in this notebook is the Earth Surface Mineral Dust Source Investigation (EMIT) methane emission plumes data product.
- Pass the STAC item into the raster API /collections/{collection_id}/items/{item_id}/tilejson.json endpoint.
- Using folium.Map, visualize the plumes.
- After the visualization, perform zonal statistics for a given polygon.
About the Data
The Earth Surface Mineral Dust Source Investigation (EMIT) instrument builds upon NASA’s long history of developing advanced imaging spectrometers for new science and applications. EMIT launched to the International Space Station (ISS) on July 14, 2022. The data shows high-confidence, research-grade methane plumes from point source emitters, updated as they are identified, in keeping with the Jet Propulsion Laboratory (JPL) Open Science and Open Data policy.
Large methane emissions, typically referred to as point source emissions, represent a significant proportion of total methane emissions from the production, transport, and processing of oil and natural gas, landfills, and other sources. By measuring the spectral fingerprint of methane, EMIT can map areas of high methane concentration over background levels in the atmosphere, identifying plume complexes and estimating methane enhancements.
For more information regarding this dataset, please visit the EMIT Methane Point Source Plume Complexes data overview page.
Install the Required Libraries
Required libraries are pre-installed on the GHG Center Hub, except the tabulate and seaborn libraries. If you need to run this notebook elsewhere, please install the libraries by running the following command line:

%pip install requests folium rasterstats pystac_client pandas matplotlib tabulate seaborn --quiet
# Import the following libraries
import requests
import folium
import folium.plugins
from folium import Map, TileLayer
from pystac_client import Client
import branca
import pandas as pd
import matplotlib.pyplot as plt
from tabulate import tabulate
import branca.colormap as cm
import seaborn as sns
Query the STAC API
First, you need to import the required libraries. Once imported, they allow you to execute queries against the GHG Center SpatioTemporal Asset Catalog (STAC) Application Programming Interface (API), where the granules for this collection are stored. You will learn the functionality of each library throughout the notebook.
# Provide the STAC and RASTER API endpoints
# An endpoint refers to a location within the API that executes a request on a data collection residing on the server
# The STAC API is a catalog of all the existing data collections that are stored in the GHG Center
STAC_API_URL = "https://earth.gov/ghgcenter/api/stac"

# The RASTER API is used to fetch collections for visualization
RASTER_API_URL = "https://earth.gov/ghgcenter/api/raster"
STAC API Collection Names
Now, you must fetch the dataset from the STAC API by defining its associated STAC API collection ID as a variable. The collection ID, also known as the collection name, for the EMIT Methane Point Source Plume Complexes dataset is emit-ch4plume-v1.
# The collection name is used to fetch the dataset from the STAC API. First, we define the collection name as a variable
# Name of the collection for methane emission plumes
collection_name = "emit-ch4plume-v1"

# Fetch the collection from the STAC API using the appropriate endpoint
# The 'requests' library makes an HTTP request possible
collection = requests.get(f"{STAC_API_URL}/collections/{collection_name}").json()
# Print the properties of the collection in a table
# Adjust display settings
pd.set_option('display.max_colwidth', None) # Set maximum column width to "None" to prevent cutting off text

# Extract the relevant information about the collection
collection_info = {
    "Title": collection.get("title", "N/A"), # Extract the title of the collection
    "Description": collection.get("description", "N/A"), # Extract the dataset description
    "Temporal Extent": collection.get("extent", {}).get("temporal", {}).get("interval", "N/A"), # Extract the temporal coverage of the collection
    "Spatial Extent": collection.get("extent", {}).get("spatial", {}).get("bbox", "N/A"), # Extract the spatial coverage of the collection
}

# Convert the derived information into a DataFrame format
properties_table = pd.DataFrame(list(collection_info.items()), columns=["Collection Summary", ""])

# Display the properties in a table
collection_summary = properties_table.style.set_properties(**{'text-align': 'left'}) \
    .set_table_styles([
        {'selector': 'th.col0, td.col0', # Select the first column
         'props': [('min-width', '200px'), # Set a minimum width
                   ('text-align', 'left')]}, # Align text to the left
        {'selector': 'td.col1', # Select the second column
         'props': [('text-align', 'left')]} # Align text to the left
    ])

# Print the collection summary table
collection_summary
Next, you will examine the contents of the collection under the temporal variable. You’ll see that the data is available since August 2022. Looking at the dashboard: time density, you can see that observations are conducted daily and non-periodically (i.e., there are plume emissions for multiple places on the same dates).
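If you would like to confirm this directly, the relevant fields can be read from the collection JSON retrieved above. This is a minimal sketch; the "dashboard:time_density" key is an assumption based on GHG Center metadata conventions, so check the printed collection output for the exact field name.

# A minimal sketch for inspecting the temporal metadata of the collection retrieved above
# The "dashboard:time_density" key is an assumption; verify it against the printed collection
print(collection["extent"]["temporal"]["interval"]) # Start and end date-times of the collection
print(collection.get("dashboard:time_density"))     # Reported observation frequency (if present)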
# Create a function that searches for the data collection in the US GHG Center STAC API
# The name of the function is "get_item_count"
# The argument that will be passed to the defined function is "collection_id"
def get_item_count(collection_id):

    # Set a counter for the number of items existing in the collection
    count = 0

    # Define the path to retrieve the granules (items) of the collection of interest in the STAC API
    items_url = f"{STAC_API_URL}/collections/{collection_id}/items"

    # Run a while loop to make HTTP requests until there are no more URLs associated with the collection in the STAC API
    while True:

        # Retrieve information about the granules by sending a "get" request to the STAC API using the defined collection path
        response = requests.get(items_url)

        # If the items do not exist, print an error message and quit the loop
        if not response.ok:
            print("error getting items")
            exit()

        # Return the results of the HTTP response as JSON
        stac = response.json()

        # Increase the "count" by the number of items (granules) returned in the response
        count += int(stac["context"].get("returned", 0))

        # Retrieve information about the next URL associated with the collection in the STAC API (if applicable)
        next = [link for link in stac["links"] if link["rel"] == "next"]

        # Exit the loop if there are no other URLs
        if not next:
            break

        # Ensure the information gathered by other STAC API links associated with the collection is added to the original path
        # "href" is the identifier for each of the tiles stored in the STAC API
        items_url = next[0]["href"]

    # Return the total number of granules found associated with the collection
    return count
# Apply the function created above "get_item_count" to the collection
number_of_items = get_item_count(collection_name)

# Get the information about the granules found in the collection
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]

# Print the total number of items (granules) found
print(f"Found {len(items)} observations")
# Sort the items based on their date-time attribute
items_sorted = sorted(items, key=lambda x: x["properties"]["datetime"])

# Create an empty list
table_data = []

# Extract the ID and date-time information for each granule and add them to the list
# By default, only the first 5 items in the collection are extracted to be displayed in the table.
# To see the date-time of all existing granules in this collection, remove "5" from "items_sorted[:5]" in the line below.
for item in items_sorted[:5]:
    table_data.append([item['id'], item['properties']['datetime']])

# Define the table headers
headers = ["Item ID", "Date-Time"]

print("Below you see the first 5 items in the collection, along with their item IDs and corresponding Start Date-Time.")

# Print the table using tabulate
print(tabulate(table_data, headers=headers, tablefmt="fancy_grid"))
# Examine the first item in the collection
# Keep in mind that a list starts from 0, 1, 2... therefore items_sorted[0] refers to the first item (granule) in the list/collection
items_sorted[0]
Map Out Selected Tiles
You will now explore global methane emission plumes from point sources and visualize the results on a map using folium.
# Once again, apply the function created above "get_item_count" to the EMIT methane plume collection
# This step allows retrieving the number of granules "observations" in the collection
number_of_items = get_item_count(collection_name)
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]

# Next, you need to create a dictionary where the "id" field of each item in the collection can be queried more explicitly
plume_complexes = {item["id"]: item for item in items}

# Next, you need to specify the asset name for this collection
# The asset name refers to the raster band containing the pixel values for the parameter of interest
# For the EMIT Methane Point Source collection, the parameter of interest is "ch4-plume-emissions"
asset_name = "ch4-plume-emissions"
Below, you will enter the minimum and maximum values that provide the upper and lower bounds in rescale_values.
# Fetch the min and max values for a specific item
rescale_values = {
    "max": plume_complexes[list(plume_complexes.keys())[0]]["assets"][asset_name]["raster:bands"][0]["histogram"]["max"],
    "min": plume_complexes[list(plume_complexes.keys())[0]]["assets"][asset_name]["raster:bands"][0]["histogram"]["min"],
}
Now, you will pass the item ID, collection name, asset name, and the rescaling factor to the Raster API endpoint.
# Select the item ID which you want to visualize. The item ID is in the format yyyymmdd followed by the timestamp. This ID can be extracted from the COG name as well.
# To browse and select other tiles in the collection, please visit https://search.earthdata.nasa.gov/search/granules?p=C2748088093-LPCLOUD&pg[0][v]=f&pg[0][gsk]=-start_date&q=emit%20plume&tl=1694622854.77!3!!
# You need to copy the entire granule nomenclature
item_id = "EMIT_L2B_CH4PLM_001_20230418T200118_000829"

# Choose a color map for displaying the first observation (event)
# Please refer to the matplotlib library if you'd prefer to choose a different color ramp.
# For more information on Colormaps in Matplotlib, please visit https://matplotlib.org/stable/users/explain/colors/colormaps.html
color_map = "magma"

# Make a GET request to retrieve information for the selected tile defined in "item_id"
methane_plume_tile = requests.get(
    f"{RASTER_API_URL}/collections/{plume_complexes[item_id]['collection']}/items/{plume_complexes[item_id]['id']}/tilejson.json?"
    f"&assets={asset_name}"

    # Pass the color formula and colormap for custom visualization
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"

    # Pass the minimum and maximum values for rescaling
    f"&rescale={rescale_values['min']},{rescale_values['max']}",

# Return the response in JSON format
).json()

# Print the properties of the retrieved granule to the console
methane_plume_tile
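Before building the map, it can help to glance at the TileJSON fields used in the next steps. This quick sketch assumes the standard TileJSON layout returned by the Raster API.

# A quick look at the TileJSON fields used below (assuming the standard TileJSON layout)
print(methane_plume_tile["tiles"][0]) # XYZ tile URL template for the rendered layer
print(methane_plume_tile["center"])   # [longitude, latitude, zoom] used to center the folium map
print(methane_plume_tile["bounds"])   # Bounding box of the granule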
# Set a colormap for the granule
# Please refer to the matplotlib library if you'd prefer choosing a different color ramp (https://matplotlib.org/stable/users/explain/colors/colormaps.html)
colormap = "magma"

# Define the breaks in the colormap
color_map = cm.LinearColormap(colors=['#310597', '#4C02A1', '#6600A7', '#7E03A8', '#9511A1', '#AA2395', '#BC3587', '#CC4778', '#DA5A6A', '#E66C5C', '#F0804E', '#F89540', '#FDAC33', '#FDC527', '#F8DF25'], vmin=0, vmax=1500)

# Add an appropriate caption, in this case parts per million meter
color_map.caption = 'ppm-m'
# Set the initial zoom and center of the map for the plume layer
map_ = folium.Map(location=(methane_plume_tile["center"][1], methane_plume_tile["center"][0]), zoom_start=14, tiles=None)

# Add an ESRI World Imagery basemap
folium.TileLayer(
    tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}.png',
    name='ESRI World Imagery',
    attr='Tiles © Esri — Source: Esri, i-cubed, USDA, USGS, AEX, GeoEye, Getmapping, Aerogrid, IGN, IGP, UPR-EGP, and the GIS User Community',
    overlay=True
).add_to(map_)

# Use 'TileLayer' to display the raster layer, add an appropriate caption, and adjust the transparency of the layer on the map
map_layer = TileLayer(
    tiles=methane_plume_tile["tiles"][0], # Path to retrieve the tile
    name='Plume Complex Landfill',
    overlay=True, # The layer can be overlaid on the map
    attr="GHG", # Set the attribution
    opacity=1, # Adjust the transparency of the layer
)
map_layer.add_to(map_)

# Adjust map elements
folium.LayerControl(collapsed=False, position='bottomleft').add_to(map_)
map_.add_child(color_map)
svg_style = '<style>svg#legend {font-size: 14px; background-color: white;}</style>'
map_.get_root().header.add_child(folium.Element(svg_style))

# Visualize the map
map_
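If you would like to keep a copy of the interactive map outside the notebook, folium can write it to a standalone HTML file; the file name below is just an example.

# Optionally save the interactive map to a standalone HTML file (the file name is arbitrary)
map_.save("methane_plume_map.html")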
Calculate Zonal Statistics
To perform zonal statistics, you first need to create a polygon. In this use case, you are creating a polygon using the plume’s retrieved coordinates.
# Access the coordinates of the plume feature (using the 'item_id')
plumes_coordinates = plume_complexes[item_id]["geometry"]["coordinates"]

# Create a polygon for the area of interest (AOI)
methane_plume_aoi = {
    "type": "Feature", # Create a feature object
    "properties": {},
    "geometry": { # The geometry of the feature
        # Use the plume's coordinates retrieved earlier
        "coordinates": plumes_coordinates,
        "type": "Polygon",
    },
}
# Please put the name of the place you are trying to visualize
# The granule that was selected by default is showing plumes near Denver, United States
region_name = "Place_Holder"

# Create a new map to display the generated polygon
aoi_map = Map(
    # Base map is set to OpenStreetMap
    tiles="OpenStreetMap",

    # Define the spatial properties for the map
    # Set the center of the map
    location=[
        plumes_coordinates[0][0][1],
        plumes_coordinates[0][0][0]
    ],
    zoom_start=12,
)

# Insert the polygon into the map
folium.GeoJson(methane_plume_aoi, name=region_name).add_to(aoi_map)

# Visualize the map
aoi_map
# Check the total number of items available within the collection
items = requests.get(
    f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}"
).json()["features"]

# Print the total number of items (granules) found
print(f"Found {len(items)} items")

# Examine the first item in the collection
# Keep in mind that a list starts from 0, 1, 2... therefore items[0] refers to the first item (granule) in the list/collection
items[0]
# The bounding box should be passed to the geojson param as a GeoJSON Feature or FeatureCollection
# Create a function that retrieves information regarding a specific granule using its asset name and raster identifier and generates the statistics for it
# The function takes an item (granule) and a JSON (polygon) as input parameters
def generate_stats(item, geojson):

    # A POST request is made to submit the data associated with the item of interest (specific observation) within the boundaries of the polygon to compute its statistics
    result = requests.post(

        # Raster API endpoint for computing statistics
        f"{RASTER_API_URL}/cog/statistics",

        # Pass the URL to the item, asset name, and raster identifier as parameters
        params={"url": item["assets"][asset_name]["href"]},

        # Send the GeoJSON object (polygon) along with the request
        json=geojson,

    # Return the response in JSON format
    ).json()

    # Print the result
    print(result)

    # Return a dictionary containing the computed statistics along with the item's datetime information
    return {
        **result["properties"],
        "item_id": item["id"][20:],
    }
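Before looping over the full collection, you may want to test the function on a single granule; this minimal usage sketch reuses the AOI polygon defined earlier.

# A minimal usage sketch: compute statistics for one granule to confirm the request works
single_stat = generate_stats(items[0], methane_plume_aoi)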
# Generate a for loop that iterates over all the existing items in the collection
for item in items:

    # The loop will then retrieve the information for the start datetime of each item in the list
    # print(item["id"])
    print(item["properties"]["datetime"])

    # Exit the loop after printing the start datetime for the first item in the collection
    break
With the function above, we can generate the statistics for the area of interest.
%%time
# %%time reports the wall time (execution time) for running the code below

# Generate statistics using the created function "generate_stats" within the bounding box defined by the AOI polygon
stats = [generate_stats(item, methane_plume_aoi) for item in items]

# Keep only the entries with a valid mean value
stats = [stat for stat in stats if stat["statistics"]["b1"]["mean"] is not None]

# Examine the statistics generated for the first item
stats[0]
# Create a function that converts the statistics JSON into a pandas DataFrame
def clean_stats(stats_json) -> pd.DataFrame:
    df = pd.json_normalize(stats_json)
    df.columns = [col.replace("statistics.b1.", "") for col in df.columns]
    # df["date"] = pd.to_datetime(df["datetime"])
    return df

# Apply the function to the generated statistics and display the DataFrame
df = clean_stats(stats)
df
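With the statistics in a DataFrame, the seaborn library imported earlier can give a quick look at the results. This sketch assumes the "max" and "item_id" columns produced by clean_stats above.

# A sketch of a quick plot of the maximum enhancement per granule
# Assumes the "max" and "item_id" columns produced by clean_stats above
fig, ax = plt.subplots(figsize=(10, 5))
sns.barplot(x="item_id", y="max", data=df.head(10), ax=ax) # Plot the first 10 granules only
ax.set_xlabel("Granule ID (suffix)")
ax.set_ylabel("Maximum methane enhancement (ppm-m)")
ax.set_title("Maximum plume concentration per granule")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()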
# Make a GET request to retrieve the tile information for the first item in the collection
# Note: use the "magma" string stored in "colormap"; "color_map" now holds the branca legend object
plume_tile_2 = requests.get(
    f"{RASTER_API_URL}/collections/{items[0]['collection']}/items/{items[0]['id']}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={colormap}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
).json()

# Print the retrieved tile information
plume_tile_2
# Use the bbox to set the initial zoom and center of the map
# Set up a map located within the event bounds
plume_tile_2_coordinates = items[0]["geometry"]["coordinates"]
aoi_map_bbox = Map(
    tiles="OpenStreetMap",
    location=[
        plume_tile_2_coordinates[0][0][1],
        plume_tile_2_coordinates[0][0][0]
    ],
    zoom_start=10,
)

map_layer = TileLayer(
    tiles=plume_tile_2["tiles"][0],
    attr="GHG",
    opacity=1,
)

map_layer.add_to(aoi_map_bbox)

# Visualize the map
aoi_map_bbox
Summary
In this notebook we have successfully completed the following steps for the STAC collection for the EMIT Methane Point Source Plume Complexes dataset:
1. Install and import the necessary libraries
2. Fetch the collection from STAC collections using the appropriate endpoints
3. Count the number of existing granules within the collection
4. Map the methane emission plumes
5. Generate statistics for the area of interest (AOI)
If you have any questions regarding this user notebook, please contact us using the feedback form.