# Import the following libraries
import requests
import folium
import folium.plugins
from folium import Map, TileLayer
from pystac_client import Client
import branca
import pandas as pd
import matplotlib.pyplot as plt
U.S. Gridded Anthropogenic Methane Emissions Inventory
Approach
- Identify available dates and temporal frequency of observations for the given collection using the GHGC API /stac endpoint. The collection processed in this notebook is the gridded methane emissions data product.
- Pass the STAC item into the raster API /stac/tilejson.json endpoint.
- Using folium.plugins.DualMap, we will visualize two tiles (side-by-side), allowing us to compare time points.
- After the visualization, we will perform zonal statistics for a given polygon.
About the Data
The gridded EPA U.S. anthropogenic methane greenhouse gas inventory (gridded GHGI) includes spatially disaggregated (0.1 deg x 0.1 deg or approximately 10 x 10 km resolution) maps of annual anthropogenic methane emissions for the contiguous United States (CONUS), consistent with national annual U.S. anthropogenic methane emissions reported in the U.S. EPA Inventory of U.S. Greenhouse Gas Emissions and Sinks (U.S. GHGI). This V2 Express Extension dataset contains methane emissions provided as fluxes, in units of molecules of methane per square cm per second, for over 25 individual emission source categories, including those from agriculture, petroleum and natural gas systems, coal mining, and waste. The data have been converted from their original NetCDF format to Cloud-Optimized GeoTIFF (COG) for use in the US GHG Center, thereby enabling user exploration of spatial anthropogenic methane emissions and their trends.
For more information regarding this dataset, please visit the U.S. Gridded Anthropogenic Methane Emissions Inventory data overview page.
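Because the fluxes are expressed in molecules of methane per square cm per second, it can help to see how such a value translates into a mass-based unit. The following is a minimal sketch of that arithmetic and is not part of the original notebook: the function name `flux_to_mg_per_km2_per_year` and the sample flux value are illustrative assumptions, and a 365-day year is assumed.

```python
# Hypothetical helper (not from the notebook): convert a methane flux from
# molecules CH4 / cm^2 / s to megagrams (metric tons) CH4 / km^2 / year.

AVOGADRO = 6.02214076e23             # molecules per mole
CH4_MOLAR_MASS_G = 16.04             # grams per mole of CH4 (approximate)
CM2_PER_KM2 = 1.0e10                 # 1 km = 1e5 cm, so 1 km^2 = 1e10 cm^2
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year


def flux_to_mg_per_km2_per_year(molecules_per_cm2_per_s: float) -> float:
    """Convert molecules CH4/cm^2/s to Mg CH4/km^2/yr."""
    moles_per_cm2_per_s = molecules_per_cm2_per_s / AVOGADRO
    grams_per_cm2_per_s = moles_per_cm2_per_s * CH4_MOLAR_MASS_G
    grams_per_km2_per_year = grams_per_cm2_per_s * CM2_PER_KM2 * SECONDS_PER_YEAR
    return grams_per_km2_per_year / 1.0e6  # grams -> megagrams


# Illustrative input only; this is not a value taken from the dataset
example_flux = 1.0e12
print(flux_to_mg_per_km2_per_year(example_flux))
```

The conversion is a chain of unit factors, so it can be adapted to other target units (for example per day instead of per year) by swapping the time constant.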
Install the Required Libraries
Required libraries are pre-installed on the GHG Center Hub. If you need to run this notebook elsewhere, please install them with this line in a code cell:
%pip install requests folium rasterstats pystac_client pandas matplotlib --quiet
Querying the STAC API
First, we import the required libraries. Once imported, they let us query the GHG Center SpatioTemporal Asset Catalog (STAC) Application Programming Interface (API), where the granules for this collection are stored.
# Provide the STAC and RASTER API endpoints
# The endpoint is referring to a location within the API that executes a request on a data collection nesting on the server.
# The STAC API is a catalog of all the existing data collections that are stored in the GHG Center.
STAC_API_URL = "http://ghg.center/api/stac"
# The RASTER API is used to fetch collections for visualization
RASTER_API_URL = "https://ghg.center/api/raster"
# The collection name is used to fetch the dataset from the STAC API. First, we define the collection name as a variable
# Name of the collection for gridded methane dataset
collection_name = "epa-ch4emission-yeargrid-v2express"
# Fetch the collection from the STAC API using the appropriate endpoint
# The 'requests' library makes an HTTP request possible
collection = requests.get(f"{STAC_API_URL}/collections/{collection_name}").json()
# Print the properties of the collection to the console
collection
Examining the contents of our collection under the temporal variable, we see that the data is available from January 2012 to December 2020. By looking at the dashboard:time_density field, we observe that the periodic frequency of these observations is yearly.
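To make the inspection above concrete, here is a minimal sketch of reading the temporal extent programmatically. It works on a trimmed-down, hypothetical collection document rather than the live API response: the nesting under `extent.temporal.interval` follows the STAC Collection layout, while the datetime values, the `dashboard:time_density` key spelling, and the helper name `temporal_range` are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical, trimmed-down collection document for illustration only.
# The datetime values are placeholders consistent with the range in the text.
sample_collection = {
    "id": "epa-ch4emission-yeargrid-v2express",
    "extent": {
        "temporal": {
            "interval": [["2012-01-01T00:00:00Z", "2020-12-31T00:00:00Z"]]
        }
    },
    "dashboard:time_density": "year",  # assumed key spelling
}


def temporal_range(collection: dict) -> tuple:
    """Return (start, end) datetimes from the first temporal interval."""
    start, end = collection["extent"]["temporal"]["interval"][0]

    def parse(stamp: str) -> datetime:
        # datetime.fromisoformat does not accept a trailing "Z" on older Pythons
        return datetime.fromisoformat(stamp.replace("Z", "+00:00"))

    return parse(start), parse(end)


start, end = temporal_range(sample_collection)
print(start.year, end.year, sample_collection.get("dashboard:time_density"))
```

In the notebook itself, the same helper could be pointed at the `collection` variable, provided the response has the same layout.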
# Create a function that would search for a data collection in the US GHG Center STAC API
# First, we need to define the function
# The name of the function = "get_item_count"
# The argument that will be passed through the defined function = "collection_id"
def get_item_count(collection_id):
    # Set a counter for the number of items existing in the collection
    count = 0
    # Define the path to retrieve the granules (items) of the collection of interest in the STAC API
    items_url = f"{STAC_API_URL}/collections/{collection_id}/items"
    # Run a while loop to make HTTP requests until there are no more URLs associated with the collection in the STAC API
    while True:
        # Retrieve information about the granules by sending a "get" request to the STAC API using the defined collection path
        response = requests.get(items_url)
        # If the items do not exist, print an error message and quit the loop
        if not response.ok:
            print("error getting items")
            exit()
        # Return the results of the HTTP response as JSON
        stac = response.json()
        # Increase the "count" by the number of items (granules) returned in the response
        count += int(stac["context"].get("returned", 0))
        # Retrieve information about the next URL associated with the collection in the STAC API (if applicable)
        next = [link for link in stac["links"] if link["rel"] == "next"]
        # Exit the loop if there are no other URLs
        if not next:
            break
        # Ensure the information gathered by other STAC API links associated with the collection are added to the original path
        # "href" is the identifier for each of the tiles stored in the STAC API
        items_url = next[0]["href"]
    # Return the information about the total number of granules found associated with the collection
    return count
# Apply the function created above "get_item_count" to the data collection
number_of_items = get_item_count(collection_name)
# Get the information about the number of granules found in the collection
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]
# Print the total number of items (granules) found
print(f"Found {len(items)} items")
This makes sense: the collection covers the 9 years from 2012 through 2020, so there are 9 records in total.
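The "follow the next link" pattern used by `get_item_count` can be tried without network access. The sketch below is an illustration under stated assumptions, not the notebook's implementation: `FAKE_PAGES`, `fetch_page`, `count_items`, and the `example.test` URLs are all made up, and it counts the features on each page instead of reading the `context.returned` field.

```python
# Hypothetical in-memory "server": two pages of results joined by a "next" link.
# The URLs and item ids are invented for illustration.
FAKE_PAGES = {
    "https://example.test/items": {
        "features": [{"id": "item-a"}, {"id": "item-b"}],
        "links": [{"rel": "next", "href": "https://example.test/items?page=2"}],
    },
    "https://example.test/items?page=2": {
        "features": [{"id": "item-c"}],
        "links": [],
    },
}


def fetch_page(url: str) -> dict:
    """Stand-in for requests.get(url).json(); no network access is made."""
    return FAKE_PAGES[url]


def count_items(first_url: str) -> int:
    """Follow "next" links until none remain, summing the features per page."""
    count = 0
    url = first_url
    while url is not None:
        page = fetch_page(url)
        count += len(page["features"])
        next_links = [link for link in page["links"] if link["rel"] == "next"]
        url = next_links[0]["href"] if next_links else None
    return count


print(count_items("https://example.test/items"))  # prints 3
```

Using a separate name such as `next_links` also avoids shadowing Python's built-in `next` function.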
# Examine the first item in the collection
# Keep in mind that a list starts from 0, 1, 2... therefore items[0] is referring to the first item in the list/collection
items[0]
Exploring Changes in Methane (CH4) Levels Using the Raster API
In this notebook, we will explore the impacts of methane emissions by examining changes over time in urban regions. We will visualize the outputs on a map using folium.
# Now we create a dictionary where the start datetime values for each granule is queried more explicitly by year and month (e.g., 2020-02)
items = {item["properties"]["datetime"][:7]: item for item in items}
# Next, we need to specify the asset name for this collection
# The asset name is referring to the raster band containing the pixel values for the parameter of interest
# For the case of the U.S. Gridded Anthropogenic Methane Emissions Inventory collection, the parameter of interest is "surface-coal"
asset_name = "surface-coal"
Below, we enter minimum and maximum values to provide our upper and lower bounds in rescale_values.
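As a rough illustration of what rescale bounds do, the sketch below maps a data value onto the 0 to 1 range that a colormap is typically indexed by. It assumes a linear stretch with clamping, which is a common way such rescaling is done but is not confirmed here as the Raster API's exact behavior; the function name `rescale_to_unit` and the sample numbers are hypothetical.

```python
def rescale_to_unit(value: float, vmin: float, vmax: float) -> float:
    """Linearly map value from [vmin, vmax] to [0, 1], clamping outliers."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    fraction = (value - vmin) / (vmax - vmin)
    # Values below vmin or above vmax saturate at the ends of the colormap
    return min(1.0, max(0.0, fraction))


# Placeholder bounds and values, chosen only to show the three cases
print(rescale_to_unit(5.0, 0.0, 10.0))   # midpoint of the range
print(rescale_to_unit(-1.0, 0.0, 10.0))  # below the lower bound
print(rescale_to_unit(20.0, 0.0, 10.0))  # above the upper bound
```

Under this assumption, choosing a narrower range than the histogram minimum and maximum increases contrast at the cost of saturating the extremes.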
# Fetch the min and max values from the histogram of the first item's asset
first_item = items[list(items.keys())[0]]
histogram = first_item["assets"][asset_name]["raster:bands"][0]["histogram"]
rescale_values = {"max": histogram["max"], "min": histogram["min"]}
Now, we will pass the item id, collection name, asset name, and the rescaling factor to the Raster API endpoint. We will do this twice, once for January 2018 and again for January 2012, so that we can visualize each event independently.
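The query string for the tilejson request can also be assembled from a dictionary, which makes each parameter easier to read and change. This is a sketch using Python's standard `urllib.parse.urlencode`, not the notebook's own approach; the item id and the rescale bounds are placeholders, whereas in the notebook they come from the STAC item and `rescale_values`.

```python
from urllib.parse import urlencode

# Placeholder inputs for illustration only
raster_api_url = "https://ghg.center/api/raster"
params = {
    "collection": "epa-ch4emission-yeargrid-v2express",
    "item": "hypothetical-item-id",     # placeholder, not a real item id
    "assets": "surface-coal",
    "color_formula": "gamma r 1.05",    # spaces are encoded as "+"
    "colormap_name": "rainbow",
    "rescale": "0,100",                 # placeholder min,max bounds
}

# safe="," keeps the comma in the rescale pair unescaped
query = urlencode(params, safe=",")
tilejson_url = f"{raster_api_url}/stac/tilejson.json?{query}"
print(tilejson_url)
```

The encoded `color_formula` comes out as `gamma+r+1.05`, matching the form written by hand in the request.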
# Choose a color map for displaying the first observation (event)
# Please refer to matplotlib library if you'd prefer choosing a different color ramp.
# For more information on Colormaps in Matplotlib, please visit https://matplotlib.org/stable/users/explain/colors/colormaps.html
color_map = "rainbow"
# Make a GET request to retrieve information for the 2018 tile
january_2018_tile = requests.get(
    # Pass the collection name, the item number in the list, and its ID
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2018-01']['collection']}&item={items['2018-01']['id']}"
    # Pass the asset name
    f"&assets={asset_name}"
    # Pass the color formula and colormap for custom visualization
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    # Pass the minimum and maximum values for rescaling
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
# Return the response in JSON format
).json()
# Print the properties of the retrieved granule to the console
january_2018_tile
# Make a GET request to retrieve information for the 2012 tile
january_2012_tile = requests.get(
    # Pass the collection name, the item number in the list, and its ID
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items['2012-01']['collection']}&item={items['2012-01']['id']}"
    # Pass the asset name
    f"&assets={asset_name}"
    # Pass the color formula and colormap for custom visualization
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    # Pass the minimum and maximum values for rescaling
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
# Return the response in JSON format
).json()
# Print the properties of the retrieved granule to the console
january_2012_tile
Visualizing CH₄ emissions
# Set initial zoom and center of map for CH₄ Layer
# Centre of map [latitude,longitude]
# 'folium.plugins' allows mapping side-by-side
map_ = folium.plugins.DualMap(location=(34, -118), zoom_start=6)
# Define the first map layer (January 2018)
map_layer_2018 = TileLayer(
    tiles=january_2018_tile["tiles"][0], # Path to retrieve the tile
    attr="GHG", # Set the attribution
    opacity=0.7, # Adjust the transparency of the layer
)
# Add the first layer to the Dual Map
map_layer_2018.add_to(map_.m1)
# Define the second map layer (January 2012)
map_layer_2012 = TileLayer(
    tiles=january_2012_tile["tiles"][0], # Path to retrieve the tile
    attr="GHG", # Set the attribution
    opacity=0.7, # Adjust the transparency of the layer
)
# Add the second layer to the Dual Map
map_layer_2012.add_to(map_.m2)
# Visualize the Dual Map
map_
Calculating Zonal Statistics
To perform zonal statistics, first we need to create a polygon. In this use case we are creating a polygon in Texas (USA).
# Texas, USA
texas_aoi = {
    "type": "Feature", # Create a feature object
    "properties": {},
    "geometry": { # Set the bounding coordinates for the polygon
        "coordinates": [
            [
                # [13.686159004559698, -21.700046934333145],
                # [13.686159004559698, -23.241974326585833],
                # [14.753560168039911, -23.241974326585833],
                # [14.753560168039911, -21.700046934333145],
                # [13.686159004559698, -21.700046934333145],
                [-95, 29], # South-east bounding coordinate
                [-95, 33], # North-east bounding coordinate
                [-104, 33], # North-west bounding coordinate
                [-104, 29], # South-west bounding coordinate
                [-95, 29] # South-east bounding coordinate (closing the polygon)
            ]
        ],
        "type": "Polygon",
    },
}
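A GeoJSON polygon ring lists positions as [longitude, latitude] and must end on the same position it starts from. The sketch below checks those two properties and derives the bounding box of the ring; the helper name `check_ring` is hypothetical, and the coordinates are copied from the Texas polygon above.

```python
def check_ring(ring):
    """Validate a closed ring and return (min_lon, min_lat, max_lon, max_lat)."""
    if len(ring) < 4:
        raise ValueError("a linear ring needs at least four positions")
    if ring[0] != ring[-1]:
        raise ValueError("first and last positions must match to close the ring")
    lons = [lon for lon, lat in ring]
    lats = [lat for lon, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)


# Outer ring of the Texas area of interest, in [longitude, latitude] order
texas_ring = [[-95, 29], [-95, 33], [-104, 33], [-104, 29], [-95, 29]]
print(check_ring(texas_ring))  # prints (-104, 29, -95, 33)
```

Swapped latitude and longitude values are a common source of empty results, so a quick bounding-box printout like this is a cheap sanity check before sending the polygon to an API.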
# Create a new map to display the generated polygon
# We'll plug in the coordinates for a location
# Central to the study area and a reasonable zoom level
aoi_map = Map(
    # Base map is set to OpenStreetMap
    tiles="OpenStreetMap",
    # Define the spatial properties for the map
    location=[
        30, -100
    ],
    # Set the zoom value
    zoom_start=6,
)
# Insert the polygon to the map
folium.GeoJson(texas_aoi, name="Texas, USA").add_to(aoi_map)
# Visualize the map
aoi_map
# Check total number of items available within the collection
items = requests.get(
    f"{STAC_API_URL}/collections/{collection_name}/items?limit=300"
).json()["features"]
# Print the total number of items (granules) found
print(f"Found {len(items)} items")
# Examine the first item in the collection
items[0]
Now that we created the polygon for the area of interest, we need to develop a function that runs through the data collection and generates the statistics for a specific item (granule) within the boundaries of the AOI polygon.
# The bounding box should be passed to the geojson param as a geojson Feature or FeatureCollection
# Create a function that retrieves information regarding a specific granule using its asset name and raster identifier and generates the statistics for it
# The function takes an item (granule) and a JSON (polygon) as input parameters
def generate_stats(item, geojson):
    # A POST request is made to submit the data associated with the item of interest (specific observation) within the boundaries of the polygon to compute its statistics
    result = requests.post(
        # Raster API Endpoint for computing statistics
        f"{RASTER_API_URL}/cog/statistics",
        # Pass the URL to the item, asset name, and raster identifier as parameters
        params={"url": item["assets"][asset_name]["href"]},
        # Send the GeoJSON object (polygon) along with the request
        json=geojson,
    # Return the response in JSON format
    ).json()
    # Return a dictionary containing the computed statistics along with the item's datetime information.
    return {
        **result["properties"],
        "datetime": item["properties"]["datetime"],
    }
With the function above we can generate the statistics for the AOI.
%%time
# %%time = Wall time (execution time) for running the code below
# Generate statistics using the created function "generate_stats" within the bounding box defined by the polygon
stats = [generate_stats(item, texas_aoi) for item in items]
# Print the stats for the first item in the collection
stats[0]
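To make the idea of zonal statistics concrete, the following toy sketch summarizes only the grid cells whose centers fall inside a rectangular zone. It is a simplified stand-in for what the statistics endpoint computes on the server, not a description of its implementation: the `cells` list, its values, and the `zonal_stats` helper are invented for illustration.

```python
# Toy grid for illustration only: each cell is (longitude, latitude, value) at
# the cell center. The values are made up and are not taken from the dataset.
cells = [
    (-100.0, 30.0, 2.0),
    (-99.0, 31.0, 4.0),
    (-96.0, 32.0, 6.0),
    (-90.0, 30.0, 100.0),   # east of the zone, ignored
    (-100.0, 40.0, 50.0),   # north of the zone, ignored
]


def zonal_stats(cells, min_lon, min_lat, max_lon, max_lat):
    """Summarize the values of cells whose centers lie inside the box."""
    inside = [
        value
        for lon, lat, value in cells
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]
    if not inside:
        return None
    return {
        "min": min(inside),
        "max": max(inside),
        "mean": sum(inside) / len(inside),
        "sum": sum(inside),
        "count": len(inside),
    }


# Same bounds as the Texas polygon: longitude -104 to -95, latitude 29 to 33
stats_example = zonal_stats(cells, -104, 29, -95, 33)
print(stats_example)
```

Only the three cells inside the box contribute, so the two large outliers outside the zone do not affect the summary.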
Create a function that goes through every single item in the collection and populates their properties - including the minimum, maximum, and sum of their values - in a table.
# Create a function that converts statistics in JSON format into a pandas DataFrame
def clean_stats(stats_json) -> pd.DataFrame:
    # Normalize the JSON data
    df = pd.json_normalize(stats_json)
    # Replace the naming "statistics.b1" in the columns
    df.columns = [col.replace("statistics.b1.", "") for col in df.columns]
    # Set the datetime format
    df["date"] = pd.to_datetime(df["datetime"])
    # Return the cleaned format
    return df

# Apply the generated function on the stats data
df = clean_stats(stats)
# Display the stats for the first 5 granules in the collection in the table
# Change the value in the parenthesis to show more or a smaller number of rows in the table
df.head(5)
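The column renaming in `clean_stats` is easier to follow once the flattening step is visible. The sketch below mimics, in plain Python, the idea of turning nested keys into dotted names and then stripping the `statistics.b1.` prefix. It is not how pandas implements `json_normalize`; the `flatten` helper and the sample record (including its values) are assumptions for illustration.

```python
def flatten(record: dict, parent_key: str = "") -> dict:
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


# Hypothetical statistics record; the numbers are placeholders
sample = {
    "statistics": {"b1": {"min": 0.0, "max": 3.5}},
    "datetime": "2012-01-01T00:00:00+00:00",
}

flat = flatten(sample)
row = {key.replace("statistics.b1.", ""): value for key, value in flat.items()}
print(row)
```

After the prefix is removed, the keys read `min`, `max`, and `datetime`, which is why the plotting code can refer to a column simply as `max`.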
Visualizing the Data as a Time Series
We can now explore the gridded methane emission (Domestic Wastewater Treatment & Discharge (5D)) time series (January 2012 to December 2020) available for the Dallas, Texas area of the U.S. We can plot the data set using the code below:
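Before plotting or comparing years, it can be useful to put the records in chronological order, since the order in which items arrive is not guaranteed to be chronological. The sketch below sorts a small series by date and computes the percent change between the first and last year. The `series` values are placeholders rather than real results, and the variable names are hypothetical.

```python
from datetime import datetime

# Placeholder (datetime string, value) pairs, deliberately out of order.
# These numbers are invented and are not results from the dataset.
series = [
    ("2014-01-01T00:00:00+00:00", 8.0),
    ("2012-01-01T00:00:00+00:00", 10.0),
    ("2013-01-01T00:00:00+00:00", 9.0),
]

# Sort by the parsed datetime so the first entry is the earliest year
ordered = sorted(series, key=lambda pair: datetime.fromisoformat(pair[0]))

first_value = ordered[0][1]
last_value = ordered[-1][1]
percent_change = (last_value - first_value) / first_value * 100

print([stamp[:4] for stamp, _ in ordered])
print(percent_change)
```

With a pandas DataFrame such as `df`, the equivalent ordering step would be a sort on the `date` column before plotting.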
# Figure size: 20 representing the width, 10 representing the height
fig = plt.figure(figsize=(20, 10))
plt.plot(
    df["date"], # X-axis: sorted date
    df["max"], # Y-axis: maximum CH4 emission
    color="red", # Line color
    linestyle="-", # Line style
    linewidth=0.5, # Line width
    label="Max CH4 emissions", # Legend label
)
# Display legend
plt.legend()
# Insert label for the X-axis
plt.xlabel("Years")
# Insert label for the Y-axis
plt.ylabel("CH4 emissions Molecules CH₄/cm²/s")
# Insert title for the plot
plt.title("CH4 gridded methane emission from Domestic Wastewater Treatment & Discharge (5D) for Dallas, Texas (2012-2020)")
# Print the properties for the 3rd item in the collection
print(items[2]["properties"]["datetime"])
# A GET request is made for the 2016 tile
tile_2016 = requests.get(
    # Pass the collection name, the item number in the list, and its ID
    f"{RASTER_API_URL}/stac/tilejson.json?collection={items[2]['collection']}&item={items[2]['id']}"
    # Pass the asset name
    f"&assets={asset_name}"
    # Pass the color formula and colormap for custom visualization
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    # Pass the minimum and maximum values for rescaling
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
# Return the response in JSON format
).json()
# Print the properties of the retrieved granule to the console
tile_2016
# Create a new map to display the 2016 tile
aoi_map_bbox = Map(
    # Base map is set to OpenStreetMap
    tiles="OpenStreetMap",
    # Set the center of the map
    location=[
        30, -100
    ],
    # Set the zoom value
    zoom_start=8,
)
# Define the map layer
map_layer = TileLayer(
    # Path to retrieve the tile
    tiles=tile_2016["tiles"][0],
    # Set the attribution and adjust the transparency of the layer
    attr="GHG", opacity=0.5
)
# Add the layer to the map
map_layer.add_to(aoi_map_bbox)
# Visualize the map
aoi_map_bbox
Summary
In this notebook we have successfully completed the following steps for the STAC collection for the U.S. Gridded Anthropogenic Methane Emissions Inventory dataset:
- Install and import the necessary libraries
- Fetch the collection from STAC collections using the appropriate endpoints
- Count the number of existing granules within the collection
- Map and compare the anthropogenic methane emissions for two distinct years
- Generate zonal statistics for the area of interest (AOI)
- Generate a time-series graph of the anthropogenic methane emissions for a specified region
If you have any questions regarding this user notebook, please contact us using the feedback form.