gedixr

Global Ecosystem Dynamics Investigation (GEDI) L2A/L2B -> GeoParquet & GeoDataFrame/Xarray

https://github.com/maawoo/gedixr

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary

Keywords

earth-observation gedi geodataframe geopackage geopandas geoparquet geospatial lidar xarray
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: maawoo
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 97.7 KB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 7
  • Releases: 5
Created over 2 years ago · Last pushed over 1 year ago

README.md

gedixr

Extract the variables you need from GEDI L2A/L2B files and start working with them as a geopandas.GeoDataFrame or xarray.Dataset in no time!

Installation

Latest state on GitHub

  1. Create and activate an environment with the required dependencies:

     ```bash
     conda env create --file https://raw.githubusercontent.com/maawoo/gedixr/main/environment.yml
     conda activate gedixr_env
     ```

     I recommend checking out Mamba/Micromamba as a faster alternative to Conda.

  2. Install the gedixr package into the activated environment:

     ```bash
     pip install git+https://github.com/maawoo/gedixr.git
     ```

Specific version

See the Tags section of the repository for available versions to install:

```bash
conda env create --file https://raw.githubusercontent.com/maawoo/gedixr/v0.4.0/environment.yml
conda activate gedixr_env
pip install git+https://github.com/maawoo/gedixr.git@v0.4.0
```

Usage

After downloading GEDI L2A/L2B v002 files from NASA Earthdata Search [1], you will end up with a set of zipped HDF5 files. After unzipping them [2], you can use the extract_data function to recursively find all relevant files in a directory and extract biophysical variables for each shot (see "Current defaults" for the variables extracted by default). You can then work with the result as a geopandas.GeoDataFrame in Python or open the created vector file in your favorite GIS software.
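
The unzipping step can be scripted with the Python standard library. The following is a minimal sketch, not part of gedixr: the helper name `unzip_all` and both directory arguments are hypothetical, and it assumes the downloads are ordinary `.zip` archives.

```python
import zipfile
from pathlib import Path


def unzip_all(src_dir, dst_dir):
    """Extract every .zip archive found below src_dir into dst_dir.

    Hypothetical helper for illustration; gedixr does not provide it.
    Returns the paths of all extracted members.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() materializes the search results and fixes the processing order
    for archive in sorted(Path(src_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dst)
            extracted.extend(dst / name for name in zf.namelist())
    return extracted
```

The directory passed as `dst_dir` can then be handed to extract_data via its directory parameter.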

Basic example

The NASA Earthdata Search platform mentioned above lets you subset the GEDI data to your area of interest during the download process. This saves disk space and makes the extraction process straightforward:

```python
from gedixr.gedi import extract_data

gedi_dir = "directory/containing/gedi/products"

# Extract the default variables of each product into a GeoDataFrame
gdf_l2a = extract_data(directory=gedi_dir, gedi_product='L2A')
gdf_l2b = extract_data(directory=gedi_dir, gedi_product='L2B')
```

The directory you provide is searched recursively, and only files that match the product given via the gedi_product parameter are considered.
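
To preview which files such a recursive search would pick up, you can run a similar search yourself. This is only a sketch of the idea and not gedixr's actual implementation: the helper `find_gedi_files` is hypothetical, and it assumes the files keep the standard GEDI granule naming, in which L2A file names contain `GEDI02_A` and L2B file names contain `GEDI02_B`.

```python
from pathlib import Path

# Assumption: standard GEDI Level 2 granule names contain these tokens.
PRODUCT_TOKEN = {"L2A": "GEDI02_A", "L2B": "GEDI02_B"}


def find_gedi_files(directory, gedi_product):
    """Recursively list HDF5 files whose name matches the given product.

    Hypothetical helper for illustration; not gedixr's own search routine.
    """
    token = PRODUCT_TOKEN[gedi_product]
    return sorted(
        path for path in Path(directory).rglob("*.h5")
        if token in path.name
    )
```

Matching on a substring rather than a prefix also covers files that a subsetting service has renamed with a leading tag.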

If you extracted variables from L2A and L2B files of the same spatial and temporal extents, you can then merge both GeoDataFrames:

```python
from gedixr.xr import merge_gdf

gdf = merge_gdf(l2a=gdf_l2a, l2b=gdf_l2b)
```

If you want to rasterize the GeoDataFrame and use the data as an xarray.Dataset:

```python
from gedixr.xr import gdf_to_xr

ds = gdf_to_xr(gdf=gdf)
```

If you want to load previously extracted data:

```python
from gedixr.xr import load_to_gdf

gdf = load_to_gdf(l2a="path/to/extracted_l2a.parquet")
```

Custom subsetting

If your GEDI data is not subsetted (i.e., each file covers an entire orbit), you can provide a vector file (e.g. GeoJSON or GeoPackage) to extract metrics for your area of interest. You can also provide a list of vector files to extract data for multiple areas at the same time:

```python
from gedixr.gedi import extract_data

l2a_dict = extract_data(directory="directory/containing/gedi/products",
                        gedi_product='L2A',
                        subset_vector=["path/to/aoi_1.geojson",
                                       "path/to/aoi_2.geojson"])
```

Please note that if the subset_vector parameter is used, a dictionary with the following key-value pairs is returned: {'<Vector Basename>': {'geo': Polygon, 'gdf': GeoDataFrame}}

Given the above example, you can access the extracted GeoDataFrame of each area like this:

```python
aoi_1_gdf = l2a_dict['aoi_1']['gdf']
aoi_2_gdf = l2a_dict['aoi_2']['gdf']
```
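
With many areas, it can be more convenient to collect all GeoDataFrames in one step instead of indexing each area by hand. A minimal sketch that relies only on the dictionary layout described above; the helper name `gdfs_by_area` is hypothetical:

```python
def gdfs_by_area(result):
    """Map each area name to its extracted GeoDataFrame.

    `result` is expected to follow the layout described above:
    {'<Vector Basename>': {'geo': Polygon, 'gdf': GeoDataFrame}}
    """
    return {name: entry["gdf"] for name, entry in result.items()}


# Stand-in values replace the real Polygon and GeoDataFrame objects here,
# so that the sketch runs without any GEDI data.
example = {
    "aoi_1": {"geo": "polygon-1", "gdf": "gdf-1"},
    "aoi_2": {"geo": "polygon-2", "gdf": "gdf-2"},
}
for name, gdf in gdfs_by_area(example).items():
    print(name, gdf)
```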

Extract from specific beams

The beams parameter can be used to specify which beams to extract data from. By default, data will be extracted from all beams (full power and coverage). You can use beams='full' (or 'coverage') to only extract from one or the other. Alternatively, you can provide a list of beam names, e.g.: beams=['BEAM0101', 'BEAM0110']
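
For orientation, the sketch below shows which beam names each option refers to. It is an illustration and not gedixr's own code: the function `resolve_beams` is hypothetical, and the grouping follows the GEDI instrument layout of four coverage beams and four full power beams.

```python
# GEDI beam names grouped by laser type.
COVERAGE_BEAMS = ["BEAM0000", "BEAM0001", "BEAM0010", "BEAM0011"]
FULL_POWER_BEAMS = ["BEAM0101", "BEAM0110", "BEAM1000", "BEAM1011"]


def resolve_beams(beams=None):
    """Translate a beams argument into an explicit list of beam names.

    Hypothetical helper that mirrors the options described above.
    """
    if beams is None:
        return COVERAGE_BEAMS + FULL_POWER_BEAMS
    if beams == "full":
        return list(FULL_POWER_BEAMS)
    if beams == "coverage":
        return list(COVERAGE_BEAMS)
    if isinstance(beams, str):
        raise ValueError(f"Unknown beam group: {beams!r}")
    unknown = set(beams) - set(COVERAGE_BEAMS + FULL_POWER_BEAMS)
    if unknown:
        raise ValueError(f"Unknown beam names: {sorted(unknown)}")
    return list(beams)
```

Passing `beams=['BEAM0101', 'BEAM0110']`, as in the example above, therefore selects two of the four full power beams.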

Current defaults

Extracted variables

In addition to shot number, acquisition time and geolocation information, the following variables are extracted by default if no custom variables are provided via the variables parameter:

L2A:
- rh98: Relative height metrics at 98% interval

L2B:
- rh100: Height above ground of the received waveform signal start (rh101 from L2A)
- tcc: Total canopy cover
- fhd: Foliage Height Diversity
- pai: Total Plant Area Index

See also the following sources for overviews of the layers contained in each product: L2A and L2B

Quality filtering

The extraction process automatically applies quality filtering. By default, a shot is kept only if all of the following conditions hold:
- quality_flag == 1
- degrade_flag == 0
- num_detectedmodes > 0
- abs(elev_lowestmode - digital_elevation_model) < 100

Please note that quality_flag already includes filtering to a sensitivity range of 0.9 - 1.0.

If you want to apply a different quality filtering strategy, you can disable the default filtering by setting apply_quality_filter=False and apply your own filtering after the extraction process.
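
A custom strategy could, for example, reuse the default conditions but require a higher sensitivity. The following is a sketch under assumptions: the function `passes_quality_filter` is hypothetical, and the field names follow the GEDI variable names listed above, which may differ from the column names in your extracted data.

```python
def passes_quality_filter(shot, min_sensitivity=0.9, max_elev_diff=100):
    """Return True if a single shot passes the quality checks.

    `shot` is any mapping of variable names to values, e.g. a dict.
    Assumption: the field names match the GEDI variable names; adjust
    them to the columns that are actually present in your data.
    """
    elev_diff = abs(shot["elev_lowestmode"] - shot["digital_elevation_model"])
    return (
        shot["quality_flag"] == 1
        and shot["degrade_flag"] == 0
        and shot["num_detectedmodes"] > 0
        and shot["sensitivity"] >= min_sensitivity
        and elev_diff < max_elev_diff
    )


shots = [
    {"quality_flag": 1, "degrade_flag": 0, "num_detectedmodes": 2,
     "sensitivity": 0.97, "elev_lowestmode": 250.0,
     "digital_elevation_model": 245.0},
    {"quality_flag": 1, "degrade_flag": 0, "num_detectedmodes": 1,
     "sensitivity": 0.92, "elev_lowestmode": 250.0,
     "digital_elevation_model": 245.0},
]
# Keep only shots with a sensitivity of at least 0.95
strict = [s for s in shots if passes_quality_filter(s, min_sensitivity=0.95)]
```

Because the predicate only indexes by name, it should also work row-wise on a GeoDataFrame, e.g. `gdf[gdf.apply(passes_quality_filter, axis=1)]`, provided the columns exist.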

Notes

[1] See #1 for a related issue regarding the download of GEDI data.

[2] The products need to be unzipped first, which can seriously increase the amount of disk space needed (~90 MB compressed -> ~3 GB uncompressed... per file!). A solution is a work in progress and is being tracked in #2.
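
Until then, it may help to check how much space the archives will need before unzipping them. A minimal sketch using only the Python standard library; both helper names are hypothetical, and the sketch assumes ordinary `.zip` archives:

```python
import shutil
import zipfile
from pathlib import Path


def uncompressed_size(src_dir):
    """Total uncompressed size in bytes of all .zip archives below src_dir."""
    total = 0
    for archive in Path(src_dir).rglob("*.zip"):
        with zipfile.ZipFile(archive) as zf:
            # file_size is the uncompressed size recorded for each member
            total += sum(info.file_size for info in zf.infolist())
    return total


def fits_on_disk(src_dir, dst_dir):
    """Check whether the extracted archives would fit into dst_dir."""
    return uncompressed_size(src_dir) <= shutil.disk_usage(dst_dir).free
```

`dst_dir` has to exist already, since `shutil.disk_usage` inspects the file system that contains it.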

Owner

  • Name: Marco Wolsza
  • Login: maawoo
  • Kind: user
  • Location: Jena, Germany
  • Company: Friedrich Schiller University Jena

PhD student @Jena-Earth-Observation-School

Citation (CITATION.cff)

cff-version: 1.2.0
title: gedixr
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marco
    family-names: Wolsza
    email: marco.wolsza@uni-jena.de
    affiliation: University of Jena
    orcid: 'https://orcid.org/0000-0002-5231-7208'
identifiers:
  - type: url
    value: 'https://github.com/maawoo/gedixr/tree/v0.4.0'
    description: The URL of version 0.4.0 of the software.
repository-code: 'https://github.com/maawoo/gedixr'
license: MIT
commit: f6742d2
version: 0.4.0
date-released: '2024-08-23'

GitHub Events

Total
  • Create event: 11
  • Release event: 1
  • Issues event: 2
  • Watch event: 1
  • Delete event: 10
  • Issue comment event: 4
  • Push event: 12
  • Pull request event: 16
  • Fork event: 1
Last Year
  • Create event: 11
  • Release event: 1
  • Issues event: 2
  • Watch event: 1
  • Delete event: 10
  • Issue comment event: 4
  • Push event: 12
  • Pull request event: 16
  • Fork event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 55
  • Total Committers: 2
  • Avg Commits per committer: 27.5
  • Development Distribution Score (DDS): 0.018
Past Year
  • Commits: 55
  • Committers: 2
  • Avg Commits per committer: 27.5
  • Development Distribution Score (DDS): 0.018
Top Committers
Name Email Commits
maawoo m****a@u****e 54
MarkusZehner m****z@g****m 1

Issues and Pull Requests

Last synced: about 2 years ago

All Time
  • Total issues: 7
  • Total pull requests: 5
  • Average time to close issues: 4 months
  • Average time to close pull requests: 7 days
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 0.29
  • Average comments per pull request: 0.2
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 7
  • Pull requests: 5
  • Average time to close issues: 4 months
  • Average time to close pull requests: 7 days
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 0.29
  • Average comments per pull request: 0.2
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • maawoo (13)
  • diesseh (2)
Pull Request Authors
  • maawoo (21)
  • AntjeUhde (2)
  • MarkusZehner (1)
Top Labels
Issue Labels
enhancement (8) documentation (3) bug (2) good first issue (1)
Pull Request Labels
enhancement (8) bug (2)