gedixr

Global Ecosystem Dynamics Investigation (GEDI) L2A/L2B -> GeoParquet & GeoDataFrame/Xarray

https://github.com/maawoo/gedixr

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.1%) to scientific vocabulary

Keywords

earth-observation gedi geodataframe geopackage geopandas geoparquet geospatial lidar xarray

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: maawoo
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 97.7 KB

Statistics

Stars: 9
Watchers: 1
Forks: 1
Open Issues: 7
Releases: 5

Topics

earth-observation gedi geodataframe geopackage geopandas geoparquet geospatial lidar xarray

Created over 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme License Citation

gedixr

Extract the variables you need from GEDI L2A/L2B files and start working with them as a geopandas.GeoDataFrame or xarray.Dataset in no time!

Installation

Latest state on GitHub

1. Create and activate an environment with the required dependencies:

       conda env create --file https://raw.githubusercontent.com/maawoo/gedixr/main/environment.yml
       conda activate gedixr_env

   I recommend checking out Mamba/Micromamba as a faster alternative to Conda.

2. Install the gedixr package into the activated environment:

       pip install git+https://github.com/maawoo/gedixr.git

Specific version

See the Tags section of the repository for available versions to install:

    conda env create --file https://raw.githubusercontent.com/maawoo/gedixr/v0.4.0/environment.yml
    conda activate gedixr_env
    pip install git+https://github.com/maawoo/gedixr.git@v0.4.0

Usage

After downloading GEDI L2A/L2B v002 files from NASA Earthdata Search¹, you will end up with a bunch of zipped HDF5 files. After unzipping² them, you can use the extract_data function to recursively find all relevant files in a directory and extract biophysical variables for each shot (see Current defaults for the default variables). You can then work with the result as a geopandas.GeoDataFrame in Python or open the created vector file in your favorite GIS software.
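The unzipping step can be done with any tool. As a minimal sketch using only the Python standard library, the hypothetical helper `unzip_all` below (not part of gedixr) extracts every archive found below a directory next to the archive itself. Keep the disk space warning from note ² in mind before running something like this on many files.

```python
import zipfile
from pathlib import Path


def unzip_all(directory: str) -> list[Path]:
    """Extract every .zip archive found below `directory` next to the archive.

    Hypothetical helper for illustration; gedixr does not provide it.
    """
    extracted = []
    # sorted() materializes the archive list before anything is extracted
    for archive in sorted(Path(directory).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
            extracted.extend(archive.parent / name for name in zf.namelist())
    return extracted
```

Calling `unzip_all("directory/containing/gedi/products")` returns the paths of the extracted files, which can then be passed on to the extraction step via their parent directory.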

Basic example

The NASA Earthdata Search platform mentioned above allows you to subset the GEDI data to your area of interest during the download process. This saves disk space and makes the extraction process quite straightforward:

```python
from gedixr.gedi import extract_data

gedi_dir = "directory/containing/gedi/products"
gdf_l2a = extract_data(directory=gedi_dir, gedi_product='L2A')
gdf_l2b = extract_data(directory=gedi_dir, gedi_product='L2B')
```

The directory you provide is searched recursively, and only files that match the product given via the gedi_product parameter are considered.
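To make the recursive search more concrete, here is a rough sketch of how such a file discovery could look. It is not gedixr's actual implementation. The helper name `find_gedi_files` and the matching rule are assumptions: the sketch assumes that GEDI file names contain the product short name (`GEDI02_A` for L2A, `GEDI02_B` for L2B).

```python
from pathlib import Path

# Assumption: GEDI file names contain the product short name,
# e.g. "GEDI02_A_..." for L2A and "GEDI02_B_..." for L2B.
PRODUCT_SHORT_NAME = {"L2A": "GEDI02_A", "L2B": "GEDI02_B"}


def find_gedi_files(directory: str, gedi_product: str) -> list[Path]:
    """Recursively collect HDF5 files whose name matches the requested product."""
    short_name = PRODUCT_SHORT_NAME[gedi_product]
    return sorted(
        path
        for path in Path(directory).rglob("*.h5")
        if short_name in path.name
    )
```

With this rule, L2A and L2B files can live in the same directory tree and each call only picks up the files of one product.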

If you extracted variables from L2A and L2B files of the same spatial and temporal extents, you can then merge both GeoDataFrames:

```python
from gedixr.xr import merge_gdf

gdf = merge_gdf(l2a=gdf_l2a, l2b=gdf_l2b)
```
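Conceptually, merging the two products means combining the L2A and L2B records that belong to the same shot. The following plain-Python sketch illustrates that idea with lists of dictionaries. It is not how gedixr merges GeoDataFrames internally; the helper `merge_on_shot` and the key name `shot_number` are assumptions made for this illustration.

```python
def merge_on_shot(l2a_rows, l2b_rows, key="shot_number"):
    """Inner join: keep only shots that are present in both products.

    Hypothetical helper; the key name "shot_number" is an assumption.
    """
    l2b_by_shot = {row[key]: row for row in l2b_rows}
    merged = []
    for row in l2a_rows:
        match = l2b_by_shot.get(row[key])
        if match is not None:
            # On duplicate column names the L2A value is kept
            merged.append({**match, **row})
    return merged


l2a = [{"shot_number": 1, "rh98": 21.4}, {"shot_number": 2, "rh98": 8.0}]
l2b = [{"shot_number": 2, "pai": 1.3}, {"shot_number": 3, "pai": 0.2}]
merged = merge_on_shot(l2a, l2b)  # only shot 2 occurs in both inputs
```

This is also why the spatial and temporal extents should match: shots that occur in only one of the two inputs have no counterpart to be merged with.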

If you want to rasterize the GeoDataFrame and use the data as an xarray.Dataset:

```python
from gedixr.xr import gdf_to_xr

ds = gdf_to_xr(gdf=gdf)
```
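If you are wondering what rasterizing point data means in general, the sketch below bins points into square grid cells and averages the values per cell. This is a generic illustration only. The helper `rasterize_mean`, the cell size and the choice of the mean as aggregation are assumptions and do not describe what gdf_to_xr does internally.

```python
import math
from collections import defaultdict


def rasterize_mean(points, resolution):
    """Average point values per grid cell.

    points: iterable of (x, y, value) tuples
    resolution: cell size in the same units as x and y
    Returns a dict mapping (column, row) cell indices to the mean value.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running sum, count]
    for x, y, value in points:
        cell = (math.floor(x / resolution), math.floor(y / resolution))
        sums[cell][0] += value
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


shots = [(0.1, 0.1, 10.0), (0.4, 0.3, 20.0), (1.2, 0.1, 5.0)]
grid = rasterize_mean(shots, resolution=0.5)
```

The first two shots fall into the same cell and are averaged, while the third shot ends up in a cell of its own.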

If you want to load previously extracted data:

```python
from gedixr.xr import load_to_gdf

gdf = load_to_gdf(l2a="path/to/extracted_l2a.parquet")
```

Custom subsetting

If your GEDI data is not subsetted (i.e., each file covers an entire orbit), you can provide a vector file (e.g. GeoJSON, GeoPackage, etc.) to extract metrics for your area of interest. You can also provide a list of vector files to extract for multiple areas at the same time:

```python
from gedixr.gedi import extract_data

l2a_dict = extract_data(directory="directory/containing/gedi/products",
                        gedi_product='L2A',
                        subset_vector=["path/to/aoi_1.geojson",
                                       "path/to/aoi_2.geojson"])
```

Please note that if the subset_vector parameter is used, a dictionary with the following key, value pairs is returned: {'<Vector Basename>': {'geo': Polygon, 'gdf': GeoDataFrame}}

Given the above example, you can access the extracted GeoDataFrame of each area like this:

    aoi_1_gdf = l2a_dict['aoi_1']['gdf']
    aoi_2_gdf = l2a_dict['aoi_2']['gdf']
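Judging from the example above, the dictionary keys correspond to the vector file names without directory and extension. The short sketch below shows how such keys can be derived and how to loop over a result of that shape. Both helpers (`vector_basename`, `collect_gdfs`) are hypothetical and not part of gedixr, and the placeholder strings stand in for the real Polygon and GeoDataFrame objects.

```python
from pathlib import Path


def vector_basename(path: str) -> str:
    """File name without directory and extension, e.g. 'aoi_1'."""
    return Path(path).stem


def collect_gdfs(result: dict) -> dict:
    """Map each area name to its 'gdf' entry, dropping the 'geo' entry."""
    return {name: entry["gdf"] for name, entry in result.items()}


subset_vector = ["path/to/aoi_1.geojson", "path/to/aoi_2.geojson"]
keys = [vector_basename(p) for p in subset_vector]

# Placeholder values instead of real Polygon / GeoDataFrame objects
demo_result = {key: {"geo": "polygon", "gdf": f"table for {key}"} for key in keys}
gdfs = collect_gdfs(demo_result)
```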

Extract from specific beams

The beams parameter can be used to specify which beams to extract data from. By default, data will be extracted from all beams (full power and coverage). You can use beams='full' (or 'coverage') to only extract from one or the other. Alternatively, you can provide a list of beam names, e.g.: beams=['BEAM0101', 'BEAM0110']
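As a rough illustration of these options, the sketch below resolves a beams value to a list of beam names. It is not taken from the gedixr source. The helper `resolve_beams` is hypothetical, and the grouping into coverage and full power beams follows the GEDI beam naming as I understand it.

```python
# Assumption: GEDI beam grouping into coverage and full power beams
COVERAGE_BEAMS = ["BEAM0000", "BEAM0001", "BEAM0010", "BEAM0011"]
FULL_POWER_BEAMS = ["BEAM0101", "BEAM0110", "BEAM1000", "BEAM1011"]


def resolve_beams(beams=None):
    """Translate None, 'full', 'coverage' or a list of names into beam names."""
    if beams is None:
        return COVERAGE_BEAMS + FULL_POWER_BEAMS
    if beams == "full":
        return list(FULL_POWER_BEAMS)
    if beams == "coverage":
        return list(COVERAGE_BEAMS)
    unknown = set(beams) - set(COVERAGE_BEAMS + FULL_POWER_BEAMS)
    if unknown:
        raise ValueError(f"Unknown beam names: {sorted(unknown)}")
    return list(beams)
```

For example, `resolve_beams("full")` yields the four full power beams, and `resolve_beams(['BEAM0101', 'BEAM0110'])` returns the given list unchanged.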

Current defaults

Extracted variables

In addition to shot number, acquisition time and geolocation information, the following variables are extracted by default if no custom variables are provided via the variables parameter:

L2A:
- rh98: Relative height metrics at 98% interval

L2B:
- rh100: Height above ground of the received waveform signal start (rh101 from L2A)
- tcc: Total canopy cover
- fhd: Foliage Height Diversity
- pai: Total Plant Area Index

See also the following sources for overviews of the layers contained in each product: L2A and L2B

Quality filtering

The extraction process automatically applies quality filtering based on the quality_flag, degrade_flag and sensitivity variables, using the following default conditions:
- quality_flag == 1
- degrade_flag == 0
- num_detectedmodes > 0
- abs(elev_lowestmode - digital_elevation_model) < 100

Please note that quality_flag already includes filtering to a sensitivity range of 0.9 - 1.0.

If you want to apply a different quality filtering strategy, you can disable the default filtering by setting apply_quality_filter=False and apply your own filtering after the extraction process.
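To show what the default conditions amount to, and what a custom strategy could look like, here is a small plain-Python sketch that works on one dictionary per shot. It is an illustration under assumptions, not gedixr code. The helper names are hypothetical, the keys follow the GEDI dataset names (including `elev_lowestmode`), and the 0.95 sensitivity threshold is just an example value.

```python
def passes_default_filter(shot: dict) -> bool:
    """Mirror of the default conditions listed above for a single shot."""
    return (
        shot["quality_flag"] == 1
        and shot["degrade_flag"] == 0
        and shot["num_detectedmodes"] > 0
        and abs(shot["elev_lowestmode"] - shot["digital_elevation_model"]) < 100
    )


def passes_custom_filter(shot: dict, min_sensitivity: float = 0.95) -> bool:
    """Example of a stricter, user-defined rule (threshold is hypothetical)."""
    return shot["degrade_flag"] == 0 and shot["sensitivity"] >= min_sensitivity


shots = [
    {"quality_flag": 1, "degrade_flag": 0, "num_detectedmodes": 2,
     "elev_lowestmode": 310.0, "digital_elevation_model": 305.0,
     "sensitivity": 0.97},
    {"quality_flag": 1, "degrade_flag": 0, "num_detectedmodes": 1,
     "elev_lowestmode": 480.0, "digital_elevation_model": 305.0,
     "sensitivity": 0.92},
]
kept_default = [s for s in shots if passes_default_filter(s)]
kept_custom = [s for s in shots if passes_custom_filter(s)]
```

After extracting with apply_quality_filter=False, the same kind of boolean conditions can be applied to the columns of the returned GeoDataFrame.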

Notes

¹See #1 for a related issue regarding the download of GEDI data.

²The products need to be unzipped first, which can seriously increase the amount of disk space needed (~90 MB compressed -> ~3 GB uncompressed... per file!). A solution is in progress and tracked in #2.

Owner

Name: Marco Wolsza
Login: maawoo
Kind: user
Location: Jena, Germany
Company: Friedrich Schiller University Jena

Twitter: maaawoo
Repositories: 2
Profile: https://github.com/maawoo

PhD student @Jena-Earth-Observation-School

Citation (CITATION.cff)

cff-version: 1.2.0
title: gedixr
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marco
    family-names: Wolsza
    email: marco.wolsza@uni-jena.de
    affiliation: University of Jena
    orcid: 'https://orcid.org/0000-0002-5231-7208'
identifiers:
  - type: url
    value: 'https://github.com/maawoo/gedixr/tree/v0.4.0'
    description: The URL of version 0.4.0 of the software.
repository-code: 'https://github.com/maawoo/gedixr'
license: MIT
commit: f6742d2
version: 0.4.0
date-released: '2024-08-23'

GitHub Events

Total

Create event: 11
Release event: 1
Issues event: 2
Watch event: 1
Delete event: 10
Issue comment event: 4
Push event: 12
Pull request event: 16
Fork event: 1

Last Year

Create event: 11
Release event: 1
Issues event: 2
Watch event: 1
Delete event: 10
Issue comment event: 4
Push event: 12
Pull request event: 16
Fork event: 1

Committers

Last synced: about 2 years ago

All Time

Total Commits: 55
Total Committers: 2
Avg Commits per committer: 27.5
Development Distribution Score (DDS): 0.018

Past Year

Commits: 55
Committers: 2
Avg Commits per committer: 27.5
Development Distribution Score (DDS): 0.018

Top Committers

Name	Email	Commits
maawoo	m**a@u**e	54
MarkusZehner	m**z@g**m	1

Committer Domains (Top 20 + Academic)

uni-jena.de: 1

Issues and Pull Requests

Last synced: about 2 years ago

All Time

Total issues: 7
Total pull requests: 5
Average time to close issues: 4 months
Average time to close pull requests: 7 days
Total issue authors: 1
Total pull request authors: 2
Average comments per issue: 0.29
Average comments per pull request: 0.2
Merged pull requests: 5
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 7
Pull requests: 5
Average time to close issues: 4 months
Average time to close pull requests: 7 days
Issue authors: 1
Pull request authors: 2
Average comments per issue: 0.29
Average comments per pull request: 0.2
Merged pull requests: 5
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

maawoo (13)
diesseh (2)

Pull Request Authors

maawoo (21)
AntjeUhde (2)
MarkusZehner (1)

Top Labels

Issue Labels

enhancement (8) documentation (3) bug (2) good first issue (1)

Pull Request Labels

enhancement (8) bug (2)
