gedixr
Global Ecosystem Dynamics Investigation (GEDI) L2A/L2B -> GeoParquet & GeoDataFrame/Xarray
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.1%) to scientific vocabulary
Repository
Statistics
- Stars: 9
- Watchers: 1
- Forks: 1
- Open Issues: 7
- Releases: 5
Metadata Files
README.md
gedixr
Extract the variables you need from GEDI L2A/L2B files
and start working with them as a geopandas.GeoDataFrame or xarray.Dataset in
no time!
Installation
Latest state on GitHub
Create and activate an environment with the required dependencies:
```bash
conda env create --file https://raw.githubusercontent.com/maawoo/gedixr/main/environment.yml
conda activate gedixr_env
```
I recommend checking out Mamba/Micromamba as a faster alternative to Conda.

Install the gedixr package into the activated environment:
```bash
pip install git+https://github.com/maawoo/gedixr.git
```
Specific version
See the Tags section of the repository
for available versions to install:
```bash
conda env create --file https://raw.githubusercontent.com/maawoo/gedixr/v0.4.0/environment.yml
conda activate gedixr_env
pip install git+https://github.com/maawoo/gedixr.git@v0.4.0
```
Usage
After downloading GEDI L2A/L2B v002 files from NASA Earthdata Search¹,
you will end up with a set of zipped HDF5 files. After unzipping² them,
you can use the extract_data function to recursively find all relevant files in
a directory and extract biophysical variables for each shot (see the
Current defaults section for the default variables). You can then work with the
result as a geopandas.GeoDataFrame in Python or open the created vector file in
your favorite GIS software.
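The unzipping step can be scripted. The following is a minimal sketch that uses only the Python standard library; the helper name `unzip_all` and its `remove_zip` option are assumptions made for illustration and are not part of gedixr.

```python
import zipfile
from pathlib import Path


def unzip_all(directory, remove_zip=False):
    """Extract every .zip archive found below `directory` next to the archive.

    Returns the paths of all extracted archive members.
    """
    extracted = []
    # Collect the archives first so that deleting them later is safe.
    for zip_path in sorted(Path(directory).rglob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(zip_path.parent)
            extracted.extend(zip_path.parent / name for name in archive.namelist())
        if remove_zip:
            # Optionally delete the archive to reclaim disk space.
            zip_path.unlink()
    return extracted
```

Deleting each archive right after extraction keeps the peak disk usage lower, which matters given how much larger the uncompressed files are.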
Basic example
The NASA Earthdata Search platform mentioned above lets you subset the GEDI data to your area of interest during the download process. This saves disk space and makes the extraction process quite straightforward:
```python
from gedixr.gedi import extract_data

gedi_dir = "directory/containing/gedi/products"
gdf_l2a = extract_data(directory=gedi_dir, gedi_product='L2A')
gdf_l2b = extract_data(directory=gedi_dir, gedi_product='L2B')
```
The directory you provide is searched recursively, and only files that match
the product given via the gedi_product parameter are considered.
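To illustrate what such a recursive, product-specific search can look like, here is a small sketch based on `pathlib`. It assumes the standard GEDI granule naming, where L2A file names contain `GEDI02_A` and L2B file names contain `GEDI02_B`. The helper `find_gedi_files` is hypothetical, and the matching that gedixr performs internally may differ.

```python
from pathlib import Path

# Assumed file name markers of the two products (standard GEDI granule naming).
PRODUCT_MARKERS = {"L2A": "GEDI02_A", "L2B": "GEDI02_B"}


def find_gedi_files(directory, gedi_product):
    """Recursively collect HDF5 files whose name contains the product marker."""
    marker = PRODUCT_MARKERS[gedi_product]
    return sorted(
        path for path in Path(directory).rglob("*.h5") if marker in path.name
    )
```

Matching on a substring rather than a prefix also picks up subsetted downloads whose file names carry an additional prefix.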
If you extracted variables from L2A and L2B files of the same spatial and temporal extents, you can then merge both GeoDataFrames:
```python
from gedixr.xr import merge_gdf

gdf = merge_gdf(l2a=gdf_l2a, l2b=gdf_l2b)
```
If you want to rasterize the GeoDataFrame and use the data as an xarray.Dataset:
```python
from gedixr.xr import gdf_to_xr

ds = gdf_to_xr(gdf=gdf)
```
If you want to load previously extracted data:
```python
from gedixr.xr import load_to_gdf

gdf = load_to_gdf(l2a="path/to/extracted_l2a.parquet")
```
Custom subsetting
If your GEDI data is not subsetted (i.e., each file covers an entire orbit), you can provide a vector file (e.g. GeoJSON or GeoPackage) to extract metrics for your area of interest. You can also provide a list of vector files to extract data for multiple areas at the same time:
```python
from gedixr.gedi import extract_data

l2a_dict = extract_data(directory="directory/containing/gedi/products",
                        gedi_product='L2A',
                        subset_vector=["path/to/aoi_1.geojson",
                                       "path/to/aoi_2.geojson"])
```
Please note that if the subset_vector parameter is used, a dictionary with the
following key, value pairs is returned:
{'<Vector Basename>': {'geo': Polygon, 'gdf': GeoDataFrame}}
Given the above example, you can access the extracted GeoDataFrame of each area
like this:
```python
aoi_1_gdf = l2a_dict['aoi_1']['gdf']
aoi_2_gdf = l2a_dict['aoi_2']['gdf']
```
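The keys used above suggest that `<Vector Basename>` is the file name without its directory and extension. Under that assumption, which is inferred from the example and not confirmed against the gedixr source, the expected keys can be derived from the vector paths like this:

```python
from pathlib import Path

subset_vector = ["path/to/aoi_1.geojson", "path/to/aoi_2.geojson"]

# Assumption: the dictionary key is the file name without directory and extension.
keys = [Path(path).stem for path in subset_vector]
print(keys)  # ['aoi_1', 'aoi_2']
```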
Extract from specific beams
The beams parameter can be used to specify which beams to extract data from.
By default, data will be extracted from all beams (full power and coverage). You
can use beams='full' (or 'coverage') to only extract from one or the other.
Alternatively, you can provide a list of beam names, e.g.:
beams=['BEAM0101', 'BEAM0110']
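For orientation, the sketch below shows how the three forms of the beams argument could map to concrete beam names. The grouping follows the GEDI instrument design with four coverage beams and four full power beams. The function `resolve_beams` is a hypothetical helper written for this illustration and is not gedixr's implementation.

```python
# Beam names as used in GEDI products: four coverage and four full power beams.
COVERAGE_BEAMS = ["BEAM0000", "BEAM0001", "BEAM0010", "BEAM0011"]
FULL_POWER_BEAMS = ["BEAM0101", "BEAM0110", "BEAM1000", "BEAM1011"]
ALL_BEAMS = FULL_POWER_BEAMS + COVERAGE_BEAMS


def resolve_beams(beams=None):
    """Translate None, 'full', 'coverage' or a list of names into beam names."""
    if beams is None:
        return list(ALL_BEAMS)
    if beams == "full":
        return list(FULL_POWER_BEAMS)
    if beams == "coverage":
        return list(COVERAGE_BEAMS)
    if isinstance(beams, str):
        raise ValueError(f"Unknown beam selection: {beams!r}")
    unknown = sorted(set(beams) - set(ALL_BEAMS))
    if unknown:
        raise ValueError(f"Unknown beam names: {unknown}")
    return list(beams)
```

Validating the names up front turns a typo such as `'BEAM0111'` into an immediate error instead of a silently empty result.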
Current defaults
Extracted variables
In addition to shot number, acquisition time and geolocation information, the
following variables are extracted by default if no custom variables are provided
via the variables parameter:
L2A:
- rh98: Relative height metrics at 98% interval
L2B:
- rh100: Height above ground of the received waveform signal start (rh101 from L2A)
- tcc: Total canopy cover
- fhd: Foliage Height Diversity
- pai: Total Plant Area Index
See also the following sources for overviews of the layers contained in each product: L2A and L2B
Quality filtering
The extraction process automatically applies quality filtering. By default,
only shots that meet all of the following conditions are kept:
- quality_flag == 1
- degrade_flag == 0
- num_detectedmodes > 0
- abs(ele_lowestmode - digital_elevation_model) < 100
Please note that quality_flag already includes filtering to a sensitivity
range of 0.9 - 1.0.
If you want to apply a different quality filtering strategy, you can disable the
default filtering by setting apply_quality_filter=False and apply your own filtering
after the extraction process.
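As a starting point for a custom strategy, the default conditions listed above can be written as a plain Python predicate over the values of a single shot. This is a sketch under assumptions: the field names are copied from the list above, the helper `passes_filter` and its `max_elev_diff` parameter are hypothetical, and whether these fields are present in your extracted data depends on the variables you extract.

```python
def passes_filter(shot, max_elev_diff=100):
    """Return True if a shot (a mapping of field names to values) passes.

    With max_elev_diff=100 this mirrors the default conditions listed above.
    """
    elev_diff = abs(shot["ele_lowestmode"] - shot["digital_elevation_model"])
    return (
        shot["quality_flag"] == 1
        and shot["degrade_flag"] == 0
        and shot["num_detectedmodes"] > 0
        and elev_diff < max_elev_diff
    )


shots = [
    {"quality_flag": 1, "degrade_flag": 0, "num_detectedmodes": 2,
     "ele_lowestmode": 120.0, "digital_elevation_model": 110.0},
    {"quality_flag": 1, "degrade_flag": 0, "num_detectedmodes": 1,
     "ele_lowestmode": 300.0, "digital_elevation_model": 150.0},
]

# A stricter elevation threshold than the default keeps only the first shot.
kept = [shot for shot in shots if passes_filter(shot, max_elev_diff=50)]
```

With a GeoDataFrame, the same conditions translate to a boolean mask over the corresponding columns.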
Notes
¹ See #1 for a related issue regarding the download of GEDI data.
² The products need to be unzipped first, which can seriously increase the amount of disk space needed (~90 MB compressed -> ~3 GB uncompressed, per file!). A solution is in progress and is tracked in #2.
Owner
- Name: Marco Wolsza
- Login: maawoo
- Kind: user
- Location: Jena, Germany
- Company: Friedrich Schiller University Jena
- Twitter: maaawoo
- Repositories: 2
- Profile: https://github.com/maawoo
PhD student @Jena-Earth-Observation-School
Citation (CITATION.cff)
cff-version: 1.2.0
title: gedixr
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marco
    family-names: Wolsza
    email: marco.wolsza@uni-jena.de
    affiliation: University of Jena
    orcid: 'https://orcid.org/0000-0002-5231-7208'
identifiers:
  - type: url
    value: 'https://github.com/maawoo/gedixr/tree/v0.4.0'
    description: The URL of version 0.4.0 of the software.
repository-code: 'https://github.com/maawoo/gedixr'
license: MIT
commit: f6742d2
version: 0.4.0
date-released: '2024-08-23'
GitHub Events
Total
- Create event: 11
- Release event: 1
- Issues event: 2
- Watch event: 1
- Delete event: 10
- Issue comment event: 4
- Push event: 12
- Pull request event: 16
- Fork event: 1
Last Year
- Create event: 11
- Release event: 1
- Issues event: 2
- Watch event: 1
- Delete event: 10
- Issue comment event: 4
- Push event: 12
- Pull request event: 16
- Fork event: 1
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| maawoo | m****a@u****e | 54 |
| MarkusZehner | m****z@g****m | 1 |
Issues and Pull Requests
Last synced: about 2 years ago
All Time
- Total issues: 7
- Total pull requests: 5
- Average time to close issues: 4 months
- Average time to close pull requests: 7 days
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.29
- Average comments per pull request: 0.2
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 7
- Pull requests: 5
- Average time to close issues: 4 months
- Average time to close pull requests: 7 days
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.29
- Average comments per pull request: 0.2
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- maawoo (13)
- diesseh (2)
Pull Request Authors
- maawoo (21)
- AntjeUhde (2)
- MarkusZehner (1)