Repository
On-Demand Earth System Data Cubes (ESDCs) in Python
Basic Info
- Host: GitHub
- Owner: ESDS-Leipzig
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://cubo.readthedocs.io/en/latest/
- Size: 1.63 MB
Statistics
- Stars: 192
- Watchers: 5
- Forks: 14
- Open Issues: 2
- Releases: 9
Topics
Metadata Files
README.md
On-Demand Earth System Data Cubes (ESDCs) in Python
GitHub: https://github.com/davemlz/cubo
Documentation: https://cubo.readthedocs.io/
PyPI: https://pypi.org/project/cubo/
Conda-forge: https://anaconda.org/conda-forge/cubo
Tutorials: https://cubo.readthedocs.io/en/latest/tutorials.html
Paper: https://arxiv.org/abs/2404.13105
News
[!IMPORTANT]
:star: Pinned (2024-04-19): Our cubo paper (preprint) is out on arXiv! Check it here: Montero, D., Aybar, C., Ji, C., Kraemer, G., Söchting, M., Teber, K., & Mahecha, M.D. (2024). On-Demand Earth System Data Cubes.
Overview
SpatioTemporal Asset Catalogs (STAC) provide a standardized format for describing geospatial information. Multiple platforms, such as Planetary Computer, use this standard to serve datasets to their clients. Google Earth Engine (GEE) also provides a very large catalogue that users can access from Python.
cubo is a Python package that gives users of STAC and GEE an easy way to create On-Demand Earth System Data Cubes (ESDCs), which makes it well suited to Deep Learning (DL) tasks. You can create many ESDCs knowing only a pair of coordinates and the edge size of the cube in pixels!
Check the simple usage of cubo with STAC here:
```python
import cubo
import xarray as xr

da = cubo.create(
    lat=4.31, # Central latitude of the cube
    lon=-76.2, # Central longitude of the cube
    collection="sentinel-2-l2a", # Name of the STAC collection
    bands=["B02","B03","B04"], # Bands to retrieve
    start_date="2021-06-01", # Start date of the cube
    end_date="2021-06-10", # End date of the cube
    edge_size=64, # Edge size of the cube (px)
    resolution=10, # Pixel size of the cube (m)
)
```

This code creates an xr.DataArray object from a pair of coordinates, the edge size of the cube (in pixels), and the additional information needed to get the data from STAC (Planetary Computer by default, but you can use another provider). You can also choose the resolution (in meters) and the bands you need.
Now check the simple usage of cubo with GEE here:
```python
import cubo
import xarray as xr

da = cubo.create(
    lat=51.079225, # Central latitude of the cube
    lon=10.452173, # Central longitude of the cube
    collection="COPERNICUS/S2_SR_HARMONIZED", # Id of the GEE collection
    bands=["B2","B3","B4"], # Bands to retrieve
    start_date="2016-06-01", # Start date of the cube
    end_date="2017-07-01", # End date of the cube
    edge_size=128, # Edge size of the cube (px)
    resolution=10, # Pixel size of the cube (m)
    gee=True # Use GEE instead of STAC
)
```
This code is very similar to the STAC-based example. Note that the collection is now the ID of the GEE collection to use, and that the gee argument must be set to True.
How does it work?
The process is simple:
- You have the coordinates of a point of interest. The cube will be created around these coordinates (i.e., these coordinates will be approximately the spatial center of the cube).
- Internally, the coordinates are transformed to the projected UTM coordinates [x,y] in meters (i.e., local UTM CRS). They are rounded to the closest pair of coordinates that are divisible by the resolution you requested.
- The edge size you provide is used to create a Bounding Box (BBox) for the cube in the local UTM CRS with the exact number of pixels. (Note that the edge size should be a multiple of 2, otherwise it will be rounded. Usual edge sizes for ML are 64, 128, 256, 512, etc.)
- Additional information is used to retrieve the data from the STAC catalogue or from GEE: start and end dates, name of the collection, endpoint of the catalogue (ignored for GEE), etc.
- Then, by using stackstac and pystac_client, the cube is retrieved as an xr.DataArray. In the case of GEE, the cube is retrieved via xee.
- Success! That's what cubo is doing for you, and you just need to provide the coordinates, the edge size, and the additional info to get the cube.
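The coordinate handling in the steps above can be sketched in plain Python. This is a minimal illustration, not cubo's actual implementation: the helper names (utm_epsg, snap, cube_bbox) are hypothetical, the projected coordinates passed to cube_bbox are made-up values standing in for the output of a real lat/lon-to-UTM transform, and the exact rounding rules cubo applies are assumptions.

```python
import math


def utm_epsg(lat, lon):
    """EPSG code of the standard UTM zone for a point.

    Assumption: plain 6-degree zones, ignoring the Norway/Svalbard exceptions.
    """
    zone = int(math.floor((lon + 180) / 6)) % 60 + 1
    return (32600 if lat >= 0 else 32700) + zone


def snap(value, resolution):
    """Round a projected coordinate to the nearest multiple of the resolution."""
    return round(value / resolution) * resolution


def cube_bbox(x, y, edge_size, resolution):
    """Bounding box (xmin, ymin, xmax, ymax) of a square cube centred near (x, y).

    x and y are projected coordinates in meters; edge_size is in pixels.
    """
    edge_size = 2 * round(edge_size / 2)  # assumed way of forcing an even size
    half = (edge_size // 2) * resolution  # half the cube width in meters
    cx, cy = snap(x, resolution), snap(y, resolution)
    return (cx - half, cy - half, cx + half, cy + half)


# UTM zone for the STAC example point (lat=4.31, lon=-76.2)
epsg = utm_epsg(4.31, -76.2)

# Hypothetical projected coordinates (not the real projection of that point)
bbox = cube_bbox(354123.4, 476518.7, edge_size=64, resolution=10)

print(epsg)  # 32618
print(bbox)  # (353800, 476200, 354440, 476840)
```

The resulting box is 640 m wide, which is exactly 64 pixels at 10 m resolution, and its corners fall on multiples of the resolution.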
Installation
Install the latest version from PyPI:
pip install cubo
Install cubo with the required GEE dependencies from PyPI:
pip install cubo[ee]
Upgrade cubo by running:
pip install -U cubo
Install the latest version from conda-forge:
conda install -c conda-forge cubo
Install the latest dev version from GitHub by running:
pip install git+https://github.com/davemlz/cubo
Features
Main function: create()
cubo is pretty straightforward; everything you need is in the create() function:
```python
da = cubo.create(
    lat=4.31,
    lon=-76.2,
    collection="sentinel-2-l2a",
    bands=["B02","B03","B04"],
    start_date="2021-06-01",
    end_date="2021-06-10",
    edge_size=64,
    resolution=10,
)
```
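Because a cube is fully described by a point plus a few shared settings, requesting many cubes comes down to varying lat and lon. The sketch below only builds the keyword arguments, so it runs without cubo or network access; cube_requests is a hypothetical helper, not part of cubo.

```python
def cube_requests(points, **shared):
    """Build one keyword-argument dict per (lat, lon) point."""
    return [dict(lat=lat, lon=lon, **shared) for lat, lon in points]


# The two example locations used in this README
points = [(4.31, -76.2), (51.079225, 10.452173)]

requests = cube_requests(
    points,
    collection="sentinel-2-l2a",
    bands=["B02", "B03", "B04"],
    start_date="2021-06-01",
    end_date="2021-06-10",
    edge_size=64,
    resolution=10,
)

# Each dict can then be unpacked into create() (requires cubo and network access):
# cubes = [cubo.create(**kwargs) for kwargs in requests]
```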
Using different units for edge_size
By default, the units of edge_size are pixels. But you can modify this using the units argument:
```python
da = cubo.create(
    lat=4.31,
    lon=-76.2,
    collection="sentinel-2-l2a",
    bands=["B02","B03","B04"],
    start_date="2021-06-01",
    end_date="2021-06-10",
    edge_size=1500,
    units="m",
    resolution=10,
)
```
[!TIP] You can use "px" (pixels), "m" (meters), or any unit available in
scipy.constants.
```python
da = cubo.create(
    lat=4.31,
    lon=-76.2,
    collection="sentinel-2-l2a",
    bands=["B02","B03","B04"],
    start_date="2021-06-01",
    end_date="2021-06-10",
    edge_size=1.5,
    units="kilo",
    resolution=10,
)
```
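The two units examples above describe the same footprint: 1500 m and 1.5 "kilo" both come to 150 pixels at 10 m resolution. The arithmetic can be sketched as follows. This is an assumption about how the conversion behaves, not cubo's code: the function name is hypothetical, and the small dictionary stands in for looking the factor up in scipy.constants (where kilo is 1000.0).

```python
# Stand-in for scipy.constants: meters per unit (hypothetical subset)
UNIT_TO_METERS = {"m": 1.0, "kilo": 1000.0}


def edge_size_in_pixels(edge_size, units, resolution):
    """Convert an edge size given in some unit to a number of pixels."""
    if units == "px":
        return int(edge_size)  # already in pixels
    meters = edge_size * UNIT_TO_METERS[units]
    return round(meters / resolution)


print(edge_size_in_pixels(1500, "m", 10))    # 150
print(edge_size_in_pixels(1.5, "kilo", 10))  # 150
print(edge_size_in_pixels(64, "px", 10))     # 64
```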
Using another endpoint
By default, cubo uses Planetary Computer. But you can use another STAC provider endpoint if you want:
```python
da = cubo.create(
    lat=4.31,
    lon=-76.2,
    collection="sentinel-s2-l2a-cogs",
    bands=["B05","B06","B07"],
    start_date="2020-01-01",
    end_date="2020-06-01",
    edge_size=128,
    resolution=20,
    stac="https://earth-search.aws.element84.com/v0"
)
```
Keywords for searching data
You can pass kwargs to pystac_client.Client.search() if required:
```python
da = cubo.create(
    lat=4.31,
    lon=-76.2,
    collection="sentinel-2-l2a",
    bands=["B02","B03","B04"],
    start_date="2021-01-01",
    end_date="2021-06-10",
    edge_size=64,
    resolution=10,
    query={"eo:cloud_cover": {"lt": 10}} # kwarg to pass
)
```
License
The project is licensed under the MIT license.
Citation
If you use this work, please consider citing the following paper:
```bibtex
@article{montero2024cubo,
  doi = {10.48550/ARXIV.2404.13105},
  url = {https://arxiv.org/abs/2404.13105},
  author = {Montero, David and Aybar, César and Ji, Chaonan and Kraemer, Guido and S\"{o}chting, Maximilian and Teber, Khalil and Mahecha, Miguel D.},
  keywords = {Databases (cs.DB), Computer Vision and Pattern Recognition (cs.CV), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {On-Demand Earth System Data Cubes},
  publisher = {arXiv},
  year = {2024},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
Logo Attribution
The logo and images were created using dice icons created by Freepik - Flaticon.
Owner
- Name: ESDS-Leipzig
- Login: ESDS-Leipzig
- Kind: organization
- Repositories: 1
- Profile: https://github.com/ESDS-Leipzig
GitHub Events
Total
- Watch event: 24
- Push event: 2
- Pull request event: 2
- Fork event: 7
Last Year
- Watch event: 24
- Push event: 2
- Pull request event: 2
- Fork event: 7
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 12
- Total pull requests: 5
- Average time to close issues: about 1 month
- Average time to close pull requests: 10 days
- Total issue authors: 10
- Total pull request authors: 3
- Average comments per issue: 1.42
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 1 month
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- isConic (2)
- rbavery (1)
- suffenjoy (1)
- benjamindeneu (1)
- Sonicious (1)
- MartinuzziFrancesco (1)
- ryali93 (1)
- behzad89 (1)
- Frankie91 (1)
- 765302995 (1)
Pull Request Authors
- davemlz (6)
- maawoo (2)
- csaybar (1)

