planetary-datasets
Open source dataset loading and creation from Planetary Computer, GCP, and AWS. To support reproducible training of weather, energy, and mapping models.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: jacobbieker
- License: mit
- Language: Python
- Default Branch: main
- Size: 349 KB
Statistics
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 32
- Releases: 1
Metadata Files
README.md
Planetary Datasets
Open source dataset loading and creation from Planetary Computer, GCP, and AWS. Built to support reproducible training of weather, energy, and mapping models.
This repository's goal is to make it easier to gather and use global and regional geospatial datasets for machine learning. It should provide a fairly simple way of getting data, converting it to Xarray, combining it with other datasets, and saving it to disk for training and inference. A lot of the data sources are from the Planetary Computer, but we also include data not available there from Google Cloud and AWS, or converted from other sources and uploaded to Hugging Face.
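The workflow described above (fetch, convert to Xarray, save to disk) can be sketched with the libraries listed in this project's dependencies. This is a minimal sketch, not this library's own API: `build_search_kwargs` and `load_to_zarr` are hypothetical helper names, and the example assumes `pystac-client`, `planetary_computer`, `odc-stac`, and `zarr` are installed and that the Planetary Computer STAC endpoint is reachable.

```python
from datetime import date

# Public STAC endpoint of the Planetary Computer.
STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def build_search_kwargs(collection, bbox, start, end):
    """Build keyword arguments for a pystac-client search (hypothetical helper)."""
    if len(bbox) != 4:
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "collections": [collection],
        "bbox": list(bbox),
        # STAC accepts a closed interval written as "start/end".
        "datetime": f"{start.isoformat()}/{end.isoformat()}",
    }


def load_to_zarr(collection, bbox, start, end, bands, out_path):
    """Search the catalog, load matches lazily as an xarray.Dataset, write Zarr."""
    # Imported here so the helper above works without the geospatial stack.
    import odc.stac
    import planetary_computer
    import pystac_client

    # sign_inplace attaches the short-lived access tokens that asset URLs need.
    catalog = pystac_client.Client.open(
        STAC_URL, modifier=planetary_computer.sign_inplace
    )
    search = catalog.search(**build_search_kwargs(collection, bbox, start, end))
    items = search.item_collection()

    # chunks={} asks for a lazy, Dask-backed dataset. Some collections also
    # need explicit crs= and resolution= arguments, which are omitted here.
    ds = odc.stac.load(items, bands=bands, bbox=list(bbox), chunks={})
    ds.to_zarr(out_path, mode="w")
    return ds
```

A call such as `load_to_zarr("sentinel-2-l2a", (-0.5, 51.3, 0.3, 51.7), date(2023, 6, 1), date(2023, 6, 7), ["B04", "B03", "B02"], "example.zarr")` would then write a small Sentinel-2 cube to a local Zarr store; the collection, bounding box, and output path are illustrative choices only.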
Examples
To get started, here are a few examples of using this library to fetch preprocessed datasets similar to those used to train Google's MetNet regional forecasting model and GraphCast global forecasting model.
Installation
bash
pip install planetary-datasets
Usage
Processing data
To preprocess data (i.e. convert it from its native format to Zarr), one option is to run the processing on the Planetary Computer.
To do that, install kbatch, sign in, and then run the following:
bash
kbatch job submit -f pc/eumetsat-0deg.yaml
This creates a job that downloads EUMETSAT 0-Deg imagery, converts it to Zarr, and uploads the result to Hugging Face. For the job to work, you need to set the EUMETSAT API key and secret and a Hugging Face token.
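Because the job depends on three credentials, a quick local check before submitting can catch a missing one early. The sketch below is only an illustration: the environment variable names are assumptions, since the README does not say how the job reads its secrets, and `missing_secrets` and `check_secrets` are hypothetical helpers.

```python
import os

# Hypothetical variable names: adjust them to whatever the job file expects.
REQUIRED_SECRETS = ("EUMETSAT_API_KEY", "EUMETSAT_API_SECRET", "HF_TOKEN")


def missing_secrets(env=None, required=REQUIRED_SECRETS):
    """Return the names of required secrets that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def check_secrets(env=None):
    """Raise a RuntimeError that names every missing credential."""
    missing = missing_secrets(env)
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
```

Running `check_secrets()` just before `kbatch job submit` fails fast with the list of unset variables instead of failing partway through the job.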
Saving Processed data
Using the datasets
Citing
If you find this library useful, please cite the repo, either with the cite button on the side or with the entry below.
@software{Bieker_Planetary_Datasets_2023,
author = {Bieker, Jacob},
month = feb,
title = {{Planetary Datasets}},
url = {https://github.com/jacobbieker/planetary-datasets},
year = {2023}
}
License
Owner
- Name: Jacob Bieker
- Login: jacobbieker
- Kind: user
- Location: United Kingdom
- Company: @vida-place
- Website: https://www.jacobbieker.com
- Twitter: JacobBieker
- Repositories: 99
- Profile: https://github.com/jacobbieker
Research Engineer at Vida. Previously Open Climate Fix. Interested in applications of AI to large-scale datasets, especially for astronomy.
Citation (CITATION)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Bieker"
    given-names: "Jacob"
    orcid: "https://orcid.org/0000-0003-0305-1893"
title: "Planetary Datasets"
version: 0.0.1
doi: 10.5281/zenodo.1234
date-released: 2023-02-04
url: "https://github.com/jacobbieker/planetary-datasets"
GitHub Events
Total
- Issues event: 1
- Watch event: 1
- Push event: 97
Last Year
- Issues event: 1
- Watch event: 1
- Push event: 97
Committers
Last synced: almost 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jacob Bieker | j****b@b****h | 57 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 34
- Total pull requests: 0
- Average time to close issues: 26 days
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.21
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jacobbieker (30)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 5 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
pypi.org: planetary-datasets
Datasets for ML models for planetary use-cases (e.g. solar mapping, land use, forecasting)
- Homepage: https://github.com/jacobbieker/planetary-datasets
- Documentation: https://planetary-datasets.readthedocs.io/
- License: mit
- Latest release: 0.0.1 (published over 2 years ago)
Rankings
Maintainers (1)
Dependencies
- dask *
- fsspec *
- geojson *
- geopandas *
- odc-stac *
- planetary_computer *
- pyproj *
- pystac-client *
- rasterio *
- rioxarray *
- shapely *
- actions/checkout v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v4 composite
- iterative/setup-cml v2 composite
- ubuntu latest build