planetary-datasets

Open source dataset loading and creation from Planetary Computer, GCP, and AWS. Built to support reproducible training of weather, energy, and mapping models.

https://github.com/jacobbieker/planetary-datasets

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: jacobbieker
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 349 KB
Statistics
  • Stars: 3
  • Watchers: 2
  • Forks: 0
  • Open Issues: 32
  • Releases: 1
Created over 2 years ago · Last pushed 8 months ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

Planetary Datasets

Open source dataset loading and creation from Planetary Computer, GCP, and AWS. Built to support reproducible training of weather, energy, and mapping models.

This repository aims to make it easier to gather and use global and regional geospatial datasets for machine learning. It should provide a fairly simple way to fetch data, convert it to Xarray, combine it with other datasets, and save it to disk for training and inference. Many of the data sources come from the Planetary Computer, but we also include data that is not available there, sourced from Google Cloud and AWS or converted from other sources and uploaded to Hugging Face.
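The fetch, convert, and save workflow described above can be sketched with the packages listed in this repository's requirements (pystac-client, planetary_computer, odc-stac). This is a minimal sketch, not the planetary-datasets API: the function names `build_search_kwargs` and `load_as_xarray` are hypothetical, and saving to Zarr assumes the `zarr` package is installed alongside Xarray.

```python
from datetime import date

# Public STAC endpoint of the Planetary Computer.
STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def build_search_kwargs(collection: str, bbox, start: date, end: date) -> dict:
    """Build keyword arguments for pystac_client's Client.search().

    Hypothetical helper for illustration; bbox is (west, south, east, north).
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have four values: west, south, east, north")
    return {
        "collections": [collection],
        "bbox": list(bbox),
        # STAC accepts a closed time interval written as "start/end".
        "datetime": f"{start.isoformat()}/{end.isoformat()}",
    }


def load_as_xarray(collection, bbox, start, end, bands, out_path):
    """Search the catalog, load the matches lazily, and save them as Zarr."""
    # Imported lazily so the helper above works without the geospatial stack.
    import odc.stac
    import planetary_computer
    import pystac_client

    # sign_inplace attaches the short-lived tokens needed to read the assets.
    catalog = pystac_client.Client.open(
        STAC_URL, modifier=planetary_computer.sign_inplace
    )
    search = catalog.search(**build_search_kwargs(collection, bbox, start, end))
    items = search.item_collection()

    # chunks={} asks odc-stac for a Dask-backed xarray.Dataset.
    # Some collections also need explicit crs= and resolution= arguments.
    ds = odc.stac.load(items, bands=bands, bbox=bbox, chunks={})
    ds.to_zarr(out_path, mode="w")
    return ds
```

As a usage example, `load_as_xarray("sentinel-2-l2a", (-1.0, 50.0, 1.0, 52.0), date(2023, 1, 1), date(2023, 1, 31), ["B04", "B03", "B02"], "s2.zarr")` would request Sentinel-2 imagery over a small box for one month. It needs network access and can download a large amount of data, so a small area and short time range are sensible for a first run.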

Examples

To get started, here are a few examples of using this library to fetch preprocessed datasets similar to those used to train Google's MetNet regional forecasting model and GraphCast global forecasting model.

Installation

pip install planetary-datasets

Usage

Processing data

To preprocess data (i.e. convert it from its native format to Zarr), one option is to use the Planetary Computer. To do that, install kbatch and, after signing in, run the following:

kbatch job submit -f pc/eumetsat-0deg.yaml

This will create a job that downloads EUMETSAT 0-Deg imagery, converts it to Zarr, and uploads it to Hugging Face. For the job to work, you will need to set the EUMETSAT API key and secret and a Hugging Face token.
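Since the job depends on three credentials, a small pre-flight check can catch a missing one before anything is submitted. The sketch below is an assumption-heavy illustration: the README does not say how the credentials are passed to the job, so the environment variable names and both function names are hypothetical and should be adapted to the job specification.

```python
import os
import shutil
import subprocess

# Hypothetical variable names: the README does not name them, so adjust
# these to whatever the job specification actually expects.
REQUIRED_VARS = ("EUMETSAT_API_KEY", "EUMETSAT_API_SECRET", "HF_TOKEN")


def missing_credentials(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def submit_job(spec_path="pc/eumetsat-0deg.yaml", env=None):
    """Run `kbatch job submit -f <spec_path>` after basic pre-flight checks."""
    env = os.environ if env is None else env
    missing = missing_credentials(env)
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    if shutil.which("kbatch") is None:
        raise RuntimeError("kbatch is not installed or not on PATH")
    # check=True raises CalledProcessError if kbatch exits with an error.
    subprocess.run(["kbatch", "job", "submit", "-f", spec_path], check=True)
```

Calling `submit_job()` with no arguments reads the current shell environment and submits the same specification file as the command above.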

Saving Processed data

Using the datasets

Citing

If you find this library useful, it would be great if you could cite the repo! You can use the cite button in the repository sidebar or the BibTeX entry below.

@software{Bieker_Planetary_Datasets_2023,
  author = {Bieker, Jacob},
  month = feb,
  title = {{Planetary Datasets}},
  url = {https://github.com/jacobbieker/planetary-datasets},
  year = {2023}
}

License

Owner

  • Name: Jacob Bieker
  • Login: jacobbieker
  • Kind: user
  • Location: United Kingdom
  • Company: @vida-place

Research Engineer at Vida. Previously Open Climate Fix. Interested in applications of AI to large-scale datasets, especially for astronomy.

Citation (CITATION)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Bieker"
  given-names: "Jacob"
  orcid: "https://orcid.org/0000-0003-0305-1893"
title: "Planetary Datasets"
version: 0.0.1
doi: 10.5281/zenodo.1234
date-released: 2023-02-04
url: "https://github.com/jacobbieker/planetary-datasets"

GitHub Events

Total
  • Issues event: 1
  • Watch event: 1
  • Push event: 97
Last Year
  • Issues event: 1
  • Watch event: 1
  • Push event: 97

Committers

Last synced: almost 2 years ago

All Time
  • Total Commits: 57
  • Total Committers: 1
  • Avg Commits per committer: 57.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 57
  • Committers: 1
  • Avg Commits per committer: 57.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • Name: Jacob Bieker
  • Email: j****b@b****h
  • Commits: 57
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 34
  • Total pull requests: 0
  • Average time to close issues: 26 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.21
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jacobbieker (30)
Pull Request Authors
Top Labels
Issue Labels
bug (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 5 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
pypi.org: planetary-datasets

Datasets for ML models for planetary use-cases (e.g. solar mapping, land use, forecasting)

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 5 Last month
Rankings
Dependent packages count: 7.5%
Average: 38.7%
Dependent repos count: 69.8%
Maintainers (1)
Last synced: 8 months ago

Dependencies

requirements.txt pypi
  • dask *
  • fsspec *
  • geojson *
  • geopandas *
  • odc-stac *
  • planetary_computer *
  • pyproj *
  • pystac-client *
  • rasterio *
  • rioxarray *
  • shapely *
setup.py pypi
  • odc-stac *
.github/workflows/release.yaml actions
.github/workflows/workflows.yaml actions
.github/workflows/plot_data.yaml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • actions/upload-artifact v4 composite
  • iterative/setup-cml v2 composite
Dockerfile docker
  • ubuntu latest build