earth2studio
Open-source deep-learning framework for exploring, building and deploying AI weather/climate workflows.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: NVIDIA
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://nvidia.github.io/earth2studio/
- Size: 310 MB
Statistics
- Stars: 242
- Watchers: 8
- Forks: 62
- Open Issues: 20
- Releases: 10
Metadata Files
README.md
Quick start
Install Earth2Studio:
```bash
pip install earth2studio[dlwp]
```
Run a deterministic AI weather prediction in just a few lines of code:
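```python
from earth2studio.models.px import DLWP
from earth2studio.data import GFS
from earth2studio.io import NetCDF4Backend
from earth2studio.run import deterministic as run

# Load the pre-trained DLWP model from its default package
model = DLWP.load_model(DLWP.load_default_package())
ds = GFS()  # initial conditions from GFS
io = NetCDF4Backend("output.nc")  # forecast is written to output.nc

# Run 10 forecast steps starting from 2024-01-01
run(["2024-01-01"], 10, model, ds, io)
```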
Swap in a different AI model by installing its dependencies and replacing the DLWP references with another forecast model.
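Since the quick start only touches the model through `load_default_package` and `load_model`, the swap can be reduced to changing one name. The following is a minimal sketch, not part of Earth2Studio: `load_prognostic` is a hypothetical helper, and it assumes that the other prognostic classes in `earth2studio.models.px` follow the same two-call loading pattern as DLWP.

```python
import importlib


def load_prognostic(class_name: str):
    """Load a prognostic model by class name (hypothetical helper, not an Earth2Studio API).

    Assumes the class lives in earth2studio.models.px and follows the same
    load_default_package / load_model pattern as DLWP in the quick start.
    """
    # Import lazily so this file can be imported without earth2studio installed.
    px = importlib.import_module("earth2studio.models.px")
    model_cls = getattr(px, class_name)
    return model_cls.load_model(model_cls.load_default_package())
```

Calling `load_prognostic("DLWP")` should reproduce the model line of the quick start. Passing another class name is expected to work only after that model's dependencies are installed.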
Latest News
- The latest Climate in a Bottle generative AI model from NVIDIA research has been added via several APIs including a data source, infilling and super-resolution APIs. See the cBottle examples for more.
- The long-awaited GraphCast 1 degree prognostic model and GraphCast Operational prognostic model have been added.
- Advanced Subseasonal-to-Seasonal (S2S) forecasting recipe added demonstrating new inference pipelines for subseasonal weather forecasts (from 2 weeks to 3 months).
For a complete list of latest features and improvements see the changelog.
Overview
Earth2Studio is an AI inference pipeline toolkit focused on weather and climate applications that is designed to ride on top of different AI frameworks, model architectures, data sources and SciML tooling while providing a unified API.
The composability of the different core components in Earth2Studio easily allows the development and deployment of increasingly complex pipelines that may chain multiple data sources, AI models and other modules together.
The unified ecosystem of Earth2Studio provides users the opportunity to rapidly swap out components for alternatives. In addition to the largest model zoo of weather/climate AI models, Earth2Studio is packed with useful functionality such as optimized data access to cloud data stores, statistical operations and more to accelerate your pipelines.
Earth2Studio can be used for seamless deployment of Earth-2 models trained in PhysicsNeMo.
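The composability described above can be pictured with a toy pipeline that chains a data source, a prognostic model and a diagnostic model. This is a conceptual sketch in plain Python, not the Earth2Studio API: every function, the scalar "fields" and the damping dynamics are hypothetical stand-ins chosen only to show how the pieces hand data to one another.

```python
import math
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Toy stand-in for gridded fields: variable name -> single value
State = Dict[str, float]


def toy_data_source(time: datetime) -> State:
    """Hypothetical data source: returns fixed initial conditions for any time."""
    return {"u10m": 3.0, "v10m": 4.0}


def toy_prognostic(state: State) -> State:
    """Hypothetical prognostic step: advances the same fields by one time step."""
    return {name: 0.5 * value for name, value in state.items()}


def toy_diagnostic(state: State) -> State:
    """Hypothetical diagnostic: derives wind speed without stepping in time."""
    return {"ws10m": math.sqrt(state["u10m"] ** 2 + state["v10m"] ** 2)}


def run_pipeline(
    start: datetime, nsteps: int, step: timedelta = timedelta(hours=6)
) -> List[Tuple[datetime, State]]:
    """Chain data source -> prognostic -> diagnostic and collect outputs in memory."""
    state = toy_data_source(start)
    outputs = [(start, toy_diagnostic(state))]
    for i in range(1, nsteps + 1):
        state = toy_prognostic(state)  # auto-regressive: output feeds the next step
        outputs.append((start + i * step, toy_diagnostic(state)))
    return outputs


results = run_pipeline(datetime(2024, 1, 1), nsteps=2)
for valid_time, fields in results:
    print(valid_time.isoformat(), fields)
```

Because each stage only depends on the shape of the state it receives, any one of the three functions can be replaced without touching the others, which is the property the unified API is meant to provide.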
Features
The Earth2Studio package focuses on giving users the tools to build their own workflows, pipelines, APIs, packages, etc. via modular components including:
Prognostic Models
[Prognostic models][e2studio_px_url] in Earth2Studio perform time integration, taking atmospheric fields at a specific time and auto-regressively predicting the same fields into the future (typically 6 hours per step), enabling both single time-step predictions and extended time-series forecasting. Earth2Studio maintains the largest collection of pre-trained state-of-the-art AI weather/climate models ranging from global forecast models to regional specialized models, covering various resolutions, architectures, and forecasting capabilities to suit different computational and accuracy requirements.

Available models include but are not limited to:

| Model | Resolution | Architecture | Time Step | Coverage |
|-------|------------|--------------|-----------|----------|
| GraphCast Small | 1.0° | Graph Neural Network | 6h | Global |
| GraphCast Operational | 0.25° | Graph Neural Network | 6h | Global |
| Pangu 3hr | 0.25° | Transformer | 3h | Global |
| Pangu 6hr | 0.25° | Transformer | 6h | Global |
| Pangu 24hr | 0.25° | Transformer | 24h | Global |
| Aurora | 0.25° | Transformer | 6h | Global |
| FuXi | 0.25° | Transformer | 6h | Global |
| AIFS | 0.25° | Transformer | 6h | Global |
| StormCast | 3km | Diffusion + Regression | 1h | Regional (US) |
| SFNO | 0.25° | Neural Operator | 6h | Global |
| DLESyM | 0.25° | Convolutional | 6h | Global |

For a complete list, see the [prognostic model API docs][e2studio_px_api].

Diagnostic Models
[Diagnostic models][e2studio_dx_url] in Earth2Studio perform time-independent transformations, typically taking geospatial fields at a specific time and predicting new derived quantities without performing time integration. This lets users build pipelines that predict specific quantities of interest that may not be provided by forecasting models. Earth2Studio contains a growing collection of specialized diagnostic models for various phenomena including precipitation prediction, tropical cyclone tracking, solar radiation estimation, wind gust forecasting, and more.

Available diagnostics include but are not limited to:

| Model | Resolution | Architecture | Coverage | Output |
|-------|------------|--------------|----------|--------|
| PrecipitationAFNO | 0.25° | Neural Operator | Global | Total precipitation |
| SolarRadiationAFNO1H | 0.25° | Neural Operator | Global | Surface solar radiation |
| WindgustAFNO | 0.25° | AFNO | Global | Maximum wind gust |
| TCTrackerVitart | 0.25° | Algorithmic | Global | TC tracks & properties |
| CBottleInfill | 100km | Diffusion | Global | Global climate sample |
| CBottleSR | 5km | Diffusion | Regional / Global | High-res climate |
| CorrDiff | Variable | Diffusion | Regional | Fine-scale weather |
| CorrDiffTaiwan | 2km | Diffusion | Regional (Taiwan) | Taiwan fine-scale weather |

For a complete list, see the [diagnostic model API docs][e2studio_dx_api].

Datasources
[Data sources][e2studio_data_url] in Earth2Studio provide a standardized API for accessing weather and climate datasets from various providers (numerical models, data assimilation results, and AI-generated data), enabling seamless integration of initial conditions for model inference and validation data for scoring across different data formats and storage systems. Earth2Studio includes data sources ranging from operational weather models (GFS, HRRR, IFS) and reanalysis datasets (ERA5 via ARCO, CDS) to AI-generated climate data (cBottle) and local file systems. Fetching data is straightforward: Earth2Studio handles the complicated parts and returns an easy-to-use Xarray data array of the requested data under a shared, package-wide [vocabulary][e2studio_lex_url] and coordinate system.

Available data sources include but are not limited to:

| Data Source | Type | Resolution | Coverage | Data Format |
|-------------|------|------------|----------|-------------|
| GFS | Operational | 0.25° | Global | GRIB2 |
| GFS_FX | Forecast | 0.25° | Global | GRIB2 |
| HRRR | Operational | 3km | Regional (US) | GRIB2 |
| HRRR_FX | Forecast | 3km | Regional (US) | GRIB2 |
| ARCO ERA5 | Reanalysis | 0.25° | Global | Zarr |
| CDS | Reanalysis | 0.25° | Global | NetCDF |
| IFS | Operational | 0.25° | Global | GRIB2 |
| NCAR_ERA5 | Reanalysis | 0.25° | Global | NetCDF |
| WeatherBench2 | Reanalysis | 0.25° | Global | Zarr |
| GEFS_FX | Ensemble Forecast | 0.25° | Global | GRIB2 |
| IMERG | Precipitation | 0.1° | Global | NetCDF |
| CBottle3D | AI Generated | 100km | Global | HEALPix |

For a complete list, see the [data source API docs][e2studio_data_api].

IO Backends
[IO backends][e2studio_io_url] in Earth2Studio provide a standardized interface for writing and storing pipeline outputs across different file formats and storage systems, enabling users to store inference outputs for later processing. Earth2Studio includes IO backends ranging from traditional scientific formats (NetCDF) and modern cloud-optimized formats (Zarr) to in-memory storage backends.

Available IO backends include:

| IO Backend | Format | Features | Location |
|------------|--------|----------|----------|
| ZarrBackend | Zarr | Compression, Chunking | In-Memory/Local |
| AsyncZarrBackend | Zarr | Async writes, Parallel I/O | In-Memory/Local/Remote |
| NetCDF4Backend | NetCDF4 | CF-compliant, Metadata | In-Memory/Local |
| XarrayBackend | Xarray Dataset | Rich metadata, Analysis-ready | In-Memory |
| KVBackend | Key-Value | Fast Temporary Access | In-Memory |

For a complete list, see the [IO API docs][e2studio_io_api].

Perturbation Methods
[Perturbation methods][e2studio_pb_url] in Earth2Studio provide a standardized interface for adding noise to data arrays, typically enabling the creation of ensemble forecast pipelines that capture uncertainty in weather and climate predictions.

Available perturbations include but are not limited to:

| Perturbation Method | Type | Spatial Correlation | Temporal Correlation |
|---------------------|------|---------------------|----------------------|
| Gaussian | Noise | None | None |
| Correlated Spherical Gaussian | Noise | Spherical | AR(1) process |
| Spherical Gaussian | Noise | Spherical (Matern) | None |
| Brown | Noise | 2D Fourier | None |
| Bred Vector | Dynamical | Model-dependent | Model-dependent |
| Hemispheric Centred Bred Vector | Dynamical | Hemispheric | Model-dependent |

For a complete list, see the [perturbations API docs][e2studio_pb_url].

Statistics / Metrics
[Statistics and metrics][e2studio_stat_url] in Earth2Studio provide operations typically useful for in-pipeline evaluation of forecast performance across different dimensions (spatial, temporal, ensemble) through various statistical measures including error metrics, correlation coefficients, and ensemble verification statistics.

Available operations include but are not limited to:

| Statistic | Type | Application |
|-----------|------|-------------|
| RMSE | Error Metric | Forecast accuracy |
| ACC | Correlation | Pattern correlation |
| CRPS | Ensemble Metric | Probabilistic skill |
| Rank Histogram | Ensemble Metric | Ensemble reliability |
| Standard Deviation | Moment | Spread measure |
| Spread-Skill Ratio | Ensemble Metric | Ensemble calibration |

For a complete list, see the [statistics API docs][e2studio_stat_api].

For a more complete list of features, be sure to view the documentation. Don't see what you need? Great news, extension and customization are at the heart of our design.
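To make the perturbation and statistics components listed above more concrete, the sketch below builds a small ensemble by adding uncorrelated Gaussian noise to an initial state, rolls each member forward with a toy model, and scores the result with RMSE and ensemble spread. It uses only the Python standard library. The function names, the four-value "field" and the damping model are illustrative assumptions and do not correspond to Earth2Studio classes.

```python
import math
import random
from typing import List

Field = List[float]  # toy stand-in for a gridded field


def gaussian_perturb(field: Field, amplitude: float, rng: random.Random) -> Field:
    """Add independent Gaussian noise (no spatial or temporal correlation)."""
    return [x + rng.gauss(0.0, amplitude) for x in field]


def toy_forecast(field: Field, nsteps: int) -> Field:
    """Hypothetical prognostic model: damp every value by 10% per step."""
    for _ in range(nsteps):
        field = [0.9 * x for x in field]
    return field


def rmse(pred: Field, truth: Field) -> float:
    """Root-mean-square error over all grid points."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def ensemble_mean(members: List[Field]) -> Field:
    """Point-wise mean across ensemble members."""
    return [sum(values) / len(members) for values in zip(*members)]


def ensemble_spread(members: List[Field]) -> float:
    """Square root of the grid-averaged, unbiased ensemble variance."""
    mean = ensemble_mean(members)
    n = len(members)
    variances = [
        sum((m[i] - mean[i]) ** 2 for m in members) / (n - 1)
        for i in range(len(mean))
    ]
    return math.sqrt(sum(variances) / len(variances))


rng = random.Random(0)  # fixed seed so the run is repeatable
initial = [1.0, 2.0, 3.0, 4.0]
truth = toy_forecast(initial, nsteps=4)  # unperturbed run used as reference
members = [
    toy_forecast(gaussian_perturb(initial, 0.1, rng), nsteps=4) for _ in range(8)
]

skill = rmse(ensemble_mean(members), truth)
spread = ensemble_spread(members)
print(f"RMSE of ensemble mean: {skill:.4f}, ensemble spread: {spread:.4f}")
```

Comparing the spread with the error of the ensemble mean is the idea behind the spread-skill ratio in the table: a well-calibrated ensemble has the two at a similar magnitude.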
Contributors
Check out the contributing document for details about the technical requirements and the userguide for higher level philosophy, structure, and design.
License
Earth2Studio is provided under the Apache License 2.0; see the LICENSE file for the full license text.
Owner
- Name: NVIDIA Corporation
- Login: NVIDIA
- Kind: organization
- Location: 2788 San Tomas Expressway, Santa Clara, CA, 95051
- Website: https://nvidia.com
- Repositories: 342
- Profile: https://github.com/NVIDIA
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it as below.
title: NVIDIA Earth2Studio
authors:
- family-names: Geneva
given-names: Nicholas
orcid: https://orcid.org/0000-0003-4562-459X
- family-names: Foster
given-names: Dallas
orcid: https://orcid.org/0000-0001-8459-9767
url: https://github.com/NVIDIA/earth2studio
repository-code: https://github.com/NVIDIA/earth2studio
date-released: 2024-04-22
GitHub Events
Total
- Create event: 21
- Release event: 6
- Issues event: 161
- Watch event: 125
- Delete event: 13
- Member event: 6
- Issue comment event: 749
- Push event: 249
- Pull request review event: 405
- Pull request review comment event: 354
- Pull request event: 314
- Fork event: 28
Last Year
- Create event: 21
- Release event: 6
- Issues event: 161
- Watch event: 126
- Delete event: 13
- Member event: 6
- Issue comment event: 749
- Push event: 249
- Pull request review event: 408
- Pull request review comment event: 355
- Pull request event: 314
- Fork event: 28
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Nicholas Geneva | 5****a | 194 |
| Dallas Foster | d****f@n****m | 30 |
| Peter Harrington | 4****n | 10 |
| Oliver Hennigh | l****1@g****m | 9 |
| Marius | 2****s | 7 |
| gertln | g****l@n****m | 7 |
| dependabot[bot] | 4****] | 6 |
| Jussi Leinonen | j****n@n****m | 3 |
| Stefan Weissenberger | s****g@n****m | 3 |
| Rodrigo Almeida | r****4@o****t | 2 |
| Sai Krishnan Chandrasekar | 1****v | 2 |
| Emmanuel Ferdman | e****n@g****m | 2 |
| Akshay Subramaniam | 6****r | 2 |
| Alberto Carpentieri | 5****i | 1 |
| Kaustubh Tangsali | 7****i | 1 |
| Luke Conibear | 1****r | 1 |
| Manas Sahni | s****s@g****m | 1 |
| Sean Lee | 1****e | 1 |
| ivanauyeung | 1****g | 1 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 150
- Total pull requests: 521
- Average time to close issues: 13 days
- Average time to close pull requests: 3 days
- Total issue authors: 33
- Total pull request authors: 20
- Average comments per issue: 0.92
- Average comments per pull request: 2.95
- Merged pull requests: 418
- Bot issues: 0
- Bot pull requests: 23
Past Year
- Issues: 116
- Pull requests: 367
- Average time to close issues: 14 days
- Average time to close pull requests: 3 days
- Issue authors: 31
- Pull request authors: 17
- Average comments per issue: 0.99
- Average comments per pull request: 3.02
- Merged pull requests: 280
- Bot issues: 0
- Bot pull requests: 23
Top Authors
Issue Authors
- NickGeneva (74)
- swbg (12)
- mariusaurus (10)
- gertln (6)
- dallasfoster (5)
- jleinonen (5)
- mike-scchen (5)
- rodrigoalmeida94 (3)
- luke-conibear (3)
- david5010 (2)
- pzharrington (2)
- bfouquet (2)
- awesomemfg (1)
- juliusberner (1)
- meteoDaniel (1)
Pull Request Authors
- NickGeneva (342)
- dallasfoster (38)
- loliverhennigh (24)
- dependabot[bot] (23)
- gertln (19)
- mariusaurus (16)
- pzharrington (16)
- jleinonen (12)
- rodrigoalmeida94 (5)
- swbg (4)
- akshaysubr (4)
- albertocarpentieri (3)
- saikrishnanc-nv (3)
- ivanauyeung (2)
- SeanSBLee (2)