earth2studio

Open-source deep-learning framework for exploring, building and deploying AI weather/climate workflows.

https://github.com/nvidia/earth2studio

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Keywords

ai climate-science deep-learning weather

Keywords from Contributors

physics nvidia-gpu distributed graph-computation agents mot interactive mesh cryptocurrencies optim
Last synced: 7 months ago

Repository

Statistics
  • Stars: 242
  • Watchers: 8
  • Forks: 62
  • Open Issues: 20
  • Releases: 10
Created about 2 years ago · Last pushed 7 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

# NVIDIA Earth2Studio

[![python version][e2studio_python_img]][e2studio_python_url] [![license][e2studio_license_img]][e2studio_license_url] [![coverage][e2studio_cov_img]][e2studio_cov_url] [![mypy][e2studio_mypy_img]][e2studio_mypy_url] [![format][e2studio_format_img]][e2studio_format_url] [![ruff][e2studio_ruff_img]][e2studio_ruff_url] [![uv][e2studio_uv_img]][e2studio_uv_url]

Earth2Studio is a Python-based package designed to get users up and running with AI Earth system models *fast*. Our mission is to enable everyone to build, research and explore AI driven weather and climate science.

**- Earth2Studio Documentation -**

[Install][e2studio_install_url] | [User-Guide][e2studio_userguide_url] | [Examples][e2studio_examples_url] | [API][e2studio_api_url]

![Earth2Studio Banner](https://huggingface.co/datasets/NickGeneva/Earth2StudioAssets/raw/main/0.2.0/earth2studio_feature_banner.png?id=1)

Quick start

Install Earth2Studio:

```bash
pip install earth2studio[dlwp]
```

Run a deterministic AI weather prediction in just a few lines of code:

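```python
from earth2studio.models.px import DLWP
from earth2studio.data import GFS
from earth2studio.io import NetCDF4Backend
from earth2studio.run import deterministic as run

# Load the pre-trained DLWP model from its default checkpoint package
model = DLWP.load_model(DLWP.load_default_package())
# GFS supplies the initial conditions; results are written to a NetCDF file
ds = GFS()
io = NetCDF4Backend("output.nc")

# Run 10 forecast steps starting from 2024-01-01
run(["2024-01-01"], 10, model, ds, io)
```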

To use a different AI model, install its dependencies and replace the DLWP references with another forecast model.

Latest News

For a complete list of latest features and improvements see the changelog.

Overview

Earth2Studio is an AI inference pipeline toolkit focused on weather and climate applications that is designed to ride on top of different AI frameworks, model architectures, data sources and SciML tooling while providing a unified API.

![Earth2Studio Overview 1](https://huggingface.co/datasets/NickGeneva/Earth2StudioAssets/resolve/main/0.9.0/earth2studio-readme-overview-1.png?id=1)

The composability of the different core components in Earth2Studio easily allows the development and deployment of increasingly complex pipelines that may chain multiple data sources, AI models and other modules together.

![Earth2Studio Overview 2](https://huggingface.co/datasets/NickGeneva/Earth2StudioAssets/resolve/main/0.9.0/earth2studio-readme-overview-2.png?id=1)

The unified ecosystem of Earth2Studio provides users the opportunity to rapidly swap out components for alternatives. In addition to the largest model zoo of weather/climate AI models, Earth2Studio is packed with useful functionality such as optimized data access to cloud data stores, statistical operations and more to accelerate your pipelines.

![Earth2Studio Overview 3](https://huggingface.co/datasets/NickGeneva/Earth2StudioAssets/resolve/main/0.9.0/earth2studio-readme-overview-3.webp?id=1)

Earth2Studio can be used for seamless deployment of Earth-2 models trained in PhysicsNeMo.

Features

The Earth2Studio package focuses on giving users the tools to build their own workflows, pipelines, APIs, packages, etc. from modular components, including:

Prognostic Models

[Prognostic models][e2studio_px_url] in Earth2Studio perform time integration, taking atmospheric fields at a specific time and auto-regressively predicting the same fields into the future (typically 6 hours per step), enabling both single time-step predictions and extended time-series forecasting.

Earth2Studio maintains the largest collection of pre-trained state-of-the-art AI weather/climate models ranging from global forecast models to regional specialized models, covering various resolutions, architectures, and forecasting capabilities to suit different computational and accuracy requirements.

Available models include but are not limited to:

| Model | Resolution | Architecture | Time Step | Coverage |
|-------|------------|--------------|-----------|----------|
| GraphCast Small | 1.0° | Graph Neural Network | 6h | Global |
| GraphCast Operational | 0.25° | Graph Neural Network | 6h | Global |
| Pangu 3hr | 0.25° | Transformer | 3h | Global |
| Pangu 6hr | 0.25° | Transformer | 6h | Global |
| Pangu 24hr | 0.25° | Transformer | 24h | Global |
| Aurora | 0.25° | Transformer | 6h | Global |
| FuXi | 0.25° | Transformer | 6h | Global |
| AIFS | 0.25° | Transformer | 6h | Global |
| StormCast | 3km | Diffusion + Regression | 1h | Regional (US) |
| SFNO | 0.25° | Neural Operator | 6h | Global |
| DLESyM | 0.25° | Convolutional | 6h | Global |

For a complete list, see the [prognostic model API docs][e2studio_px_api].

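
The auto-regressive time integration described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the Earth2Studio API: `toy_step`, `rollout`, and the 10% damping update are hypothetical stand-ins for a trained model's forward pass, and a single scalar replaces the gridded fields a real model would use.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterator

State = dict[str, float]  # variable name -> value (a real model uses gridded tensors)


def toy_step(state: State) -> State:
    """Hypothetical one-step model: scale every field by 0.9."""
    return {name: 0.9 * value for name, value in state.items()}


def rollout(
    step: Callable[[State], State],
    start: datetime,
    state: State,
    nsteps: int,
    dt: timedelta = timedelta(hours=6),
) -> Iterator[tuple[datetime, State]]:
    """Yield the initial condition, then nsteps predictions, each fed from the last."""
    yield start, state
    for i in range(1, nsteps + 1):
        state = step(state)  # the previous output becomes the next input
        yield start + i * dt, state


forecast = list(rollout(toy_step, datetime(2024, 1, 1), {"t2m": 280.0}, nsteps=4))
for valid_time, fields in forecast:
    print(valid_time.isoformat(), round(fields["t2m"], 2))
```

With `nsteps=1` the same loop produces a single time-step prediction; larger values give an extended time series, with each lead time offset from the start by a multiple of the step size.
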
Diagnostic Models

[Diagnostic models][e2studio_dx_url] in Earth2Studio perform time-independent transformations, typically taking geospatial fields at a specific time and predicting new derived quantities without performing time integration. This lets users build pipelines that predict specific quantities of interest that forecasting models may not provide.

Earth2Studio contains a growing collection of specialized diagnostic models for various phenomena including precipitation prediction, tropical cyclone tracking, solar radiation estimation, wind gust forecasting, and more.

Available diagnostics include but are not limited to:

| Model | Resolution | Architecture | Coverage | Output |
|-------|------------|--------------|----------|--------|
| PrecipitationAFNO | 0.25° | Neural Operator | Global | Total precipitation |
| SolarRadiationAFNO1H | 0.25° | Neural Operator | Global | Surface solar radiation |
| WindgustAFNO | 0.25° | AFNO | Global | Maximum wind gust |
| TCTrackerVitart | 0.25° | Algorithmic | Global | TC tracks & properties |
| CBottleInfill | 100km | Diffusion | Global | Global climate sample |
| CBottleSR | 5km | Diffusion | Regional / Global | High-res climate |
| CorrDiff | Variable | Diffusion | Regional | Fine-scale weather |
| CorrDiffTaiwan | 2km | Diffusion | Regional (Taiwan) | Taiwan fine-scale weather |

For a complete list, see the [diagnostic model API docs][e2studio_dx_api].

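
To make the contrast with prognostic models concrete, here is a minimal sketch of a time-independent transformation. It is an assumption-laden toy, not one of the diagnostics listed above: `wind_speed_diagnostic` is a hypothetical function, and the variable names `u10m`, `v10m`, and `ws10m` are used purely for illustration.

```python
import math

Fields = dict[str, list[float]]  # variable name -> flattened grid values


def wind_speed_diagnostic(fields: Fields) -> Fields:
    """Hypothetical diagnostic: derive 10 m wind speed from its two components.

    No time stepping is involved; the output is valid at the same time as the input.
    """
    u, v = fields["u10m"], fields["v10m"]
    return {"ws10m": [math.hypot(ui, vi) for ui, vi in zip(u, v)]}


snapshot = {"u10m": [3.0, 0.0, -6.0], "v10m": [4.0, 2.0, 8.0]}
derived = wind_speed_diagnostic(snapshot)
print(derived)
```

Because the function only maps fields at one time to new fields at that same time, it can be applied to any step of a forecast, or directly to analysis data, without changing the rest of a pipeline.
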
Data Sources

[Data sources][e2studio_data_url] in Earth2Studio provide a standardized API for accessing weather and climate datasets from various providers (numerical models, data assimilation results, and AI-generated data), enabling seamless integration of initial conditions for model inference and validation data for scoring across different data formats and storage systems.

Earth2Studio includes data sources ranging from operational weather models (GFS, HRRR, IFS) and reanalysis datasets (ERA5 via ARCO, CDS) to AI-generated climate data (cBottle) and local file systems. Fetching data is straightforward: Earth2Studio handles the complicated parts and returns an easy-to-use Xarray data array of the requested data under a shared, package-wide [vocabulary][e2studio_lex_url] and coordinate system.

Available data sources include but are not limited to:

| Data Source | Type | Resolution | Coverage | Data Format |
|-------------|------|------------|----------|-------------|
| GFS | Operational | 0.25° | Global | GRIB2 |
| GFS_FX | Forecast | 0.25° | Global | GRIB2 |
| HRRR | Operational | 3km | Regional (US) | GRIB2 |
| HRRR_FX | Forecast | 3km | Regional (US) | GRIB2 |
| ARCO ERA5 | Reanalysis | 0.25° | Global | Zarr |
| CDS | Reanalysis | 0.25° | Global | NetCDF |
| IFS | Operational | 0.25° | Global | GRIB2 |
| NCAR_ERA5 | Reanalysis | 0.25° | Global | NetCDF |
| WeatherBench2 | Reanalysis | 0.25° | Global | Zarr |
| GEFS_FX | Ensemble Forecast | 0.25° | Global | GRIB2 |
| IMERG | Precipitation | 0.1° | Global | NetCDF |
| CBottle3D | AI Generated | 100km | Global | HEALPix |

For a complete list, see the [data source API docs][e2studio_data_api].

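
The value of a standardized data API is that downstream code does not care which provider sits behind it. The sketch below illustrates that idea only; `ToySource`, `ConstantSource`, and `fetch_initial_condition` are hypothetical names, and a plain dictionary stands in for the Xarray data array a real data source would return.

```python
from datetime import datetime
from typing import Protocol


class ToySource(Protocol):
    """Hypothetical shared interface: request times and variables, get values back."""

    def __call__(
        self, time: list[datetime], variable: list[str]
    ) -> dict[tuple[datetime, str], float]: ...


class ConstantSource:
    """Returns a fixed value per variable, standing in for a remote archive."""

    def __init__(self, values: dict[str, float]) -> None:
        self.values = values

    def __call__(self, time, variable):
        return {(t, v): self.values[v] for t in time for v in variable}


def fetch_initial_condition(source: ToySource, when: datetime, variables: list[str]):
    """Works with any source that follows the shared call signature."""
    return source([when], variables)


source_a = ConstantSource({"t2m": 280.0, "msl": 101325.0})
source_b = ConstantSource({"t2m": 275.5, "msl": 100800.0})
t0 = datetime(2024, 1, 1)
for src in (source_a, source_b):
    print(fetch_initial_condition(src, t0, ["t2m", "msl"]))
```

Swapping `source_a` for `source_b` requires no change to `fetch_initial_condition`, which is the property a shared vocabulary and call signature are meant to provide.
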
IO Backends

[IO backends][e2studio_io_url] in Earth2Studio provide a standardized interface for writing and storing pipeline outputs across different file formats and storage systems, enabling users to store inference outputs for later processing.

Earth2Studio includes IO backends ranging from traditional scientific formats (NetCDF) and modern cloud-optimized formats (Zarr) to in-memory storage backends.

Available IO backends include:

| IO Backend | Format | Features | Location |
|------------|--------|----------|----------|
| ZarrBackend | Zarr | Compression, Chunking | In-Memory/Local |
| AsyncZarrBackend | Zarr | Async writes, Parallel I/O | In-Memory/Local/Remote |
| NetCDF4Backend | NetCDF4 | CF-compliant, Metadata | In-Memory/Local |
| XarrayBackend | Xarray Dataset | Rich metadata, Analysis-ready | In-Memory |
| KVBackend | Key-Value | Fast Temporary Access | In-Memory |

For a complete list, see the [IO API docs][e2studio_io_api].

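
As a rough illustration of what a standardized write interface buys you, the sketch below puts an in-memory store and a file-backed store behind the same method. Everything here is hypothetical: `MemoryStore`, `JsonFileStore`, `save_forecast`, and the `write(name, values)` signature are assumptions made for the example and do not mirror the backends in the table, and JSON stands in for NetCDF or Zarr so that only the standard library is needed.

```python
import json
import tempfile
from pathlib import Path


class MemoryStore:
    """Hypothetical in-memory backend: keeps records in a dict."""

    def __init__(self) -> None:
        self.records: dict[str, list[float]] = {}

    def write(self, name: str, values: list[float]) -> None:
        self.records.setdefault(name, []).extend(values)


class JsonFileStore:
    """Hypothetical file backend: rewrites a JSON file on every write."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.records: dict[str, list[float]] = {}

    def write(self, name: str, values: list[float]) -> None:
        self.records.setdefault(name, []).extend(values)
        self.path.write_text(json.dumps(self.records))


def save_forecast(store, steps: list[float]) -> None:
    """Pipeline code depends only on the shared write() method."""
    for value in steps:
        store.write("t2m", [value])


memory = MemoryStore()
save_forecast(memory, [280.0, 279.5])

with tempfile.TemporaryDirectory() as tmp:
    on_disk = JsonFileStore(Path(tmp) / "output.json")
    save_forecast(on_disk, [280.0, 279.5])
    reloaded = json.loads(on_disk.path.read_text())

print(memory.records == reloaded)
```

The pipeline function is identical for both stores, so choosing between temporary in-memory results and persisted output becomes a one-line change at the call site.
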
Perturbation Methods

[Perturbation methods][e2studio_pb_url] in Earth2Studio provide a standardized interface for adding noise to data arrays, typically enabling the creation of ensemble forecast pipelines that capture uncertainty in weather and climate predictions.

Available perturbations include but are not limited to:

| Perturbation Method | Type | Spatial Correlation | Temporal Correlation |
|---------------------|------|---------------------|----------------------|
| Gaussian | Noise | None | None |
| Correlated SphericalGaussian | Noise | Spherical | AR(1) process |
| Spherical Gaussian | Noise | Spherical (Matern) | None |
| Brown | Noise | 2D Fourier | None |
| Bred Vector | Dynamical | Model-dependent | Model-dependent |
| Hemispheric Centred Bred Vector | Dynamical | Hemispheric | Model-dependent |

For a complete list, see the [perturbations API docs][e2studio_pb_url].

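
The simplest row in the table, Gaussian noise with no spatial or temporal correlation, can be sketched with the standard library alone. This is a toy under stated assumptions rather than the package's implementation: `gaussian_perturbation`, `make_ensemble`, the noise amplitude, and the seed are all hypothetical choices for the example.

```python
import random


def gaussian_perturbation(
    field: list[float], amplitude: float, rng: random.Random
) -> list[float]:
    """Add independent zero-mean Gaussian noise to every grid value."""
    return [x + rng.gauss(0.0, amplitude) for x in field]


def make_ensemble(
    field: list[float], nensemble: int, amplitude: float = 0.05, seed: int = 0
) -> list[list[float]]:
    """Create nensemble perturbed copies of one initial condition."""
    rng = random.Random(seed)  # seeded so the ensemble is reproducible
    return [gaussian_perturbation(field, amplitude, rng) for _ in range(nensemble)]


initial = [280.0, 281.5, 279.2]
members = make_ensemble(initial, nensemble=8)
for member in members:
    print([round(x, 3) for x in member])
```

Each member would then be advanced by the same forecast model, and the divergence between members gives an estimate of forecast uncertainty.
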
Statistics / Metrics

[Statistics and metrics][e2studio_stat_url] in Earth2Studio provide operations typically useful for in-pipeline evaluation of forecast performance across different dimensions (spatial, temporal, ensemble) through various statistical measures including error metrics, correlation coefficients, and ensemble verification statistics.

Available operations include but are not limited to:

| Statistic | Type | Application |
|-----------|------|-------------|
| RMSE | Error Metric | Forecast accuracy |
| ACC | Correlation | Pattern correlation |
| CRPS | Ensemble Metric | Probabilistic skill |
| Rank Histogram | Ensemble Metric | Ensemble reliability |
| Standard Deviation | Moment | Spread measure |
| Spread-Skill Ratio | Ensemble Metric | Ensemble calibration |

For a complete list, see the [statistics API docs][e2studio_stat_api].
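
Two of the metrics in the table are easy to state for small inputs. The sketch below uses unweighted textbook definitions and hypothetical function names (`rmse`, `ensemble_crps`); it makes no claim about how Earth2Studio weights grid cells or reduces over dimensions. The CRPS estimator is the common empirical form, mean |x_i - y| minus half of mean |x_i - x_j|, evaluated for a single scalar observation.

```python
import math
from statistics import mean


def rmse(forecast: list[float], truth: list[float]) -> float:
    """Root-mean-square error over paired values (no area weighting)."""
    return math.sqrt(mean((f - t) ** 2 for f, t in zip(forecast, truth)))


def ensemble_crps(members: list[float], observation: float) -> float:
    """Empirical CRPS for one scalar: E|X - y| - 0.5 * E|X - X'|."""
    accuracy = mean(abs(m - observation) for m in members)
    spread = mean(abs(a - b) for a in members for b in members)
    return accuracy - 0.5 * spread


print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(ensemble_crps([0.0, 2.0], 1.0))  # 0.5
print(ensemble_crps([3.0], 1.0))       # 2.0, the absolute error of a single member
```

A lower CRPS is better; for a one-member ensemble it reduces to the absolute error, which is why it is often described as a probabilistic generalization of that metric.
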

For a more complete list of features, be sure to view the documentation. Don't see what you need? Great news: extension and customization are at the heart of our design.

Contributors

Check out the contributing document for details about the technical requirements, and the user guide for higher-level philosophy, structure, and design.

License

Earth2Studio is provided under the Apache License 2.0; see the LICENSE file for the full license text.

Owner

  • Name: NVIDIA Corporation
  • Login: NVIDIA
  • Kind: organization
  • Location: 2788 San Tomas Expressway, Santa Clara, CA, 95051

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite it as below.
title: NVIDIA Earth2Studio
authors:
  - family-names: Geneva
    given-names: Nicholas
    orcid: https://orcid.org/0000-0003-4562-459X
  - family-names: Foster
    given-names: Dallas
    orcid: https://orcid.org/0000-0001-8459-9767
url: https://github.com/NVIDIA/earth2studio
repository-code: https://github.com/NVIDIA/earth2studio
date-released: 2024-04-22

GitHub Events

Total
  • Create event: 21
  • Release event: 6
  • Issues event: 161
  • Watch event: 125
  • Delete event: 13
  • Member event: 6
  • Issue comment event: 749
  • Push event: 249
  • Pull request review event: 405
  • Pull request review comment event: 354
  • Pull request event: 314
  • Fork event: 28
Last Year
  • Create event: 21
  • Release event: 6
  • Issues event: 161
  • Watch event: 126
  • Delete event: 13
  • Member event: 6
  • Issue comment event: 749
  • Push event: 249
  • Pull request review event: 408
  • Pull request review comment event: 355
  • Pull request event: 314
  • Fork event: 28

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 283
  • Total Committers: 19
  • Avg Commits per committer: 14.895
  • Development Distribution Score (DDS): 0.314
Past Year
  • Commits: 197
  • Committers: 17
  • Avg Commits per committer: 11.588
  • Development Distribution Score (DDS): 0.31
Top Committers
Name Email Commits
Nicholas Geneva 5****a 194
Dallas Foster d****f@n****m 30
Peter Harrington 4****n 10
Oliver Hennigh l****1@g****m 9
Marius 2****s 7
gertln g****l@n****m 7
dependabot[bot] 4****] 6
Jussi Leinonen j****n@n****m 3
Stefan Weissenberger s****g@n****m 3
Rodrigo Almeida r****4@o****t 2
Sai Krishnan Chandrasekar 1****v 2
Emmanuel Ferdman e****n@g****m 2
Akshay Subramaniam 6****r 2
Alberto Carpentieri 5****i 1
Kaustubh Tangsali 7****i 1
Luke Conibear 1****r 1
Manas Sahni s****s@g****m 1
Sean Lee 1****e 1
ivanauyeung 1****g 1

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 150
  • Total pull requests: 521
  • Average time to close issues: 13 days
  • Average time to close pull requests: 3 days
  • Total issue authors: 33
  • Total pull request authors: 20
  • Average comments per issue: 0.92
  • Average comments per pull request: 2.95
  • Merged pull requests: 418
  • Bot issues: 0
  • Bot pull requests: 23
Past Year
  • Issues: 116
  • Pull requests: 367
  • Average time to close issues: 14 days
  • Average time to close pull requests: 3 days
  • Issue authors: 31
  • Pull request authors: 17
  • Average comments per issue: 0.99
  • Average comments per pull request: 3.02
  • Merged pull requests: 280
  • Bot issues: 0
  • Bot pull requests: 23
Top Authors
Issue Authors
  • NickGeneva (74)
  • swbg (12)
  • mariusaurus (10)
  • gertln (6)
  • dallasfoster (5)
  • jleinonen (5)
  • mike-scchen (5)
  • rodrigoalmeida94 (3)
  • luke-conibear (3)
  • david5010 (2)
  • pzharrington (2)
  • bfouquet (2)
  • awesomemfg (1)
  • juliusberner (1)
  • meteoDaniel (1)
Pull Request Authors
  • NickGeneva (342)
  • dallasfoster (38)
  • loliverhennigh (24)
  • dependabot[bot] (23)
  • gertln (19)
  • mariusaurus (16)
  • pzharrington (16)
  • jleinonen (12)
  • rodrigoalmeida94 (5)
  • swbg (4)
  • akshaysubr (4)
  • albertocarpentieri (3)
  • saikrishnanc-nv (3)
  • ivanauyeung (2)
  • SeanSBLee (2)
Top Labels
Issue Labels
  • bug (85)
  • enhancement (45)
  • ? - Needs Triage (43)
  • documentation (22)
  • 2 - In Progress (9)
  • 1 - On Deck (6)
  • 0 - Backlog (5)
  • question (4)
  • wontfix (1)
  • ! - Release (1)
  • dependencies (1)
Pull Request Labels
  • 4 - In Review (23)
  • dependencies (23)
  • python (23)
  • 2 - In Progress (19)
  • 3 - Ready for Review (15)
  • 1 - On Deck (8)
  • ! - Release (6)
  • enhancement (1)
  • 5 - DO NOT MERGE (1)
  • bug (1)