intake-esm

An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.

https://github.com/intake/intake-esm

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    10 of 25 committers (40.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.3%) to scientific vocabulary

Keywords

cesm-lens climate-datasets cmip6 data-access data-catalog earth-system-model hacktoberfest intake pangeo

Keywords from Contributors

climate climate-model climate-analysis climate-science meteorology hydrology particles meshes geoscience verification
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 154
  • Watchers: 14
  • Forks: 50
  • Open Issues: 44
  • Releases: 28
Created about 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Codeowners

README.md

Intake-esm

Badges

| CI | GitHub Workflow Status Code Coverage Status pre-commit.ci status |
| :----------- | :----: |
| Docs | Documentation Status |
| Package | Conda PyPI Versions |
| License | License |
| Citation | Zenodo |

Motivation

Computer simulations of the Earth’s climate and weather generate huge amounts of data. These data are often persisted on HPC systems or in the cloud across multiple data assets in a variety of formats (netCDF, Zarr, etc.). Finding, investigating, and loading these data assets into compute-ready data containers costs time and effort. Before loading and analyzing a specific data set, the user needs to know which data sets are available and which attributes describe each one.

Finding, investigating, and loading these assets into data array containers such as xarray can be a daunting task because of the large number of files a user may be interested in. Intake-esm aims to address these issues by providing the functionality needed to search, discover, access, and load data.

Overview

intake-esm is a data cataloging utility built on top of intake, pandas, polars and xarray, and it's pretty awesome!

  • Opening an ESM catalog definition file: An Earth System Model (ESM) catalog file is a JSON file that conforms to the ESM Collection Specification. When provided a link/path to an ESM catalog file, intake-esm establishes a link to a database (CSV file) that contains data asset locations and associated metadata (i.e., which experiment and model they come from). The catalog JSON file can be stored on a local filesystem or hosted on a remote server.

```python
In [1]: import intake

In [2]: import intake_esm

In [3]: cat_url = intake_esm.tutorial.get_url("google_cmip6")

In [4]: cat = intake.open_esm_datastore(cat_url)

In [5]: cat
Out[5]:
```
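To make the JSON-plus-CSV relationship described above more concrete, here is a minimal standard-library sketch. It does not use intake-esm itself. The catalog contents, file names, and paths are hypothetical, and the JSON field names (`catalog_file`, `attributes`, `assets`, `column_name`) reflect one reading of the ESM Collection Specification, so treat them as assumptions rather than a reference.

```python
import csv
import io
import json

# Hypothetical, heavily trimmed catalog definition. The field names are
# assumptions based on the ESM Collection Specification, not copied from it.
catalog_json = """
{
  "esmcat_version": "0.1.0",
  "id": "example-catalog",
  "description": "Illustrative catalog, not a real one",
  "catalog_file": "example-catalog.csv",
  "attributes": [{"column_name": "experiment_id"}, {"column_name": "source_id"}],
  "assets": {"column_name": "path", "format": "zarr"}
}
"""

# Hypothetical CSV "database": one row per data asset.
catalog_csv = """experiment_id,source_id,path
historical,MODEL-A,/data/historical/model-a.zarr
ssp585,MODEL-A,/data/ssp585/model-a.zarr
"""

definition = json.loads(catalog_json)
rows = list(csv.DictReader(io.StringIO(catalog_csv)))

# The JSON definition says which CSV column holds the asset location
# and which columns hold descriptive metadata.
asset_column = definition["assets"]["column_name"]
attribute_columns = [a["column_name"] for a in definition["attributes"]]

for row in rows:
    metadata = {name: row[name] for name in attribute_columns}
    print(row[asset_column], metadata)
```

The point of the split is that the small JSON file describes how to interpret the table, while the CSV carries the potentially very large list of assets.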

  • Search and Discovery: intake-esm provides functionality to execute queries against the catalog:

```python
In [5]: cat_subset = cat.search(
   ...:     experiment_id=["historical", "ssp585"],
   ...:     table_id="Oyr",
   ...:     variable_id="o2",
   ...:     grid_label="gn",
   ...: )

In [6]: cat_subset
Out[6]:
```
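As a rough mental model of such a query, the sketch below filters a list of plain dictionaries. It assumes that a list of values matches any of its members and that all keyword arguments must match at once. The `search` helper and the sample rows are hypothetical stand-ins, not intake-esm's implementation.

```python
def search(rows, **query):
    """Return rows that satisfy every keyword; a list of values matches any member."""

    def matches(row):
        for column, wanted in query.items():
            # Normalize a single value to a one-element list.
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row.get(column) not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


# Hypothetical catalog rows, reduced to the columns used in the query.
rows = [
    {"experiment_id": "historical", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"experiment_id": "ssp585", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"experiment_id": "ssp126", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"experiment_id": "historical", "table_id": "Amon", "variable_id": "tas", "grid_label": "gn"},
]

subset = search(
    rows,
    experiment_id=["historical", "ssp585"],
    table_id="Oyr",
    variable_id="o2",
    grid_label="gn",
)
print([row["experiment_id"] for row in subset])
```

Only the first two rows survive: the third has a different experiment, and the fourth has a different table and variable.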

  • Access: when the user is satisfied with the results of their query, they can load data assets (netCDF and/or Zarr stores) into xarray datasets:

```python

In [7]: dset_dict = cat_subset.to_dataset_dict()

--> The keys in the returned dictionary of datasets are constructed as follows:
        'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
|███████████████████████████████████████████████████████████████| 100.00% [2/2 00:18<00:00]

```
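The key pattern printed above can be illustrated with a short standard-library sketch that groups catalog rows by those six columns and joins their values with a dot. The `group_assets` helper, the sample rows, and the paths are hypothetical. This only mimics the key layout and is not how intake-esm builds its datasets.

```python
from collections import defaultdict

# Columns named in the message printed by to_dataset_dict().
KEY_COLUMNS = [
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "table_id",
    "grid_label",
]


def group_assets(rows, key_columns=KEY_COLUMNS, sep="."):
    """Group asset paths under a key built by joining the key columns."""
    groups = defaultdict(list)
    for row in rows:
        key = sep.join(row[column] for column in key_columns)
        groups[key].append(row["path"])
    return dict(groups)


# Hypothetical rows: institution, model, and paths are placeholders.
rows = [
    {"activity_id": "CMIP", "institution_id": "INST", "source_id": "MODEL-A",
     "experiment_id": "historical", "table_id": "Oyr", "grid_label": "gn",
     "path": "/data/historical/part1.zarr"},
    {"activity_id": "CMIP", "institution_id": "INST", "source_id": "MODEL-A",
     "experiment_id": "historical", "table_id": "Oyr", "grid_label": "gn",
     "path": "/data/historical/part2.zarr"},
    {"activity_id": "ScenarioMIP", "institution_id": "INST", "source_id": "MODEL-A",
     "experiment_id": "ssp585", "table_id": "Oyr", "grid_label": "gn",
     "path": "/data/ssp585/part1.zarr"},
]

groups = group_assets(rows)
for key, paths in groups.items():
    print(key, len(paths))
```

Rows that share all six attribute values end up under the same key, which is why two assets can map to a single dataset entry.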

See documentation for more information.

Installation

Intake-esm can be installed from PyPI with pip:

```bash
python -m pip install intake-esm
```

It is also available from conda-forge for conda installations:

```bash
conda install -c conda-forge intake-esm
```
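After either install route, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata. This is a small sketch using only the standard library. The `installed_version` helper is a hypothetical name, and it returns `None` when the distribution is not found.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="intake-esm"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version())
```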

Owner

  • Name: Intake
  • Login: intake
  • Kind: organization
  • Email: intakedev@gmail.com

Taking the pain out of data access and distribution

GitHub Events

Total
  • Create event: 34
  • Release event: 2
  • Issues event: 25
  • Watch event: 16
  • Delete event: 29
  • Member event: 3
  • Issue comment event: 87
  • Push event: 143
  • Pull request review comment event: 6
  • Pull request review event: 22
  • Pull request event: 69
  • Fork event: 5
Last Year
  • Create event: 34
  • Release event: 2
  • Issues event: 25
  • Watch event: 16
  • Delete event: 29
  • Member event: 3
  • Issue comment event: 87
  • Push event: 143
  • Pull request review comment event: 6
  • Pull request review event: 22
  • Pull request event: 69
  • Fork event: 5

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 1,025
  • Total Committers: 25
  • Avg Commits per committer: 41.0
  • Development Distribution Score (DDS): 0.285
Past Year
  • Commits: 41
  • Committers: 9
  • Avg Commits per committer: 4.556
  • Development Distribution Score (DDS): 0.707
Top Committers
Name Email Commits
Anderson Banihirwe a****e@u****u 733
mclong m****g@u****u 77
pre-commit-ci[bot] 6****] 68
dependabot[bot] 4****] 48
Max Grover m****x@g****m 19
bonnland b****d@u****u 14
Julia Kent 4****t 13
Charles Turner 5****1 12
Pascal Bourgault b****l@o****a 12
AS a****g@m****e 4
Paul Branson b****7@o****u 4
Dougie Squire 4****e 4
Sadie L. Bartholomew s****w@n****k 2
Trevor James Smith 1****e 2
jbusecke j****s@l****u 2
Paul Branson p****n@c****u 2
Aaron Spring a****g 1
Hauke Schulz 4****s 1
Jared Lewis j****d@j****z 1
Romain Beucher r****r@a****u 1
RondeauG 3****G 1
Sebastián Blanco s****g@e****m 1
Joseph Hamman j****n@u****u 1
Tobias Kölling t****i@d****e 1
garciampred 9****d 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 234
  • Total pull requests: 526
  • Average time to close issues: 3 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 67
  • Total pull request authors: 28
  • Average comments per issue: 3.55
  • Average comments per pull request: 0.95
  • Merged pull requests: 483
  • Bot issues: 1
  • Bot pull requests: 151
Past Year
  • Issues: 16
  • Pull requests: 80
  • Average time to close issues: 25 days
  • Average time to close pull requests: 6 days
  • Issue authors: 8
  • Pull request authors: 8
  • Average comments per issue: 2.13
  • Average comments per pull request: 0.75
  • Merged pull requests: 66
  • Bot issues: 1
  • Bot pull requests: 37
Top Authors
Issue Authors
  • andersy005 (42)
  • jbusecke (23)
  • matt-long (19)
  • aulemahal (17)
  • charles-turner-1 (9)
  • ahuang11 (8)
  • aaronspring (7)
  • mgrover1 (7)
  • naomi-henderson (7)
  • wachsylon (6)
  • dougiesquire (5)
  • aradhakrishnanGFDL (4)
  • fanchic (4)
  • jukent (4)
  • RondeauG (3)
Pull Request Authors
  • andersy005 (243)
  • pre-commit-ci[bot] (85)
  • dependabot[bot] (66)
  • charles-turner-1 (31)
  • mgrover1 (20)
  • matt-long (17)
  • jukent (14)
  • aulemahal (13)
  • dougiesquire (4)
  • Zeitsperre (4)
  • sadielbartholomew (4)
  • aaronspring (3)
  • bonnland (3)
  • garciampred (2)
  • rbeucher (2)
Top Labels
Issue Labels
enhancement (41) bug (33) usage question (28) documentation (7) discuss (3) good first issue (3) needs triage (3) feature (2) awaiting more information (2) upstream issue (2) help wanted (1) dependencies (1)
Pull Request Labels
dependencies (77) maintenance (51) enhancement (36) CI (27) documentation (23) usage question (14) bug-fix (13) feature (6) github_actions (6) internal-change (5)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 16,601 last-month
  • Total docker downloads: 1,563
  • Total dependent packages: 18
    (may contain duplicates)
  • Total dependent repositories: 86
    (may contain duplicates)
  • Total versions: 47
  • Total maintainers: 3
pypi.org: intake-esm

An intake plugin for parsing an Earth System Model (ESM) catalog and loading netCDF files and/or Zarr stores into Xarray datasets.

  • Versions: 28
  • Dependent Packages: 15
  • Dependent Repositories: 22
  • Downloads: 16,601 Last month
  • Docker Downloads: 1,563
Rankings
Dependent packages count: 1.6%
Downloads: 1.9%
Docker downloads count: 2.3%
Dependent repos count: 3.1%
Average: 3.6%
Forks count: 6.3%
Stargazers count: 6.6%
Maintainers (3)
Last synced: 6 months ago
conda-forge.org: intake-esm

An intake plugin for parsing an Earth System Model (ESM) collection/catalog and loading assets (netCDF files and/or Zarr stores) into xarray data sets.

  • Versions: 19
  • Dependent Packages: 3
  • Dependent Repositories: 64
Rankings
Dependent repos count: 4.4%
Dependent packages count: 15.6%
Average: 20.3%
Forks count: 28.4%
Stargazers count: 32.9%
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • dask >=2021.9
  • fastprogress >=1.0.0
  • fsspec >=2021.7.0
  • intake >=0.6.5
  • netCDF4 >=1.5.5
  • pydantic >=1.8.2
  • requests >=2.24.0
  • xarray >=0.19,
  • xcollection *
  • zarr >=2.5
.github/workflows/ci.yaml actions
  • actions/checkout v3 composite
  • codecov/codecov-action v3.1.1 composite
  • mamba-org/provision-with-micromamba main composite
.github/workflows/pypi.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pypa/gh-action-pypi-publish v1.6.4 composite
pyproject.toml pypi
ci/environment.yml conda
  • cftime
  • codecov
  • fastprogress >=1.0.0
  • fsspec >=2022.11.0
  • gcsfs >=2022.11.0
  • h5netcdf >=0.8.1
  • intake <2.0
  • ipython
  • matplotlib
  • netcdf4 >=1.5.5
  • pandas >=2.1.0
  • pip
  • pooch
  • pre-commit
  • pydantic >=2.0
  • pydap
  • pytest
  • pytest-cov
  • pytest-mock
  • pytest-sugar
  • pytest-xdist
  • s3fs >=2022.11.0
  • scipy
  • xarray >=2022.06
  • xarray-datatree
  • zarr >=2.12