deepsensor

A Python package for tackling diverse environmental prediction tasks with NPs.

https://github.com/alan-turing-institute/deepsensor

Science Score: 75.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    3 of 13 committers (23.1%) from academic institutions
  • Institutional organization owner
    Organization alan-turing-institute has institutional domain (turing.ac.uk)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.2%) to scientific vocabulary

Keywords from Contributors

hut23 gaussian-processes mcmc uncertainty-quantification ai-governance hut23-958 hut23-1205 degoogle
Last synced: 6 months ago

Repository

A Python package for tackling diverse environmental prediction tasks with NPs.

Statistics
  • Stars: 115
  • Watchers: 6
  • Forks: 20
  • Open Issues: 17
  • Releases: 30
Created almost 3 years ago · Last pushed 9 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

    A Python package and open-source project for modelling environmental data with neural processes



    DeepSensor streamlines the application of neural processes (NPs) to environmental sciences by providing a simple interface for building, training, and evaluating NPs using xarray and pandas data. Our developers and users form an open-source community whose vision is to accelerate the next generation of environmental ML research. The DeepSensor Python package facilitates this by drastically reducing the time and effort required to apply NPs to environmental prediction tasks. This allows DeepSensor users to focus on the science and rapidly iterate on ideas.

    DeepSensor is an experimental package, and we welcome contributions from the community. We have an active Slack channel for code and research discussions; you can join by signing up for the Turing Environment & Sustainability stakeholder community. The form includes a question on signing up for the Slack team, where you can find DeepSensor's channel.

    DeepSensor example application figures

    Why neural processes?

    NPs are a highly flexible class of probabilistic models that offer unique opportunities to model satellite observations, climate model output, and in-situ measurements. Their key features are the ability to:

    • ingest multiple data streams of pointwise or gridded modalities
    • handle missing data and varying resolutions
    • predict at arbitrary target locations
    • quantify prediction uncertainty

    These capabilities make NPs well suited to a range of spatio-temporal data fusion tasks such as downscaling, sensor placement, gap-filling, and forecasting.

    Why DeepSensor?

    This package aims to faithfully match the flexibility of NPs with a simple and intuitive interface. Under the hood, DeepSensor wraps around the powerful neuralprocesses package for core modelling functionality, while allowing users to stay in the familiar xarray and pandas world from end to end. DeepSensor also provides convenient plotting tools and active learning functionality for finding optimal sensor placements.

    Documentation

    We have an extensive documentation page containing steps for getting started, a user guide built from reproducible Jupyter notebooks, learning resources, research ideas, community information, an API reference, and more!

    DeepSensor Gallery

    For real-world DeepSensor research demonstrators, check out the DeepSensor Gallery. Consider submitting a notebook showcasing your research!

    Deep learning library agnosticism

    DeepSensor leverages the backends package to be compatible with either PyTorch or TensorFlow. Simply import deepsensor.torch or import deepsensor.tensorflow to choose between them!
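As a minimal sketch of the backend choice described above, the helper below checks which deep learning framework is installed and returns the name of the matching DeepSensor backend module. The function name `pick_backend` and the preference order are assumptions made for illustration and are not part of DeepSensor; only the module names `deepsensor.torch` and `deepsensor.tensorflow` come from the text.

```python
import importlib.util


def pick_backend(preferred=("torch", "tensorflow")):
    """Return the DeepSensor backend module name for the first installed framework.

    Hypothetical helper: it only inspects which top-level packages can be
    found and does not import anything itself.
    """
    for framework in preferred:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(framework) is not None:
            return f"deepsensor.{framework}"
    return None


backend = pick_backend()
# With DeepSensor installed, the chosen backend could then be activated with:
#     import importlib
#     importlib.import_module(backend)
```

Writing `import deepsensor.torch` or `import deepsensor.tensorflow` directly, as the README suggests, remains the simplest option when the framework is known in advance.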

    Quick start

    Here we will demonstrate a simple example of training a convolutional conditional neural process (ConvCNP) to spatially interpolate random grid cells of NCEP reanalysis air temperature data over the US. First, pip install the package. In this case we will use the PyTorch backend (note: follow the PyTorch installation instructions if you want GPU support).

    ```bash
    pip install deepsensor[torch]
    ```

    We can go from imports to predictions with a trained model in fewer than 30 lines of code!

    ```python
    import deepsensor.torch
    from deepsensor.data import DataProcessor, TaskLoader
    from deepsensor.model import ConvNP
    from deepsensor.train import Trainer

    import xarray as xr
    import pandas as pd
    import numpy as np
    from tqdm import tqdm

    # Load raw data
    ds_raw = xr.tutorial.open_dataset("air_temperature")

    # Normalise data
    data_processor = DataProcessor(x1_name="lat", x2_name="lon")
    ds = data_processor(ds_raw)

    # Set up task loader
    task_loader = TaskLoader(context=ds, target=ds)

    # Set up ConvNP, which by default instantiates a ConvCNP with Gaussian marginals
    model = ConvNP(data_processor, task_loader)

    # Generate training tasks with up to 100 grid cells as context and all grid cells
    # as targets
    train_tasks = []
    for date in pd.date_range("2013-01-01", "2014-11-30")[::7]:
        N_context = np.random.randint(0, 100)
        task = task_loader(date, context_sampling=N_context, target_sampling="all")
        train_tasks.append(task)

    # Train model
    trainer = Trainer(model, lr=5e-5)
    for epoch in tqdm(range(10)):
        batch_losses = trainer(train_tasks)

    # Predict on new task with 50 context points and a dense grid of target points
    test_task = task_loader("2014-12-31", context_sampling=50)
    pred = model.predict(test_task, X_t=ds_raw)
    ```
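To make the training-task schedule in the quick start concrete, here is a rough standard-library sketch of the same sampling pattern: every seventh day between 2013-01-01 and 2014-11-30, each paired with a random context size. It stands in for `pd.date_range(...)[::7]` and `np.random.randint(0, 100)` and builds no DeepSensor tasks; the seed and the variable names are assumptions for illustration.

```python
import random
from datetime import date, timedelta

start, end = date(2013, 1, 1), date(2014, 11, 30)
n_days = (end - start).days + 1  # both endpoints are included, as in pd.date_range

# Every seventh day, mirroring the [::7] slice in the quick start.
train_dates = [start + timedelta(days=offset) for offset in range(0, n_days, 7)]

rng = random.Random(0)  # hypothetical fixed seed, only for reproducibility here
# randrange(0, 100) draws from 0..99, the same half-open range as np.random.randint(0, 100).
context_sizes = [rng.randrange(0, 100) for _ in train_dates]

print(len(train_dates), train_dates[0], train_dates[-1])
```

Under these assumptions the schedule yields 100 training dates, so each of the 10 epochs in the quick start passes over 100 tasks.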

    After training, the model can predict directly to xarray in your data's original units and coordinate system:

    ```python
    >>> pred["air"]
    Dimensions:  (time: 1, lat: 25, lon: 53)
    Coordinates:
      * time     (time) datetime64[ns] 2014-12-31
      * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
      * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
    Data variables:
        mean     (time, lat, lon) float32 267.7 267.2 266.4 ... 297.5 297.8 297.9
        std      (time, lat, lon) float32 9.855 9.845 9.848 ... 1.356 1.36 1.487
    ```
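Because the default model uses Gaussian marginals, each `mean` and `std` pair in the output can be turned into a central prediction interval. The sketch below does this for the first grid cell shown above (mean 267.7, std 9.855) using only the standard library; the helper name `central_interval` is an assumption for illustration and not a DeepSensor function.

```python
from statistics import NormalDist


def central_interval(mean, std, level=0.95):
    """Return the (lower, upper) bounds of a central Gaussian interval."""
    dist = NormalDist(mu=mean, sigma=std)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# First grid cell of the example output above.
lower, upper = central_interval(267.7, 9.855)
print(f"95% interval: {lower:.1f} to {upper:.1f}")
```

The same calculation could be applied elementwise to the full `mean` and `std` arrays with xarray or NumPy arithmetic.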

    We can also predict directly to pandas containing a timeseries of predictions at off-grid locations by passing a numpy array of target locations to the X_t argument of .predict:

    ```python
    # Predict at two off-grid locations over December 2014 with 50 random, fixed context points
    test_tasks = task_loader(pd.date_range("2014-12-01", "2014-12-31"), 50, seed_override=42)
    pred = model.predict(test_tasks, X_t=np.array([[50, 280], [40, 250]]).T)
    ```

    ```python
    >>> pred["air"]
                              mean       std
    time       lat lon
    2014-12-01 50  280  260.282562  5.743976
               40  250  270.770111  4.271546
    2014-12-02 50  280  255.572098  6.165956
               40  250  277.588745  3.727404
    2014-12-03 50  280  260.894196   6.02924
    ...                        ...       ...
    2014-12-29 40  250  266.594421  4.268469
    2014-12-30 50  280  250.936386  7.048379
               40  250  262.225464  4.662592
    2014-12-31 50  280  249.397919  7.167142
               40  250  257.955505  4.697775

    [62 rows x 2 columns]
    ```
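The layout of the target array and the row count of the result can be checked by hand. The sketch below assumes, as the `.T` in the example suggests, that `X_t` holds one row per coordinate (first `lat`, then `lon`) and one column per location. It uses plain Python lists in place of NumPy arrays, and all variable names are illustrative.

```python
from datetime import date, timedelta

# Locations written as (lat, lon) pairs, as in the example call.
locations = [(50, 280), (40, 250)]

# Transposing the pairs gives one row of latitudes and one row of longitudes,
# which is what np.array(locations).T produces.
lats, lons = (list(axis) for axis in zip(*locations))
x_t = [lats, lons]  # [[50, 40], [280, 250]]

# One prediction row per (day, location) pair across December 2014.
days = [date(2014, 12, 1) + timedelta(days=i) for i in range(31)]
rows = [(day, lat, lon) for day in days for lat, lon in zip(*x_t)]

print(x_t, len(rows))
```

With 31 days and 2 locations this gives 62 rows, matching the `[62 rows x 2 columns]` footer of the output above.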

    DeepSensor offers far more functionality than this simple example demonstrates. For more information on the package's capabilities, check out the User Guide in the documentation.

    Citing DeepSensor

    If you use DeepSensor in your research, please consider citing this repository. You can generate a BibTeX entry by clicking the 'Cite this repository' button on the top right of this page.

    Funding

    DeepSensor has received funding from The Alan Turing Institute under the Environmental monitoring: blending satellite and surface data project, and from ARIA's Forecasting Tipping Points programme and the ARIA 'GRAIL' project. The PI for DeepSensor development is Dr Scott Hosking.

    Contributors

    We appreciate all contributions to DeepSensor, big or small, code-related or not, and we thank all contributors below for supporting open-source software and research. For code-specific contributions, check out our graph of code contributions. See our contribution guidelines if you would like to join this list!

    • Alejandro ©: 📓 🐛 🧑‍🏫 🤔 🔬 💻 ⚠️
    • Anna Vaughan: 🔬
    • Dani Jones: 🐛
    • David Wilby: 📖 ⚠️ 🚧 🐛
    • Jim Circadian: 🤔 📆 🚧
    • Jonas Scholz: 📓 🔬 💻 🐛 🤔
    • Kalle Westerling: 📖 🚇 🤔 📆 📣 💬
    • Kenza Tazi: 🤔
    • Magnus Ross: 🔣
    • Nils Lehmann: 🤔 📓 🐛
    • Paolo Pelucchi: 📓 🐛
    • Rohit Singh Rathaur: 💻
    • Scott Hosking: 🔍 🤔 📆
    • Tom Andersson: 💻 🔬 🚧 🐛 ⚠️ 📖 👀 📢 💬
    • Wessel: 🔬 💻 🤔
    • Zeel B Patel: 🐛 💻 📓 🤔
    • holzwolf: 🐛
    • ots22: 🤔
    • vinayakrana: 📖

Owner

  • Name: The Alan Turing Institute
  • Login: alan-turing-institute
  • Kind: organization
  • Email: info@turing.ac.uk

The UK's national institute for data science and artificial intelligence.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: 'DeepSensor: A Python package for modelling environmental data with convolutional neural processes'
message: >-
  If you use DeepSensor in your research, please cite it
  using the information below.
type: software
authors:
  - given-names: Tom Robin
    family-names: Andersson
    email: tomandersson3@gmail.com
    affiliation: Google DeepMind
    orcid: 'https://orcid.org/0000-0002-1556-9932'
repository-code: 'https://github.com/alan-turing-institute/deepsensor'
abstract: >-
  DeepSensor is a Python package for modelling environmental
  data with convolutional neural processes (ConvNPs).
  ConvNPs are versatile deep learning models capable of
  ingesting multiple environmental data streams of varying
  modalities and resolutions, handling missing data, and
  predicting at arbitrary target locations with uncertainty.
  DeepSensor allows users to tackle a diverse array of
  environmental prediction tasks, including downscaling
  (super-resolution), sensor placement, gap-filling, and
  forecasting. The library includes a user-friendly
  pandas/xarray interface, automatic unnormalisation of
  model predictions, active learning functionality,
  integration with both PyTorch and TensorFlow, and model
  customisation. DeepSensor streamlines and simplifies the
  environmental data modelling pipeline, enabling
  researchers and practitioners to harness the potential of
  ConvNPs for complex environmental prediction challenges.
keywords:
  - machine learning
  - environmental science
  - neural processes
  - active learning
license: MIT
version: 0.4.2
date-released: '2024-10-20'

GitHub Events

Total
  • Create event: 14
  • Commit comment event: 2
  • Release event: 3
  • Issues event: 12
  • Watch event: 40
  • Delete event: 11
  • Issue comment event: 58
  • Push event: 85
  • Pull request review event: 36
  • Pull request review comment event: 36
  • Pull request event: 44
  • Fork event: 6
Last Year
  • Create event: 14
  • Commit comment event: 2
  • Release event: 3
  • Issues event: 12
  • Watch event: 40
  • Delete event: 11
  • Issue comment event: 58
  • Push event: 85
  • Pull request review event: 36
  • Pull request review comment event: 36
  • Pull request event: 44
  • Fork event: 6

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 742
  • Total Committers: 13
  • Avg Commits per committer: 57.077
  • Development Distribution Score (DDS): 0.204
Past Year
  • Commits: 86
  • Committers: 8
  • Avg Commits per committer: 10.75
  • Development Distribution Score (DDS): 0.57
Top Committers
Name Email Commits
Tom Andersson t****d@b****k 591
allcontributors[bot] 4****] 53
davidwilby 2****y 44
Kalle Westerling k****g@b****k 24
Kalle Westerling 7****g 7
Scott Hosking j****g@g****m 5
Jonas Scholz j****3@g****m 4
polpel 5****l 4
RohitRathore1 r****5@g****m 3
Kishan Ved k****d@i****n 2
patel-zeel p****l@i****n 2
Alejandro © a****c@g****m 2
vinayakrana 9****a 1
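
The all-time figures above can be reproduced from the commit counts in the table. This page does not state how the Development Distribution Score is computed; the sketch below assumes the common definition, one minus the top committer's share of all commits, which agrees with the reported values. The function name is illustrative.

```python
# All-time commit counts from the Top Committers table, in listed order.
commits = [591, 53, 44, 24, 7, 5, 4, 4, 3, 2, 2, 2, 1]


def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    return 1.0 - max(commit_counts) / sum(commit_counts)


total = sum(commits)
average = total / len(commits)
dds = development_distribution_score(commits)

print(total, round(average, 3), round(dds, 3))
```

A score near 0 means one committer dominates the history; a score near 1 means commits are spread evenly across committers.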

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 27
  • Total pull requests: 60
  • Average time to close issues: 3 months
  • Average time to close pull requests: 29 days
  • Total issue authors: 14
  • Total pull request authors: 12
  • Average comments per issue: 3.52
  • Average comments per pull request: 1.57
  • Merged pull requests: 46
  • Bot issues: 0
  • Bot pull requests: 13
Past Year
  • Issues: 8
  • Pull requests: 31
  • Average time to close issues: 8 days
  • Average time to close pull requests: 28 days
  • Issue authors: 8
  • Pull request authors: 8
  • Average comments per issue: 0.63
  • Average comments per pull request: 1.19
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 4
Top Authors
Issue Authors
  • tom-andersson (8)
  • davidwilby (3)
  • DaniJonesOcean (3)
  • scotthosking (2)
  • acocac (2)
  • bnubald (1)
  • feyza-droid (1)
  • simonrolph (1)
  • Opio-Cornelius (1)
  • holzwolf (1)
  • magnusross (1)
  • nilsleh (1)
  • kallewesterling (1)
  • kimbente (1)
Pull Request Authors
  • davidwilby (32)
  • allcontributors[bot] (15)
  • Kishan-Ved (4)
  • MartinSJRogers (4)
  • tom-andersson (2)
  • vinayakrana (2)
  • acocac (2)
  • raybellwaves (2)
  • kallewesterling (2)
  • magnusross (1)
  • RohitRathore1 (1)
  • nilsleh (1)
  • jonas-scholz123 (1)
Top Labels
Issue Labels
bug (7) maintenance (3) good first issue (1) enhancement (1) question (1) thoughts welcome (1) help wanted (1)
Pull Request Labels
bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 159 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 29
  • Total maintainers: 2
pypi.org: deepsensor

A Python package for modelling xarray and pandas data with neural processes.

  • Versions: 29
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 159 Last month
Rankings
Dependent packages count: 7.2%
Downloads: 10.5%
Stargazers count: 16.6%
Average: 21.2%
Forks count: 30.3%
Dependent repos count: 41.4%
Maintainers (2)
Last synced: 6 months ago

Dependencies

.github/workflows/publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/style.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/tests.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • coverallsapp/github-action v1 composite
.github/workflows/docs.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • peaceiris/actions-gh-pages v3 composite
docs/requirements.txt pypi
  • jupyter-book *
  • matplotlib *
  • numpy *
pyproject.toml pypi
requirements/requirements.dev.txt pypi
  • coveralls * development
  • parameterized * development
  • pytest * development
  • pytest-cov * development
  • tox * development
  • tox-gh-actions * development
requirements/requirements.docs.txt pypi
  • jupyter-book ==0.15.1
  • sphinx *
requirements/requirements.txt pypi
  • backends *
  • backends-matrix *
  • dask *
  • distributed *
  • gcsfs *
  • jupyter *
  • matplotlib *
  • neuralprocesses >=0.2.2
  • numpy *
  • pandas *
  • pooch *
  • pyshp *
  • rioxarray *
  • seaborn *
  • shapely *
  • tensorflow *
  • tensorflow_probability *
  • torch >=2
  • tqdm *
  • xarray *
  • zarr *
setup.py pypi