DAPPER

DAPPER: Data Assimilation with Python: a Package for Experimental Research - Published in JOSS (2024)

https://github.com/nansencenter/dapper

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 14 DOI reference(s) in README and JOSS metadata
✓ Academic publication links: Links to joss.theoj.org
✓ Committers with academic emails: 1 of 7 committers (14.3%) from academic institutions
○ Institutional organization owner
✓ JOSS paper metadata: Published in Journal of Open Source Software

Keywords

bayesian-filter bayesian-methods chaos data-assimilation enkf kalman kalman-filtering particle-filter state-estimation

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 31% confidence

Last synced: 6 months ago

Repository

Data Assimilation with Python: a Package for Experimental Research

Basic Info

Host: GitHub
Owner: nansencenter
License: mit
Language: Python
Default Branch: master
Homepage: https://nansencenter.github.io/DAPPER
Size: 9.12 MB

Statistics

Stars: 385
Watchers: 26
Forks: 129
Open Issues: 18
Releases: 0

Topics

bayesian-filter bayesian-methods chaos data-assimilation enkf kalman kalman-filtering particle-filter state-estimation

Created over 9 years ago · Last pushed 6 months ago

Metadata Files

Readme

DAPPER is a set of templates for benchmarking the performance of data assimilation (DA) methods. The numerical experiments provide support and guidance for new developments in DA. The typical set-up is a synthetic (twin) experiment, where you specify a dynamic model and an observational model, and use these to generate a synthetic truth (multivariate time series), and then estimate that truth given the models and noisy observations.
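The twin-experiment workflow described above can be sketched in a few lines of plain Python. This is a minimal illustration only, not DAPPER's API: the scalar linear model, the noise variances, and the function name `twin_experiment` are all assumptions chosen for brevity, and a textbook Kalman filter stands in for the DA method being benchmarked.

```python
import random


def twin_experiment(n_steps=200, a=0.9, q=0.1, r=0.5, seed=0):
    """Scalar linear-Gaussian twin experiment with a Kalman filter.

    Dynamics:     x[k] = a * x[k-1] + noise of variance q
    Observation:  y[k] = x[k] + noise of variance r
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)  # synthetic truth (initial state)
    mean, var = 0.0, 1.0     # filter estimate and its error variance
    sq_err_filter = 0.0
    sq_err_obs = 0.0
    for _ in range(n_steps):
        # 1. Advance the truth and generate a noisy observation of it.
        x = a * x + rng.gauss(0.0, q ** 0.5)
        y = x + rng.gauss(0.0, r ** 0.5)
        # 2. Forecast: propagate the estimate with the same dynamic model.
        mean = a * mean
        var = a * a * var + q
        # 3. Analysis: update the estimate with the observation.
        gain = var / (var + r)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        # 4. Score against the truth, which is known in a twin experiment.
        sq_err_filter += (mean - x) ** 2
        sq_err_obs += (y - x) ** 2
    rmse_filter = (sq_err_filter / n_steps) ** 0.5
    rmse_obs = (sq_err_obs / n_steps) ** 0.5
    return rmse_filter, rmse_obs


rmse_filter, rmse_obs = twin_experiment()
print(f"RMSE of filter estimate:  {rmse_filter:.3f}")
print(f"RMSE of raw observations: {rmse_obs:.3f}")
```

Because the truth is synthetic, the error of the estimate can be measured directly, which is what makes this set-up convenient for benchmarking.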

Getting started

Read & run docs/examples/basic_1.py and basic_2.py, or their corresponding notebooks (requires Google login).
This screencast provides an overview of DAPPER.
Install.
The documentation includes general guidelines and the API reference, but most users must expect to read the code as well.
If used towards a publication, please cite as "The experiments used (inspiration from) DAPPER [ref], version 1.6.0", or similar, where [ref] points to .
Also see the interactive tutorials on DA theory with Python.

Highlights

DAPPER enables the numerical investigation of DA methods through a variety of typical test cases and statistics. It (a) reproduces numerical benchmark results reported in the literature, and (b) facilitates comparative studies, thus promoting the (a) reliability and (b) relevance of the results. For example, the figure below is generated by docs/examples/basic_3.py and reproduces figure 5.7 of these lecture notes. DAPPER is (c) open source, written in Python, and (d) focuses on readability; this promotes the (c) reproduction and (d) dissemination of the underlying science, and makes it easy to adapt and extend.

Comparative benchmarks with Lorenz-96 plotted as a function of the ensemble size (N)

DAPPER demonstrates how to parallelise ensemble forecasts (e.g., the QG model), local analyses (e.g., the LETKF), and independent experiments (e.g., docs/examples/basic_3.py). It includes a battery of diagnostics and statistics, which all get averaged over subdomains (e.g., "ocean" and "land") and then in time. Confidence intervals are computed, including correction for auto-correlations, and used for uncertainty quantification, and significant digits printing. Several diagnostics are included in the on-line "liveplotting" illustrated below, which may be paused for further interactive inspection. In summary, DAPPER is well suited for teaching and fundamental DA research. Also see its drawbacks.
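To illustrate why a correction for auto-correlation matters when computing confidence intervals of time averages, here is a minimal sketch. It uses a common lag-1 (AR(1)-style) effective-sample-size formula, which is an assumption made for illustration; DAPPER's own estimator may well differ. The function name `mean_and_standard_errors` is hypothetical.

```python
import math
import random


def mean_and_standard_errors(series):
    """Return (mean, naive standard error, autocorrelation-corrected one).

    The correction shrinks the sample size to
    n_eff = n * (1 - rho) / (1 + rho), with rho the lag-1 autocorrelation.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [s - mean for s in series]
    sum_sq = sum(d * d for d in dev)
    var = sum_sq / (n - 1)
    # Lag-1 autocorrelation estimate, clipped to keep n_eff finite and <= n.
    rho = sum(dev[i] * dev[i + 1] for i in range(n - 1)) / sum_sq
    rho = max(0.0, min(rho, 0.99))
    n_eff = n * (1.0 - rho) / (1.0 + rho)
    naive_se = math.sqrt(var / n)
    corrected_se = math.sqrt(var / n_eff)
    return mean, naive_se, corrected_se


# A strongly auto-correlated test series (AR(1) with coefficient 0.8).
rng = random.Random(1)
series, value = [], 0.0
for _ in range(2000):
    value = 0.8 * value + rng.gauss(0.0, 1.0)
    series.append(value)

mean, naive_se, corrected_se = mean_and_standard_errors(series)
print(f"mean = {mean:.3f}")
print(f"95% CI half-width, naive:     {1.96 * naive_se:.3f}")
print(f"95% CI half-width, corrected: {1.96 * corrected_se:.3f}")
```

For positively correlated statistics, such as consecutive RMSE values from a filter, the naive interval is too narrow, so comparisons between methods would look more significant than they are.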

EnKF - Lorenz-96

Installation

Successfully tested on Linux/Mac/Windows.

Prerequisite: Python>=3.9

If you're an expert, set up a Python environment however you like. Otherwise: Install Anaconda, then open the Anaconda terminal and run the following commands:

conda create --yes --name dapper-env python=3.12
conda activate dapper-env
python --version

Ensure the printed version is as desired. Keep using the same terminal for the commands below.

Install

Either: Install for development (recommended)

Do you want the DAPPER code available to play around with? Then

Download and unzip (or git clone) DAPPER.
Move the resulting folder wherever you like,
and cd into it (ensure you're in the folder with a setup.py file).
pip install -e '.'

Or: Install as library

Do you just want to run a script that requires DAPPER? Then

If the script comes with a requirements.txt file that lists DAPPER, then do
pip install -r path/to/requirements.txt.
If not, hopefully you know the version of DAPPER needed. Run
pip install dapper==1.6.0 to get version 1.6.0 (as an example).

Finally: Test the installation

You should now be able to run your script with python path/to/script.py.
For example, if you are in the DAPPER dir,

python docs/examples/basic_1.py

PS: If you closed the terminal (or shut down your computer), you'll first need to run conda activate dapper-env
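If you are unsure whether the active environment actually has the package, a quick check from Python can help. The sketch below uses only the standard library (`importlib.metadata`); the helper name `installed_version` is made up for this example, and it assumes the distribution is registered under the PyPI name `dapper`.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("dapper")
if version is None:
    print("dapper is not installed in this environment")
else:
    print(f"dapper {version} is installed")
```

If this reports that the package is missing, the most likely cause is that the conda environment has not been activated in the current terminal.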

DA methods

Method | Literature reproduced
------ | ---------------------
EnKF ¹ | Sakov08, Hoteit15, Grudzien2020
EnKF-N | Bocquet12, Bocquet15
EnKS, EnRTS | Raanes2016
iEnKS / iEnKF / EnRML / ES-MDA ² | Sakov12, Bocquet12, Bocquet14
LETKF, local & serial EAKF | Bocquet11
Sqrt. model noise methods | Raanes2014
Particle filter (bootstrap) ³ | Bocquet10
Optimal/implicit Particle filter ³ | Bocquet10
NETF | Tödter15, Wiljes16
Rank histogram filter (RHF) | Anderson10
4D-Var |
3D-Var |
Extended KF |
Optimal interpolation |
Climatology |

¹: Stochastic, DEnKF (i.e. half-update), ETKF (i.e. sym. sqrt.). Serial forms are also available.
Tuned with inflation and "random, orthogonal rotations".
²: Also supports the bundle version, and "EnKF-N"-type inflation.
³: Resampling: multinomial (including systematic/universal and residual).
The particle filter is tuned with "effective-N monitoring", "regularization/jittering" strength, and more.
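As an illustration of the particle-filter footnote, the following sketch shows textbook versions of two of the ingredients mentioned: the effective ensemble size used for monitoring, and systematic resampling. It is not taken from DAPPER's source; the function names and the resampling threshold of N/2 are assumptions.

```python
import random


def effective_n(weights):
    """Effective sample size, 1 / sum(w_i^2), for normalised weights."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng):
    """Return resampled particle indices using one shared random offset."""
    n = len(weights)
    u0 = rng.random() / n  # single draw in [0, 1/n)
    indices = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        u = u0 + k / n  # evenly spaced points in [0, 1)
        # Walk along the cumulative weights until the point is covered.
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


weights = [0.7, 0.1, 0.1, 0.1]
rng = random.Random(0)
n_eff = effective_n(weights)
print(f"effective N = {n_eff:.2f} out of {len(weights)}")
if n_eff < len(weights) / 2:  # assumed threshold, a common rule of thumb
    indices = systematic_resample(weights, rng)
    print("resampled indices:", indices)
```

With systematic resampling, a particle of weight w is copied either floor(N·w) or ceil(N·w) times, which gives lower sampling variance than drawing the N indices independently.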

For a list of ready-made experiments with suitable, tuned settings for a given method (e.g., the iEnKS), use:

grep -r "xp.*iEnKS" dapper/mods

Test cases (models)

Simple models facilitate the reliability, reproducibility, and interpretability of experiment results.

Model | Lin | TLM** | PDE? | Phys.dim. | State len | Lyap≥0 | Implementer
----- | --- | ----- | ---- | --------- | --------- | ------ | -----------
Id | Yes | Yes | No | N/A | * | 0 | Raanes
Linear Advect. (LA) | Yes | Yes | Yes | 1d | 1000 * | 51 | Evensen/Raanes
DoublePendulum | No | Yes | No | 0d | 4 | 2 | Matplotlib/Raanes
Ikeda | No | Yes | No | 0d | 2 | 1 | Raanes
LotkaVolterra | No | Yes | No | 0d | 5 * | 1 | Wikipedia/Raanes
Lorenz63 | No | Yes | "Yes" | 0d | 3 | 2 | Sakov
Lorenz84 | No | Yes | No | 0d | 3 | 2 | Raanes
Lorenz96 | No | Yes | No | 1d | 40 * | 13 | Raanes
Lorenz96s | No | Yes | No | 1d | 10 * | 4 | Grudzien
LorenzUV | No | Yes | No | 2x 1d | 256 + 8 * | ≈60 | Raanes
LorenzIII | No | No | No | 1d | 960 * | ≈164 | Raanes
Vissio-Lucarini 20 | No | Yes | No | 1d | 36 * | 10 | Yumeng
Kuramoto-Sivashinsky | No | Yes | Yes | 1d | 128 * | 11 | Kassam/Raanes
Quasi-Geost (QG) | No | No | Yes | 2d | 129²≈17k | ≈140 | Sakov

*: Flexible; set as necessary
**: Tangent Linear Model included?
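To give a feel for how small these test cases are, here is a self-contained sketch of the Lorenz-96 model from the table, with 40 state variables and a fourth-order Runge-Kutta step. It is written from the standard equations rather than copied from dapper/mods; the forcing value 8, the time step 0.05, and the function names are assumptions made for illustration.

```python
def lorenz96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, periodic in i."""
    n = len(x)
    # Negative indices wrap around in Python, giving the periodic boundary.
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one time step with classical Runge-Kutta."""
    def shifted(state, slope, factor):
        return [s + factor * k for s, k in zip(state, slope)]

    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(x, k1, 0.5 * dt), forcing)
    k3 = lorenz96_tendency(shifted(x, k2, 0.5 * dt), forcing)
    k4 = lorenz96_tendency(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# Start from the unstable fixed point x_i = F, with one small perturbation.
state = [8.0] * 40
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, dt=0.05)
print("first five state variables:", [round(v, 3) for v in state[:5]])
```

The perturbation grows because the model is chaotic, which is the property that makes it a useful stress test for DA methods.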

The models are found as subdirectories within dapper/mods. A model should be defined in a file named __init__.py, and illustrated by a file named demo.py. Most other files within a model subdirectory are usually named authorYEAR.py and define a HMM object, which holds the settings of a specific twin experiment, using that model, as detailed in the corresponding author/year's paper. A list of these files can be obtained using

find dapper/mods -iname '[a-z]*[0-9]*.py'

Some files contain settings used by several papers. Moreover, at the bottom of each such file should be (in comments) a list of suitable, tuned settings for various DA methods, along with their expected, average rmse.a score for that experiment. As mentioned above, DAPPER reproduces literature results. You will also find results that were not reproduced by DAPPER.
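For readers unfamiliar with the score quoted in those comments, the sketch below shows one way such a number could be computed. It assumes that rmse.a denotes the root-mean-square error of the analysis (the estimate after assimilating the observation), taken over the state variables at each time and then averaged in time. That reading, and the function names, are assumptions rather than a description of DAPPER's internals.

```python
import math


def rmse(estimate, truth):
    """Root-mean-square error over the state variables at a single time."""
    return math.sqrt(
        sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)
    )


def time_averaged_rmse(estimates, truths):
    """Average the per-time RMSE values over the whole experiment."""
    per_time = [rmse(e, t) for e, t in zip(estimates, truths)]
    return sum(per_time) / len(per_time)


# Two analysis times, two state variables (made-up numbers).
truths = [[0.0, 0.0], [1.0, 1.0]]
analyses = [[3.0, 4.0], [1.0, 1.0]]
score = time_averaged_rmse(analyses, truths)
print(f"time-averaged analysis RMSE: {score:.4f}")
```

Comparing such a score against the value listed in a model's settings file is a quick sanity check that an experiment has been configured as intended.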

Similar projects

DAPPER is aimed at research and teaching (see discussion up top). Examples of limitations:

It is not suited for very big models (>60k unknowns).
Non-uniform time sequences are not fully supported.

The scope of DAPPER is restricted because

[Figure: framework_to_language]

Moreover, even straying beyond basic configurability appears unrewarding when already building on a high-level language such as Python. Indeed, you may freely fork and modify the code of DAPPER, which should be seen as a set of templates, and not a framework.

Also, DAPPER comes with no guarantees/support. Therefore, if you have an operational or real-world application, such as WRF, you should look into one of the alternatives, sorted by approximate project size.

Name | Developers | Purpose (approximately)
---- | ---------- | -----------------------
DART | NCAR | General
PDAF | AWI | General
JEDI | JCSDA (NOAA, NASA, ++) | General
OpenDA | TU Delft | General
EMPIRE | Reading (Met) | General
ERT | Statoil | History matching (Petroleum DA)
PIPT | CIPR | History matching (Petroleum DA)
MIKE | DHI | Oceanographic
OAK | Liège | Oceanographic
Siroco | OMP | Oceanographic
Verdandi | INRIA | Biophysical DA
PyOSSE | Edinburgh, Reading | Earth-observation DA

Below is a list of projects with a purpose more similar to DAPPER's (research in DA, and not so much using DA):

Name | Developers | Notes
---- | ---------- | -----
DAPPER | Raanes, Chen, Grudzien | Python
SANGOMA | Conglomerate* | Fortran, Matlab
hIPPYlib | Villa, Petra, Ghattas | Python, adjoint-based PDE methods
FilterPy | R. Labbe | Python. Engineering oriented.
DASoftware | Yue Li, Stanford | Matlab. Large inverse probs.
Pomp | U of Michigan | R
EnKF-Matlab | Sakov | Matlab
EnKF-C | Sakov | C. Light-weight, off-line DA
pyda | Hickman | Python
PyDA | Shady-Ahmed | Python
DasPy | Xujun Han | Python
DataAssim.jl | Alexander-Barth | Julia
DataAssimilationBenchmarks.jl | Grudzien | Julia, Python
EnsembleKalmanProcesses.jl | Clim. Modl. Alliance | Julia, EKI (optim)
Datum | Raanes | Matlab
IEnKS code | Bocquet | Python

The EnKF-Matlab and IEnKS codes have been inspirational in the development of DAPPER.

*: AWI/Liege/CNRS/NERSC/Reading/Delft

Contributing

Issues and Pull requests

Do not hesitate to open an issue, whether to report a problem or ask a question. It may take some time for us to get back to you, since DAPPER is primarily a volunteer effort. Please start by perusing the documentation and searching the issue tracker for similar items.

Pull requests are very welcome. Examples: adding a new DA method, a dynamical model, or an experimental configuration that reproduces literature results, or improving the features and capabilities of DAPPER. Please keep in mind the intentional limitations and read the developer guidelines.

Contributors

Patrick N. Raanes, Yumeng Chen, Colin Grudzien, Maxime Tondeur, Remy Dubois

DAPPER is developed and maintained at NORCE (Norwegian Research Institute) and the Nansen Environmental and Remote Sensing Center (NERSC), in collaboration with the University of Reading, the UK National Centre for Earth Observation (NCEO), and the Center for Western Weather and Water Extremes (CW3E).

Publications

Owner

Name: Nansen Environmental and Remote Sensing Center
Login: nansencenter
Kind: organization
Email: post@nersc.no
Location: Bergen, Norway

Website: www.nersc.no
Twitter: nansensenteret
Repositories: 105
Profile: https://github.com/nansencenter

JOSS Publication

DAPPER: Data Assimilation with Python: a Package for Experimental Research

Published

February 29, 2024

DOI

10.21105/joss.05150

Volume 9, Issue 94, Page 5150

Authors

Patrick N. Raanes

NORCE, Bergen, Norway, NERSC, Bergen, Norway

Yumeng Chen

Department of Meteorology and NCEO, University of Reading, Reading, UK

Colin Grudzien

CW3E - Scripps Institution of Oceanography, University of California, San Diego, USA, Department of Mathematics and Statistics, University of Nevada, Reno, USA

Editor

Fei Tao

GitHub Events

Total

Issues event: 6
Watch event: 36
Delete event: 1
Issue comment event: 6
Push event: 94
Fork event: 5
Create event: 2

Last Year

Issues event: 6
Watch event: 36
Delete event: 1
Issue comment event: 6
Push event: 94
Fork event: 5
Create event: 2

Committers

Last synced: 7 months ago

All Time

Total Commits: 1,642
Total Committers: 7
Avg Commits per committer: 234.571
Development Distribution Score (DDS): 0.063

Past Year

Commits: 64
Committers: 1
Avg Commits per committer: 64.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
patricknraanes	p**s@g**m	1,538
Yumeng Chen	y**n@r**k	43
Colin Grudzien	c**z@m**g	40
Adrien Perrin	a**n@n**o	14
Feda Curic	f**c@g**m	4
Colin Grudzien	c**n@C**l	2
Sergey Alyaev	c**t@g**m	1
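The all-time committer statistics above can be reproduced from the commit counts in the table. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption here, though it does match the reported value.

```python
# Commit counts copied from the "Top Committers" table (one entry per row).
commit_counts = [1538, 43, 40, 14, 4, 2, 1]

total_commits = sum(commit_counts)
average_per_committer = total_commits / len(commit_counts)
# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1.0 - max(commit_counts) / total_commits

print(f"Total commits: {total_commits}")
print(f"Avg commits per committer: {average_per_committer:.3f}")
print(f"DDS: {dds:.3f}")
```

A score this close to zero indicates that almost all commits come from a single person. Under the same definition, the past-year figure of 0.0 follows directly from there being only one committer.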

Committer Domains (Top 20 + Academic)

nersc.no: 1, mailbox.org: 1, reading.ac.uk: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 74
Total pull requests: 33
Average time to close issues: 9 months
Average time to close pull requests: 3 days
Total issue authors: 21
Total pull request authors: 7
Average comments per issue: 1.84
Average comments per pull request: 2.91
Merged pull requests: 24
Bot issues: 0
Bot pull requests: 1

Past Year

Issues: 2
Pull requests: 0
Average time to close issues: 9 days
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 0
Average comments per issue: 0.5
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

patnr (43)
yumengch (9)
cgrudz (3)
Balinus (2)
atantet (1)
ajikmr (1)
TobiasGlaubach (1)
sgrzegorz (1)
Himscipy (1)
didiermonselesan (1)
Liu-Jincan (1)
xhu4 (1)
wispcarey (1)
palvors (1)
fengzhoushui (1)

Pull Request Authors

yumengch (12)
cgrudz (11)
dafeda (4)
patnr (3)
dependabot[bot] (1)
aperrin66 (1)
alin256 (1)

Top Labels

Issue Labels

enhancement (36), good first issue (5), bug (5), question (4)

Pull Request Labels

dependencies (1)

Packages

Total packages: 2
Total downloads:
- pypi 228 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 2
(may contain duplicates)
Total versions: 23
Total maintainers: 1

pypi.org: dapper

DAPPER benchmarks the performance of data assimilation (DA) methods.

Documentation: https://nansencenter.github.io/DAPPER/
License: MIT License
Latest release: 1.7.1
published 10 months ago

Versions: 8
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 171 Last month

Rankings

Stargazers count: 3.6%

Forks count: 4.4%

Dependent packages count: 10.1%

Average: 11.0%

Downloads: 15.3%

Dependent repos count: 21.6%

Maintainers (1)

patricknraanes

Last synced: 6 months ago

pypi.org: da-dapper

DAPPER benchmarks the performance of data assimilation (DA) methods.

Documentation: https://nansencenter.github.io/DAPPER/
License: MIT License
Latest release: 1.2.2
published over 4 years ago

Versions: 15
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 57 Last month

Rankings

Stargazers count: 3.6%

Forks count: 4.4%

Dependent packages count: 10.1%

Average: 13.8%

Dependent repos count: 21.6%

Downloads: 29.3%

Maintainers (1)

patricknraanes

Last synced: 6 months ago

Dependencies

.github/workflows/deploy-docs.yml actions

actions/checkout v3 composite
actions/deploy-pages v1 composite
actions/setup-python v4 composite
actions/upload-pages-artifact v1 composite

.github/workflows/tests.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite
conda-incubator/setup-miniconda v2 composite

pyproject.toml pypi

setup.py pypi
