https://github.com/chrisschuerz/lstm_for_pub
Code for our WRR paper "Toward Improved Predictions in Ungauged Basins: Exploiting the Power of Machine Learning"
Repository
Basic Info
- Host: GitHub
- Owner: chrisschuerz
- License: apache-2.0
- Default Branch: master
- Homepage: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR026065
- Size: 24.7 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of kratzert/lstm_for_pub
Created over 3 years ago
· Last pushed over 5 years ago
# Long Short-Term Memory networks for Prediction in Ungauged Basins:
Accompanying code for our WRR submission ["Toward Improved Predictions in Ungauged Basins: Exploiting the Power of Machine Learning"](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR026065)
# Steps to Recreate Results from Paper:
1) Get CAMELS data from [https://ral.ucar.edu/solutions/products/camels](https://ral.ucar.edu/solutions/products/camels). The filepath must be: './data/basin_dataset_public_v1p2' and must include the CAMELS attributes as a subdirectory: './data/basin_dataset_public_v1p2/camels_attributes_v2.0'.
2) Download the updated NLDAS forcings from [HydroShare](https://www.hydroshare.org/resource/0a68bfd7ddf642a8be9041d60f40868c/). These include daily minimum and maximum temperature, whereas the CAMELS NLDAS forcings contain only daily mean temperature.
3) Run the training scripts: 'train_global.sh' or 'train_pub.sh'. Options for the global script are: (i) the model type: 'lstm' or 'ealstm' (ref HESSD paper here), and (ii) the option to use catchment attributes as static input features: 'static' or 'no_static'. Options for the PUB training script are just the model type, since PUB requires catchment attributes.
These bash scripts assume that you have a certain number of GPUs available for training. If no GPUs are available, the 'gpu=' arguments in the runtime lines (e.g., 'python3 main.py ...') must be changed to 'gpu=-1'. The number of GPUs available on the current machine goes in line 10 (global) / 16 (PUB) and the index for the last GPU goes in line 40 (global) / 36 (PUB).
These scripts are set up to run 10 random restarts of each type of experiment. The PUB experiments use k-fold (cross-site) validation with k=12 splits. These parameters can be changed in the bash training scripts.
Runtime progress can be monitored in the 'reports' subdirectory. Each experiment type (e.g., 'global_lstm_static') will create a separate runtime file for each restart and each k-fold split, numbered appropriately. Tail these to see real-time training progress.
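To illustrate what the k-fold (cross-site) validation described above means, here is a minimal sketch in plain Python. It is not the splitting code used by the training scripts: the function names, the seed handling, and the placeholder basin IDs are all assumptions made for illustration. The idea is that basins, not time steps, are partitioned into k=12 disjoint folds, and each fold is held out once while the model is trained on the remaining basins.

```python
import random


def make_folds(basin_ids, k=12, seed=0):
    """Shuffle basin IDs and partition them into k disjoint folds (hypothetical helper)."""
    basins = list(basin_ids)
    random.Random(seed).shuffle(basins)
    # Slicing with step k spreads basins so fold sizes differ by at most one.
    return [basins[i::k] for i in range(k)]


def train_test_splits(basin_ids, k=12, seed=0):
    """Yield (train_basins, test_basins) once for each of the k folds."""
    folds = make_folds(basin_ids, k=k, seed=seed)
    for i, test_basins in enumerate(folds):
        # Training basins are all basins that are not in the held-out fold.
        train_basins = [b for j, fold in enumerate(folds) if j != i for b in fold]
        yield train_basins, test_basins


if __name__ == "__main__":
    # Placeholder IDs for illustration only; real runs would use CAMELS basin IDs.
    example_basins = [f"basin_{n:03d}" for n in range(30)]
    for fold, (train, test) in enumerate(train_test_splits(example_basins)):
        print(f"fold {fold}: {len(train)} train basins, {len(test)} held-out basins")
```

Because every basin is held out exactly once, the test predictions of all 12 splits together cover the full basin set, with each basin treated as ungauged during its own evaluation.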
4) Run the test scripts: 'run_global.py' or 'run_pub.py'. Options for these include (i) the experiment name and (ii) the GPU index that you want to run on (use -1 to run on the CPU). The experiment name is the file name (less any numeric identifiers) of the training report file. Outputs from these runs are stored in CSV (human-readable) files in the './analysis/resutls/' subdirectory.
5) Run the 'extract_benchmarks.py' script to prepare the benchmark data for statistical analysis. Results will be stored in CSV (human-readable) files in the './analysis/results_data/' subdirectory.
6) In the 'analysis' subdirectory, run the 'main_performance_ensemble_only.py' or 'main_performance.py' scripts to get ensemble performance statistics or basin performance statistics, respectively. These statistics are stored in the './analysis/stats' subdirectory.
7) Run the MATLAB script 'main_plots.m' in the 'analysis' subdirectory to make plots like those in the paper. Figures are stored in the './analysis/figures/' subdirectory.
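Before starting the training runs, it can help to confirm that the data layout from step 1 is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_dirs` is hypothetical, and only the two directory paths stated in step 1 are checked. It assumes it is run from the repository root.

```python
from pathlib import Path

# Directory paths as stated in step 1, relative to the repository root.
CAMELS_ROOT = Path("data/basin_dataset_public_v1p2")
ATTRIBUTES_DIR = CAMELS_ROOT / "camels_attributes_v2.0"


def missing_data_dirs(repo_root="."):
    """Return the required data directories that do not exist under repo_root.

    Hypothetical helper for illustration; it checks directory existence only,
    not the contents of the directories.
    """
    root = Path(repo_root)
    required = [CAMELS_ROOT, ATTRIBUTES_DIR]
    return [str(p) for p in required if not (root / p).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing data directories:")
        for path in missing:
            print(f"  {path}")
    else:
        print("CAMELS data layout looks complete.")
```

An empty return value only means the two directories exist; it does not verify that the CAMELS files or the updated NLDAS forcings from step 2 are present.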
## Contact
Frederik Kratzert: kratzert@ml.jku.at
## Citation
If you use any of this code in your experiments, please make sure to cite the following publication:
**Note**: At this point, the paper is accepted but only available online as a preview, and no further information about the volume and pages is available. Check the [WRR Homepage](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR026065) for an update of the citation.
```
@article{kratzert2019pub,
author = {Kratzert, Frederik and Klotz, Daniel and Herrnegger, Mathew and Sampson, Alden K. and Hochreiter, Sepp and Nearing, Grey S.},
title = {Toward Improved Predictions in Ungauged Basins: Exploiting the Power of Machine Learning},
journal = {Water Resources Research},
volume = {n/a},
number = {n/a},
pages = {},
doi = {10.1029/2019WR026065}
}
```
## License of our code
[Apache License 2.0](https://github.com/kratzert/ealstm_regional_modeling/blob/master/LICENSE)
Owner
- Name: Christoph Schuerz
- Login: chrisschuerz
- Kind: user
- Location: Berlin, Germany
- Company: UFZ Leipzig, Germany
- Repositories: 2
- Profile: https://github.com/chrisschuerz