PyESD

Python package for empirical statistical downscaling. pyESD is under active development and all collaborators are welcome. The package downscales climate variables such as precipitation and temperature to the point scale, using predictors from reanalysis datasets (e.g., ERA5). pyESD supports many machine learning (ML) and artificial intelligence (AI) algorithms as the transfer function.

https://github.com/Dan-Boat/PyESD

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.6%) to scientific vocabulary

Keywords

deep-learning downscaling ensemble-machine-learning machine-learning precipitation sckit-learn tensorflow2
Last synced: 6 months ago · JSON representation

Repository


Basic Info
Statistics
  • Stars: 54
  • Watchers: 4
  • Forks: 12
  • Open Issues: 0
  • Releases: 2
Topics
deep-learning downscaling ensemble-machine-learning machine-learning precipitation sckit-learn tensorflow2
Created over 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog License Authors

README.md

Python Package for Empirical Statistical Downscaling (v1.01)

PyESD is an open-source framework for the Perfect Prognosis approach to statistical downscaling of any climate-related variable, such as precipitation, temperature, and wind speed, using reanalysis products (e.g., ERA5) as predictors. The package covers the full downscaling cycle: data preprocessing, predictor selection and construction (e.g., using transformers), model selection, training, validation and evaluation, and future prediction. It serves as a means of downscaling General Circulation Model (GCM) simulations of future climate to the high resolution relevant for climate impact assessments such as droughts, flooding, and wildfire risk. The main features of pyESD include:

  • An object-oriented design that treats weather stations as individual objects and the downscaling routines as their attributes. This keeps end-to-end downscaling of climate variables to a few lines of code.
  • Many machine learning algorithms and predictor selection techniques that can be combined and compared to select and design robust transfer functions, which can then be coupled with GCM output to generate future climate change estimates.
  • Many other functionalities that are highlighted in the paper describing the package (to be submitted).
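
To illustrate the object-oriented design described in the first point, the sketch below models a station as an object that carries its own downscaling steps. It is not pyESD code: the class `StationSketch`, its methods, and the one-predictor linear fit are hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StationSketch:
    """Hypothetical stand-in for a station object; not the pyESD class."""
    name: str
    lat: float
    lon: float
    slope: float = 0.0
    intercept: float = 0.0
    fitted: bool = False

    def fit(self, predictor, predictand):
        """Fit a one-predictor least-squares line as a toy transfer function."""
        x_bar, y_bar = mean(predictor), mean(predictand)
        sxx = sum((x - x_bar) ** 2 for x in predictor)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predictor, predictand))
        self.slope = sxy / sxx
        self.intercept = y_bar - self.slope * x_bar
        self.fitted = True
        return self

    def predict(self, predictor):
        """Apply the fitted line to new predictor values."""
        if not self.fitted:
            raise RuntimeError("call fit() before predict()")
        return [self.intercept + self.slope * x for x in predictor]


# Each station is its own object, so a multi-station run is a plain loop.
stations = [StationSketch("A", 48.5, 9.0), StationSketch("B", 5.6, -0.2)]
x_train = [0.0, 1.0, 2.0, 3.0]
for offset, station in enumerate(stations):
    y_train = [2.0 * x + offset for x in x_train]  # synthetic predictand
    station.fit(x_train, y_train)

predictions = {s.name: s.predict([4.0]) for s in stations}
print(predictions)
```

Keeping the fitted state on the station object is what lets the same few calls be repeated for every station without passing model parameters around.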

Documentation

The package documentation is accessible at https://dan-boat.github.io/PyESD/

The main components and the workflow of the package are summarised in the modeling outline:

Model outline

Installation

  1. Install the standard version from PyPI with `pip install pyESD`, or clone the repository with `git clone git@github.com:Dan-Boat/PyESD.git`, change into the cloned folder, and run `pip install .`
  2. Install in editable mode with `pip install -e pyESD`, or with `pip install -e .` from the base folder of the package cloned from GitHub. If the installation from PyPI does not succeed, some dependencies may have to be installed manually: cartopy, xarray, scikit-learn, scipy, and the other scientific packages such as NumPy, pandas, Matplotlib, and seaborn.
  3. Alternatively, to keep the installation isolated, create a separate virtual Python environment for the package with conda or virtualenv.
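
As a quick check after installation, the following sketch reports which of the dependencies named above cannot be found in the current environment. The helper `missing_modules` is a hypothetical name introduced here; it relies only on the standard library and does not import pyESD itself.

```python
from importlib.util import find_spec

# Import names (not PyPI names) of the dependencies listed above.
# Note that scikit-learn is imported as "sklearn".
REQUIRED = ["cartopy", "xarray", "sklearn", "scipy",
            "numpy", "pandas", "matplotlib", "seaborn"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```

Run it inside the environment where pyESD is installed, so that the result reflects that environment and not the system Python.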

Examples

The package has been used for downscaling precipitation and temperature for a catchment located in southwestern Germany. We have also used it to generate future rainfall products for all the synoptic weather stations in Ghana. Their respective control scripts are located in the examples folder. Generally, the control scripts follow the modeling workflow shown in: Downscaling steps. For instance, the downscaling framework shown below can be used to experiment with and select a robust predictor selection method and empirical transfer function for a specific location and predictand variable.

modeling framework

Workflow demonstration: To use the PP-ESD model to downscale climate model output, weather station and reanalysis datasets are required. The predictors are loaded as netCDF files and the predictand as a CSV file. Let us assume that the predictor variables are stored locally in the era5_datadir directory /home/daniel/ERA5/ and that the predictand variable (e.g., precipitation) is stored in station_dir. The predictor files should cover the same timestamps as the predictand variable of interest.

  1. Import all the required modules

```
import os

import pandas as pd

from pyESD.Weatherstation import read_station_csv
from pyESD.Weatherstation import read_weatherstationnames
from pyESD.standardizer import MonthlyStandardizer, StandardScaling
from pyESD.ESD_utils import store_pickle, store_csv
from pyESD.ESD_utils import Dataset
from pyESD.splitter import KFold
```

  2. Read the datasets

```
era5_datadir = "/home/daniel/ERA5/"  # local directory holding the ERA5 predictor files

ERA5Data = Dataset('ERA5', {
    't2m': os.path.join(era5_datadir, 't2m_monthly.nc'),
    'msl': os.path.join(era5_datadir, 'msl_monthly.nc'),
    'u10': os.path.join(era5_datadir, 'u10_monthly.nc'),
    'v10': os.path.join(era5_datadir, 'v10_monthly.nc'),
    'z250': os.path.join(era5_datadir, 'z250_monthly.nc'),
    # add an entry for every other predictor used below (e.g. 'tp')
})
```

  3. Define the potential predictors, the radius for predictor construction, and the time ranges for model training and evaluation

```
radius = 100  # km
predictors = ["t2m", "tp", "msl", "v10", "u10"]

# training and validation period
from1958to2010 = pd.date_range(start="1958-01-01", end="2010-12-31", freq="MS")

# testing period for the trained model
from2011to2020 = pd.date_range(start="2011-01-01", end="2020-12-31", freq="MS")
```
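
The radius defined above controls how predictors are constructed: grid cells close to the station are averaged into one regional mean. The sketch below shows one plausible way to do this with a great-circle (haversine) distance. It is an illustration only; the function names `haversine_km` and `regional_mean`, the toy grid, and the unweighted mean are assumptions, not pyESD internals.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def regional_mean(grid, station_lat, station_lon, radius_km):
    """Average the values of all grid cells within radius_km of the station.

    grid is an iterable of (lat, lon, value) tuples.
    """
    inside = [value for lat, lon, value in grid
              if haversine_km(station_lat, station_lon, lat, lon) <= radius_km]
    if not inside:
        raise ValueError("no grid cell falls inside the radius")
    return sum(inside) / len(inside)


# Toy grid: two cells near the station, one several hundred km away.
grid = [(48.5, 9.0, 10.0), (48.75, 9.25, 12.0), (52.0, 13.0, 30.0)]
mean_100km = regional_mean(grid, 48.5, 9.0, radius_km=100)
print(mean_100km)
```

Only the two nearby cells contribute to the mean here; a larger radius would pull in the distant cell and smooth the predictor over a wider region.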

  4. Read the weather station as an object and apply the downscaling cycle attributes. Note that the first time the model is run for a specific location, the regional means are extracted using the defined radius and the location of the station. The extracted means are stored as pickle files in the directory called predictor_dir (passed as predictordir in the code below).

```
import matplotlib.pyplot as plt

variable = "Precipitation"
SO = read_station_csv(filename=station_dir, varname=variable)

# USING ERA5 DATA
# ================

# setting predictors
SO.set_predictors(variable, predictors, predictordir, radius)

# setting standardizer
SO.set_standardizer(variable, standardizer=MonthlyStandardizer(detrending=False, scaling=False))

scoring = ["neg_root_mean_squared_error", "r2", "neg_mean_absolute_error"]

# setting model
regressor = "RandomForest"
SO.set_model(variable, method=regressor, daterange=from1958to2010,
             predictor_dataset=ERA5Data, cv=KFold(n_splits=10), scoring=scoring)

# MODEL TRAINING (1958-2010)
# ==========================
SO.fit(variable, from1958to2010, ERA5Data, fit_predictors=True, predictor_selector=True,
       selector_method="Recursive", selector_regressor="ARD", cal_relative_importance=False)

# cross-validation scores and predictions for the training period
score_1958to2010, ypred_1958to2010 = SO.cross_validate_and_predict(variable, from1958to2010, ERA5Data)

# evaluation on the independent testing period
score_2011to2020 = SO.evaluate(variable, from2011to2020, ERA5Data)

# predictions for both periods
ypred_1958to2010 = SO.predict(variable, from1958to2010, ERA5Data)
ypred_2011to2020 = SO.predict(variable, from2011to2020, ERA5Data)

# plot the predictions for the testing period
ypred_2011to2020.plot()
plt.show()
```
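
The scoring names passed to the model above appear to match scikit-learn scorer names, in which error metrics are negated so that a larger score is always better. As a reference for reading such scores, the sketch below computes RMSE, MAE, and R² by hand on made-up numbers. The helper names are hypothetical and the code does not call pyESD or scikit-learn.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observations and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

scores = {
    "neg_root_mean_squared_error": -rmse(y_true, y_pred),  # negated error
    "neg_mean_absolute_error": -mae(y_true, y_pred),       # negated error
    "r2": r2(y_true, y_pred),
}
print(scores)
```

With this sign convention, a score closer to zero is better for the two negated error metrics, and a score closer to one is better for R².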

Package testing

The package is tested using the unittest framework with synthetically generated data. The testing scripts are located in the test folder. Running the scripts with the -v flag (higher verbosity) validates a modified version of the package.
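
The pattern of testing against synthetic data can be sketched as follows. This is not one of the scripts in the test folder: the helpers `make_synthetic_series` and `fit_line` and the test case are hypothetical, and a simple least-squares fit stands in for the package's models. The idea is that data generated from known parameters lets a test check whether those parameters are recovered.

```python
import random
import unittest


def make_synthetic_series(n, slope=2.0, intercept=1.0, noise=0.1, seed=0):
    """Generate a noisy linear predictor/predictand pair for testing."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, noise) for xi in x]
    return x, y


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


class TestSyntheticFit(unittest.TestCase):
    def test_recovers_known_parameters(self):
        x, y = make_synthetic_series(500)
        slope, intercept = fit_line(x, y)
        self.assertAlmostEqual(slope, 2.0, delta=0.1)
        self.assertAlmostEqual(intercept, 1.0, delta=0.1)


if __name__ == "__main__":
    # verbosity=2 gives the same per-test output as the -v command-line flag
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSyntheticFit)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

Fixing the random seed keeps the synthetic data identical between runs, so a failing test points to a change in the code and not to chance.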

Publications

The package description and application paper is currently under review in GMD (Boateng & Mutz 2023). Its application to weather stations in Ghana was presented at AGU22, and the corresponding paper is in preparation.

Citation: archived on Zenodo: https://doi.org/10.5281/zenodo.7748769

Collaborators are welcome, in terms of model application, model improvement, documentation, and expansion of the package!

Daniel Boateng (LinkedIn), University of Tuebingen, dannboateng@gmail.com

Owner

  • Name: Daniel Boateng
  • Login: Dan-Boat
  • Kind: user
  • Location: Tübingen, Germany
  • Company: University of Tübingen

Research Fellow | PhD Student Palaeoclimate Modelling and Climate Dynamics

GitHub Events

Total
  • Issues event: 2
  • Watch event: 8
  • Issue comment event: 1
  • Push event: 1
  • Fork event: 1
Last Year
  • Issues event: 2
  • Watch event: 8
  • Issue comment event: 1
  • Push event: 1
  • Fork event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 168
  • Total Committers: 1
  • Avg Commits per committer: 168.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 10
  • Committers: 1
  • Avg Commits per committer: 10.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Dan-Boat d****g@g****m 168

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 3
  • Total pull requests: 0
  • Average time to close issues: about 2 months
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 2.67
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: 15 days
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • gkb999 (2)
  • msw09090 (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 20 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 7
  • Total maintainers: 1
proxy.golang.org: github.com/dan-boat/pyesd
  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
proxy.golang.org: github.com/Dan-Boat/PyESD
  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
pypi.org: pyesd


  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 20 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 18.6%
Average: 21.6%
Forks count: 30.5%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 6 months ago

Dependencies

setup.py pypi
  • Cartopy *
  • eofs *
  • keras *
  • mpi4py *
  • numpy *
  • pandas *
  • scipy *
  • seaborn *
  • sklearn *
  • statsmodels *
  • tensorflow *
  • xarray *
requirements.txt pypi
  • Keras-Preprocessing >=1.1.2
  • cftime >=1.6.0
  • eofs >=1.4.0
  • geopandas ==0.12.1
  • keras >=2.8.0
  • numpy >=1.19.5
  • pandas >=1.3.4
  • scikit-image >=0.18.3
  • scikit-learn >=1.1.1
  • scikit-learn-intelex >=2021.20210714.120553
  • scikit-optimize >=0.9.0
  • scipy >=1.7.1
  • seaborn >=0.11.2
  • setuptools >=67.6.0
  • tensorflow >=2.8.0
  • twine >=4.0.2
  • xarray >=2022.3.0
  • xgboost >=1.5.2
requirements_dev.txt pypi
  • Sphinx * development
  • Sphinx-rtd-theme * development
  • bump2version * development
  • coverage * development
  • flake8 * development
  • grip * development
  • h5netcdf * development
  • importlib-metadata >=4.4 development
  • matplotlib * development
  • netCDF4 * development
  • packaging >=21.3 development
  • pip * development
  • pydocstyle * development
  • pytest * development
  • pytest-cov * development
  • pytest-runner * development
  • setuptools * development
  • tox * development
  • twine * development
  • watchdog * development
  • wheel * development
environment.yml conda
  • _ipyw_jlab_nb_ext_conf ==0.1.0
  • alabaster ==0.7.12
  • anaconda-client ==1.9.0
  • anaconda-project ==0.10.1
  • anyio ==2.2.0
  • appdirs ==1.4.4
  • argh ==0.26.2
  • argon2-cffi ==20.1.0
  • arrow ==0.13.1
  • asn1crypto ==1.4.0
  • astroid ==2.6.6
  • astropy ==4.3.1
  • async_generator ==1.10
  • atomicwrites ==1.4.0
  • attrs ==21.2.0
  • autopep8 ==1.5.7
  • babel ==2.9.1
  • backcall ==0.2.0
  • backports ==1.0
  • backports.functools_lru_cache ==1.6.4
  • backports.shutil_get_terminal_size ==1.0.0
  • backports.tempfile ==1.0
  • backports.weakref ==1.0.post1
  • bcrypt ==3.2.0
  • beautifulsoup4 ==4.10.0
  • binaryornot ==0.4.4
  • bitarray ==2.3.0
  • bkcharts ==0.2
  • black ==19.10b0
  • blas ==1.0
  • bleach ==4.0.0
  • blosc ==1.21.0
  • bokeh ==2.4.1
  • boto ==2.49.0
  • bottleneck ==1.3.2
  • brotli ==1.0.9
  • brotlipy ==0.7.0
  • bzip2 ==1.0.8
  • ca-certificates
  • cached-property ==1.5.2
  • cartopy
  • certifi
  • cffi ==1.14.6
  • cfitsio ==3.470
  • chardet ==4.0.0
  • charls ==2.2.0
  • charset-normalizer ==2.0.4
  • click ==8.0.3
  • cloudpickle ==2.0.0
  • clyent ==1.2.2
  • colorama ==0.4.4
  • comtypes ==1.1.10
  • conda-content-trust ==0.1.1
  • conda-pack ==0.6.0
  • conda-package-handling ==1.7.3
  • conda-repo-cli ==1.0.4
  • conda-verify ==3.4.2
  • contextlib2 ==0.6.0.post1
  • cookiecutter ==1.7.2
  • cryptography ==3.4.8
  • curl ==7.78.0
  • cycler ==0.10.0
  • cython ==0.29.24
  • cytoolz ==0.11.0
  • daal4py ==2021.3.0
  • dal ==2021.3.0
  • dask ==2021.10.0
  • dask-core ==2021.10.0
  • dataclasses ==0.8
  • debugpy ==1.4.1
  • decorator ==5.1.0
  • defusedxml ==0.7.1
  • diff-match-patch ==20200713
  • distributed ==2021.10.0
  • docutils ==0.17.1
  • entrypoints ==0.3
  • et_xmlfile ==1.1.0
  • fastcache ==1.1.0
  • filelock ==3.3.1
  • flake8 ==3.9.2
  • flask ==1.1.2
  • fonttools ==4.25.0
  • freetype ==2.10.4
  • fsspec ==2021.10.1
  • future ==0.18.2
  • get_terminal_size ==1.0.0
  • gevent ==21.8.0
  • giflib ==5.2.1
  • glob2 ==0.7
  • greenlet ==1.1.1
  • h5py ==3.2.1
  • hdf5 ==1.10.6
  • heapdict ==1.0.1
  • html5lib ==1.1
  • icc_rt ==2019.0.0
  • icu ==58.2
  • idna ==3.2
  • imagecodecs ==2021.8.26
  • imageio ==2.9.0
  • imagesize ==1.2.0
  • importlib-metadata ==4.8.1
  • importlib_metadata ==4.8.1
  • inflection ==0.5.1
  • iniconfig ==1.1.1
  • intel-openmp ==2021.4.0
  • intervaltree ==3.1.0
  • ipykernel ==6.4.1
  • ipython ==7.29.0
  • ipython_genutils ==0.2.0
  • ipywidgets ==7.6.5
  • isort ==5.9.3
  • itsdangerous ==2.0.1
  • jdcal ==1.4.1
  • jedi ==0.18.0
  • jinja2 ==2.11.3
  • jinja2-time ==0.2.0
  • joblib ==1.1.0
  • jpeg ==9d
  • json5 ==0.9.6
  • jsonschema ==3.2.0
  • jupyter ==1.0.0
  • jupyter_client ==6.1.12
  • jupyter_console ==6.4.0
  • jupyter_core ==4.8.1
  • jupyter_server ==1.4.1
  • jupyterlab ==3.2.1
  • jupyterlab_pygments ==0.1.2
  • jupyterlab_server ==2.8.2
  • jupyterlab_widgets ==1.0.0
  • keyring ==23.1.0
  • kiwisolver ==1.3.1
  • krb5 ==1.19.2
  • lazy-object-proxy ==1.6.0
  • lcms2 ==2.12
  • lerc ==3.0
  • libaec ==1.0.4
  • libarchive ==3.4.2
  • libcurl ==7.78.0
  • libdeflate ==1.8
  • libiconv ==1.15
  • liblief ==0.10.1
  • libpng ==1.6.37
  • libspatialindex ==1.9.3
  • libssh2 ==1.9.0
  • libtiff ==4.2.0
  • libwebp ==1.2.0
  • libxml2 ==2.9.12
  • libxslt ==1.1.34
  • libzopfli ==1.0.3
  • llvmlite ==0.37.0
  • locket ==0.2.1
  • lxml ==4.6.3
  • lz4-c ==1.9.3
  • lzo ==2.10
  • m2w64-gcc-libgfortran ==5.3.0
  • m2w64-gcc-libs ==5.3.0
  • m2w64-gcc-libs-core ==5.3.0
  • m2w64-gmp ==6.1.0
  • m2w64-libwinpthread-git ==5.0.0.4634.697f757
  • m2w64-toolchain
  • markupsafe ==1.1.1
  • matplotlib ==3.4.3
  • matplotlib-base ==3.4.3
  • matplotlib-inline ==0.1.2
  • mccabe ==0.6.1
  • menuinst ==1.4.18
  • mistune ==0.8.4
  • mkl ==2021.4.0
  • mkl-service ==2.4.0
  • mkl_fft ==1.3.1
  • mkl_random ==1.2.2
  • mock ==4.0.3
  • more-itertools ==8.10.0
  • mpmath ==1.2.1
  • msgpack-python ==1.0.2
  • msys2-conda-epoch ==20160418
  • multipledispatch ==0.6.0
  • munkres ==1.1.4
  • mypy_extensions ==0.4.3
  • navigator-updater ==0.2.1
  • nbclassic ==0.2.6
  • nbclient ==0.5.3
  • nbconvert ==6.1.0
  • nbformat ==5.1.3
  • nest-asyncio ==1.5.1
  • networkx ==2.6.3
  • nltk ==3.6.5
  • nose ==1.3.7
  • notebook ==6.4.5
  • numba ==0.54.1
  • numexpr ==2.7.3
  • numpy ==1.20.3
  • numpydoc
  • olefile ==0.46
  • openjpeg ==2.4.0
  • openpyxl ==3.0.9
  • openssl
  • packaging ==21.0
  • pandas ==1.3.4
  • pandocfilters ==1.4.3
  • paramiko ==2.7.2
  • parso ==0.8.2
  • partd ==1.2.0
  • path ==16.0.0
  • path.py ==12.5.0
  • pathlib2 ==2.3.6
  • pathspec ==0.7.0
  • patsy ==0.5.2
  • pep8 ==1.7.1
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • pillow ==8.4.0
  • pip ==21.2.4
  • pkginfo ==1.7.1
  • pluggy ==0.13.1
  • ply ==3.11
  • poyo ==0.5.0
  • prometheus_client ==0.11.0
  • prompt-toolkit ==3.0.20
  • prompt_toolkit ==3.0.20
  • psutil ==5.8.0
  • ptyprocess ==0.7.0
  • py ==1.10.0
  • py-lief ==0.10.1
  • pycodestyle ==2.7.0
  • pycosat ==0.6.3
  • pycparser ==2.20
  • pycurl ==7.44.1
  • pydocstyle ==6.1.1
  • pyerfa ==2.0.0
  • pyflakes ==2.3.1
  • pygments ==2.10.0
  • pyjwt ==2.1.0
  • pylint ==2.9.6
  • pyls-spyder ==0.4.0
  • pynacl ==1.4.0
  • pyodbc ==4.0.31
  • pyopenssl ==21.0.0
  • pyparsing ==3.0.4
  • pyqt ==5.9.2
  • pyreadline ==2.1
  • pyrsistent ==0.18.0
  • pysocks ==1.7.1
  • pytables ==3.6.1
  • pytest ==6.2.4
  • python ==3.9.7
  • python-dateutil ==2.8.2
  • python-libarchive-c ==2.9
  • python-lsp-black ==1.0.0
  • python-lsp-jsonrpc ==1.0.0
  • python-lsp-server ==1.2.4
  • python-slugify ==5.0.2
  • pytz ==2021.3
  • pywavelets ==1.1.1
  • pywin32 ==228
  • pywin32-ctypes ==0.2.0
  • pywinpty ==0.5.7
  • pyyaml ==6.0
  • pyzmq ==22.2.1
  • qdarkstyle ==3.0.2
  • qstylizer ==0.1.10
  • qt ==5.9.7
  • qtawesome ==1.0.2
  • qtconsole ==5.1.1
  • qtpy ==1.10.0
  • regex ==2021.8.3
  • requests ==2.26.0
  • rope ==0.19.0
  • rtree ==0.9.7
  • ruamel_yaml ==0.15.100
  • scikit-image ==0.18.3
  • scikit-learn ==0.24.2
  • scikit-learn-intelex
  • scipy ==1.7.1
  • seaborn ==0.11.2
  • send2trash ==1.8.0
  • setuptools ==58.0.4
  • simplegeneric ==0.8.1
  • singledispatch ==3.7.0
  • sip ==4.19.13
  • six ==1.16.0
  • snappy ==1.1.8
  • sniffio ==1.2.0
  • snowballstemmer ==2.1.0
  • sortedcollections ==2.1.0
  • sortedcontainers ==2.4.0
  • soupsieve ==2.2.1
  • sphinx ==4.2.0
  • sphinxcontrib ==1.0
  • sphinxcontrib-applehelp ==1.0.2
  • sphinxcontrib-devhelp ==1.0.2
  • sphinxcontrib-htmlhelp ==2.0.0
  • sphinxcontrib-jsmath ==1.0.1
  • sphinxcontrib-qthelp ==1.0.3
  • sphinxcontrib-serializinghtml ==1.1.5
  • sphinxcontrib-websupport ==1.2.4
  • spyder
  • spyder-kernels ==2.1.3
  • sqlalchemy ==1.4.22
  • sqlite ==3.36.0
  • statsmodels ==0.12.2
  • sympy ==1.9
  • tbb ==2021.4.0
  • tbb4py ==2021.4.0
  • tblib ==1.7.0
  • terminado ==0.9.4
  • testpath ==0.5.0
  • text-unidecode ==1.3
  • textdistance ==4.2.1
  • threadpoolctl ==2.2.0
  • three-merge ==0.1.1
  • tifffile ==2021.7.2
  • tinycss ==0.4
  • tk ==8.6.11
  • toml ==0.10.2
  • toolz ==0.11.1
  • tornado ==6.1
  • tqdm ==4.62.3
  • traitlets ==5.1.0
  • typed-ast ==1.4.3
  • typing_extensions ==3.10.0.2
  • tzdata ==2021e
  • ujson ==4.0.2
  • unicodecsv ==0.14.1
  • unidecode ==1.2.0
  • urllib3 ==1.26.7
  • vc ==14.2
  • vs2015_runtime ==14.27.29016
  • watchdog ==2.1.3
  • wcwidth ==0.2.5
  • webencodings ==0.5.1
  • werkzeug ==2.0.2
  • wheel ==0.37.0
  • whichcraft ==0.6.1
  • widgetsnbextension ==3.5.1
  • win_inet_pton ==1.1.0
  • win_unicode_console ==0.5
  • wincertstore ==0.2
  • winpty ==0.4.3
  • wrapt ==1.12.1
  • xlrd ==2.0.1
  • xlsxwriter ==3.0.1
  • xlwings ==0.24.9
  • xlwt ==1.3.0
  • xmltodict ==0.12.0
  • xz ==5.2.5
  • yaml ==0.2.5
  • yapf ==0.31.0
  • zfp ==0.5.5
  • zict ==2.0.0
  • zipp ==3.6.0
  • zlib ==1.2.11
  • zope ==1.0
  • zope.event ==4.5.0
  • zope.interface ==5.4.0
  • zstd ==1.4.9