https://github.com/coecms/hybrid_downscaling

Downscaling climate model using GPU-enabled machine learning

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: coecms
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 1.4 MB
Statistics
  • Stars: 4
  • Watchers: 7
  • Forks: 0
  • Open Issues: 2
  • Releases: 1
Created almost 4 years ago · Last pushed over 2 years ago
Metadata Files
  • Readme
  • License

README.md

Hybrid Downscaling

Downscaling climate model using GPU-enabled machine learning.

Updated: 07/11/22

Paper info:

This code is used for the following paper:

Sanaa Hobeichi, Nidhi Nishant, Yawen Shao, Gab Abramowitz, Andy Pitman, Steve Sherwood, Craig Bishop and Samuel Green. Using Machine Learning to Cut the Cost of Dynamical Downscaling. Accepted in Earth's Future, 2023

Code:

Two machine learning methods were used to approach this task. The first method is a multilayer perceptron (MLP) model built with PyTorch. The second method combines a multivariate linear regression model and a Random Forest (MLR_RF), both built with scikit-learn.

The serial versions of both methods work and give comparable results, but they are slow to run. The PyTorch method has received the most recent speed-up work; the scikit-learn method will be worked on in the future using RAPIDS.

Running MLP in parallel


The MLP model has been improved to run in parallel on 1 GPU (multiple GPUs are a work in progress). This allows 4 grids to run at the same time, giving a 4x increase in speed.

main_MLP_parallel.py is the main script for this and can be run as normal:

    python3 main_MLP_parallel.py

The main file calls other files config.py, functions.py, mlp_model.py, and train.py.

Python modules needed:

  • torch
  • pandas
  • numpy
  • operator
  • hydroeval
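Before launching a long run, it can help to confirm that the modules listed above are importable. The following is a minimal sketch and is not part of the repository; the helper name `missing_modules` is hypothetical. Note that `operator` ships with the Python standard library, so only the other four need installing.

```python
import importlib.util

# Modules listed in the README; "operator" is part of the standard library.
REQUIRED_MODULES = ["torch", "pandas", "numpy", "operator", "hydroeval"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```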

The script takes advantage of the torch.multiprocessing library to parallelise the loop over multiple grid cells.
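The pattern can be sketched as a pool of worker processes mapped over grid-cell indices. This is an illustration under stated assumptions, not the repository's code: `train_cell`, `run_serial`, `run_parallel` and the model file names are hypothetical placeholders. The sketch uses the standard-library `multiprocessing` module so that it runs without a GPU; `torch.multiprocessing` is a wrapper around that module with a compatible API, so the import can be swapped for `import torch.multiprocessing as mp`.

```python
import multiprocessing as mp

N_WORKERS = 4  # the README reports 4 grids running at the same time


def train_cell(cell_id):
    """Hypothetical stand-in for training and evaluating one grid cell."""
    # A real worker would load the cell's data, fit the MLP and save it.
    return cell_id, f"model_{cell_id:04d}.pt"


def run_serial(cell_ids):
    """Baseline: process the grid cells one after another."""
    return dict(train_cell(cell_id) for cell_id in cell_ids)


def run_parallel(cell_ids, n_workers=N_WORKERS):
    """Process the grid cells with a pool of worker processes."""
    # "spawn" starts fresh interpreters; CUDA cannot be used in forked children.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=n_workers) as pool:
        return dict(pool.map(train_cell, cell_ids))
```

Because the workers are spawned, `run_parallel(range(20))` should be called from under an `if __name__ == "__main__":` guard in the main script. It returns the same mapping as `run_serial(range(20))`, with up to four cells processed at once.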

The script reads data from

/g/data/w97/sho561/Downscale/BARRA/Training_Testing_new/

and creates new files in

/g/data/w97/sho561/Downscale/BARRA/Models_new/
/g/data/w97/sho561/Downscale/BARRA/Prediction_Evaluation_new/

Running MLP in serial


Main scripts:

  • main_MLP_serial.py: trains a multilayer perceptron model for every grid, using 20 grids only. It uses the GPU build of PyTorch and must be run on a GPU.

  • main_MLP_transfer.py: uses the saved model created in ‘main_MLP.py’ to predict on new data. Requires a GPU, since it uses the GPU build of PyTorch.

  • main_MLR2_RF.py: trains a multivariate linear regression model and a Random Forest (both scikit-learn) for every grid, using 20 grids. Runs on CPUs.

  • main_MLR2_RF_transfer.py: uses the saved models created in ‘main_MLR2_RF.py’ to predict on new data. Runs on CPUs.


Owner

  • Name: ARC COE for Climate Extremes: Computational Modelling Systems
  • Login: coecms
  • Kind: organization

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1