ml_drought

Machine learning to better predict and understand drought. Moving to github.com/ml-clim

https://github.com/ecmwfcode4earth/ml_drought

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.8%) to scientific vocabulary

Keywords

2019 copernicus machine-learning
Last synced: 9 months ago

Repository

Machine learning to better predict and understand drought. Moving to github.com/ml-clim

Basic Info
Statistics
  • Stars: 93
  • Watchers: 7
  • Forks: 19
  • Open Issues: 42
  • Releases: 0
Topics
2019 copernicus machine-learning
Created about 7 years ago · Last pushed about 4 years ago
Metadata Files
Readme

README.md

A Machine Learning Pipeline for Climate Science

This repository is an end-to-end pipeline for the creation, intercomparison and evaluation of machine learning methods in climate science.

The pipeline carries out a number of tasks to create a unified-data format for training and testing machine learning methods.

These tasks are split into the different classes defined in the src folder and explained further below.

NOTE: some basic working knowledge of Python is required to use this pipeline, although the requirement is not onerous.

Using the Pipeline

There are three entrypoints to the pipeline:

* run.py
* notebooks
* scripts

A blog post describing the goals and design of the pipeline can be found here.

View the initial presentation of our pipeline here.

Setup

Anaconda running Python 3.7 is used as the package manager. To get set up with an environment, install Anaconda and (from this directory) run:

```bash
conda env create -f environment.yml
```

This will create an environment named esowc-drought with all the necessary packages to run the code. To activate this environment, run:

```bash
conda activate esowc-drought
```

Docker can also be used to run this code. To do this, first either start the Docker app (Docker Desktop) or configure docker-machine:

```bash
# on macOS
brew install docker-machine docker

docker-machine create --driver virtualbox default
docker-machine env default
```

See here for help on all machines or here for macOS.

Then build the docker image:

```bash
docker build -t ml_drought .
```

Then, use it to run a container, mounting the data folder to the container:

```bash
docker run -it \
    --mount type=bind,source=<PATH_TO_DATA>,target=/ml_drought/data \
    ml_drought /bin/bash
```
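If the container is launched from a script, the same command can be assembled programmatically. The following is a minimal sketch, not part of the repository: `docker_run_command` is a hypothetical helper name, and the path is resolved to an absolute one on the assumption that Docker bind mounts expect an absolute source path.

```python
import shlex
from pathlib import Path
from typing import List


def docker_run_command(data_dir: str, image: str = "ml_drought") -> List[str]:
    """Build the `docker run` argument list from the README for a given data folder.

    Hypothetical helper: the image name and mount target are taken from the
    README; everything else is an illustrative assumption.
    """
    # Resolve to an absolute path before passing it as the bind-mount source.
    source = Path(data_dir).expanduser().resolve()
    return [
        "docker", "run", "-it",
        "--mount", f"type=bind,source={source},target=/ml_drought/data",
        image, "/bin/bash",
    ]


# Print a copy-pasteable shell command for a local ./data folder.
print(shlex.join(docker_run_command("./data")))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems when the data path contains spaces.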

You will also need to create a .cdsapirc file with the following information:

```bash
url: https://cds.climate.copernicus.eu/api/v2
key: <INSERT KEY HERE>
verify: 1
```
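The .cdsapirc file described above can also be generated from Python, which is convenient inside a container or a notebook. This is a minimal sketch under stated assumptions: `write_cdsapirc` is a hypothetical helper, not a function from this repository, and the default location (the user's home directory) is assumed to be where the CDS API client looks for the file.

```python
from pathlib import Path


def write_cdsapirc(key: str, path: Path = Path.home() / ".cdsapirc") -> Path:
    """Write the three fields listed in the README to a .cdsapirc file.

    Hypothetical helper; `key` is the user's own CDS API key and the default
    path in the home directory is an assumption.
    """
    lines = [
        "url: https://cds.climate.copernicus.eu/api/v2",
        f"key: {key}",
        "verify: 1",
    ]
    path.write_text("\n".join(lines) + "\n")
    return path
```

Calling `write_cdsapirc("<INSERT KEY HERE>")` with a real key in place of the placeholder writes the file to the default location; pass an explicit `path` to write it elsewhere.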

Testing

This pipeline can be tested by running pytest. flake8 is used for linting.

We use mypy for type checking. This can be run by running mypy src (this runs mypy on the src directory).

We use black for code formatting.
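The four tools mentioned in this section can be run in one pass before opening a pull request. The sketch below is illustrative and not part of the repository: `CHECKS` and `run_checks` are hypothetical names, and invoking black as `black --check .` (report only, no rewriting) is an assumption about how formatting would be verified.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

# Commands taken from the README, except the black invocation, which is assumed.
CHECKS: List[List[str]] = [
    ["pytest"],
    ["flake8"],
    ["mypy", "src"],
    ["black", "--check", "."],
]


def run_checks(run: Callable[[Sequence[str]], int] = subprocess.call) -> Dict[str, int]:
    """Run every check and return a mapping from command string to exit code."""
    return {" ".join(cmd): run(cmd) for cmd in CHECKS}
```

Run from the repository root with the esowc-drought environment active, `run_checks()` returns a dictionary in which any non-zero value marks a failing check. The `run` parameter exists so the runner can be swapped out, for example in a dry run.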

Team: @tommylees112, @gabrieltseng

For updates, follow @tommylees112 on Twitter or look out for our blog posts!

Acknowledgements

This project was completed as part of the ECMWF Summer of Weather Code Challenge #12. The challenge was set up to use ECMWF/Copernicus open datasets to evaluate machine learning techniques for the prediction of droughts.

Huge thanks to @ECMWF for making this project possible!

Owner

  • Name: ECMWF Code for Earth
  • Login: ECMWFCode4Earth
  • Kind: organization
  • Location: Online

ECMWF Code for Earth is a collaborative programme where each summer several developer teams work on innovative earth sciences-related software.

GitHub Events

Total
  • Watch event: 1
  • Fork event: 1
Last Year
  • Watch event: 1
  • Fork event: 1

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 248
  • Total Committers: 3
  • Avg Commits per committer: 82.667
  • Development Distribution Score (DDS): 0.46
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
tommylees112 t****2@g****m 134
Gabriel Tseng g****g@m****a 113
Julia Wagemann w****a@g****e 1
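The all-time committer figures can be reproduced from the commit counts in the table above. This sketch assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it matches the reported value.

```python
# Commit counts copied from the Top Committers table.
commits = {"tommylees112": 134, "Gabriel Tseng": 113, "Julia Wagemann": 1}

total = sum(commits.values())            # total commits
average = total / len(commits)           # average commits per committer
# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

print(total, round(average, 3), round(dds, 2))  # 248 82.667 0.46
```

A score near 0 would mean that a single person wrote almost every commit; 0.46 reflects the work being split fairly evenly between the two main authors.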

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 41
  • Total pull requests: 131
  • Average time to close issues: 2 months
  • Average time to close pull requests: 17 days
  • Total issue authors: 8
  • Total pull request authors: 2
  • Average comments per issue: 1.1
  • Average comments per pull request: 0.79
  • Merged pull requests: 108
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • tommylees112 (25)
  • cvitolo (5)
  • jwagemann (4)
  • gabrieltseng (2)
  • shaunharrigan (2)
  • rpitonak (1)
  • v2thegreat (1)
  • AlineBornschein (1)
Pull Request Authors
  • gabrieltseng (74)
  • tommylees112 (57)
Top Labels
Issue Labels
review (11) export (2) model validation (2) pipeline entrypoint (1) analysis (1)
Pull Request Labels
modelling (33) wip (16) preprocess (14) feature engineering (13) export (12) model validation (12) analysis (7) documentation (4) pipeline entrypoint (1)

Dependencies

Dockerfile docker
  • continuumio/miniconda3 latest build
environment.yml pypi