westus_peff
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (8.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: mdfahimhasan
- Language: Jupyter Notebook
- Default Branch: master
- Size: 97.9 MB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Physics-constrained effective precipitation model for the Western United States
Abstract
Effective precipitation, defined as the portion of evapotranspiration (ET) derived from precipitation, is an important part of the agricultural water balance and affects the amount of water required for irrigation. Due to hydrologic complexity, effective precipitation is challenging to quantify and validate using existing empirical and process-based methods. Moreover, there is no readily available high-resolution effective precipitation dataset for the United States (US), despite its importance for determining requirements and consumptive use of irrigation water. In this study, we developed a framework that incorporates multiple hydrologic states and fluxes within a two-step machine learning approach that accurately predicts effective precipitation for irrigated croplands of the Western US at ~2 km spatial resolution and monthly time steps from 2000 to 2020. Additionally, we analyzed the factors influencing effective precipitation to understand its dynamics in irrigated landscapes. To further assess effective precipitation estimates, we estimated groundwater pumping for irrigation in seven basins of the Western US with a water balance model that incorporates model-generated effective precipitation. A comparison of our estimated pumping volumes with in-situ records indicates good skill, with R2 of 0.78 and PBIAS of –15%. Though challenges remain in predicting and assessing effective precipitation, the satisfactory performance of our model and good skill in estimated pumping illustrate the application and potential of integrating satellite data and machine learning with a physically-based water balance to estimate key water fluxes. The effective precipitation dataset developed in this study has the potential to be used with satellite-based actual ET data for estimating consumptive use of irrigation water at large temporal and spatial scales and enable best available science-informed water management decisions.
Keywords: Effective precipitation; Groundwater; Irrigation; Water use; Remote sensing; Machine learning.
Effective precipitation map
Figure: Machine learning model generated monthly effective precipitation estimates from 2010 to 2014 at 2 km spatial resolution.
Citations
- Hasan, M.F., Smith, R.G., Majumdar, S., Huntington, J.L., & Neto, A.A.M. (2025). Satellite Data and Physics-Informed Machine Learning for Estimating Effective Precipitation in the Western United States and Application for Monitoring Groundwater Irrigation. In prep. for AGU Water Resources Research.
- Majumdar, S., Smith, R.G., Hasan, M.F., Wogenstahl, C., & Conway, B.D. (2025). A long-term database of groundwater pumping, consumptive use, effective precipitation, and irrigation efficiencies in Arizona derived from remote sensing and machine learning. In prep. for Nature Scientific Data.
Organizations

Funding

Running the repository
Repository structure
The repository has five main modules, described below:
1. utils - consists of scripts that help with basic raster, vector, and statistical operations. It also holds the ml_ops scripts, which contain the machine learning functions.
2. datadownloadpreprocess - consists of scripts with functions to download datasets from GEE, including OpenET, and to further pre-process them. run_download_preprocess.py is the main driver script and has to be used to download and pre-process all required datasets.
3. effective_precip - consists of functions that are required specifically for the effective precipitation model. Effective precipitation is estimated by a 3-step model. First, the m01_peff_model_monthly.py script estimates effective precipitation at monthly scale. In some regions, the monthly estimates violate the water balance, which requires water year precipitation > water year effective precipitation. So, in the second step, the m02_peff_frac_model_water_yr.py script simulates a water year-scale effective precipitation fraction model. In the third step, the m03_peff_adjust.py script uses this water year-scale model to impose the water balance on the monthly estimates. These three files have to be run in sequence to generate the monthly effective precipitation estimates.
4. sw_irrig - consists of functions that are required for distributing USGS HUC12-level surface water irrigation data to the 2 km pixel scale. SW_Irr.py is the main driver file.
5. netGW - consists of the netGW_Irr.py script, which has the functions to estimate consumptive groundwater use for irrigation at 2 km resolution using a water balance approach.
The utils module does not need to be executed directly. The other modules have to be executed using their respective driver files to unlock the full functionality of the model. The repository also has auxiliary folders with scripts used for data processing, result analysis, and plotting.
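To make the purpose of the adjustment step in effective_precip more concrete, the following is a minimal sketch of one way a water year-scale fraction could be used to constrain monthly estimates. It is not taken from m03_peff_adjust.py: the proportional rescaling, the function name `adjust_monthly_peff`, and the interpretation of the fraction as water year effective precipitation divided by water year precipitation are all assumptions made for illustration.

```python
import numpy as np


def adjust_monthly_peff(monthly_peff, water_year_precip, peff_fraction):
    """Rescale monthly effective precipitation for one pixel and one water year.

    Hypothetical helper (not part of the repository). The monthly values are
    scaled by a single factor so that their sum equals
    peff_fraction * water_year_precip.
    """
    monthly_peff = np.asarray(monthly_peff, dtype=float)
    target_total = peff_fraction * water_year_precip
    current_total = monthly_peff.sum()
    if current_total == 0:
        # Nothing to rescale; return the values unchanged.
        return monthly_peff.copy()
    return monthly_peff * (target_total / current_total)


# Made-up numbers: 12 monthly estimates of 50 mm sum to 600 mm, which
# exceeds the 500 mm of water year precipitation.
example_monthly = np.full(12, 50.0)
adjusted = adjust_monthly_peff(example_monthly, water_year_precip=500.0, peff_fraction=0.8)
print(adjusted.sum())
```

With a fraction below 1, the rescaled water year total stays below water year precipitation, which is the water balance condition described for the second and third steps. The seasonal pattern of the monthly model is preserved because every month is scaled by the same factor.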
Dependencies
operating system: Most scripts are fully functional in both Windows and Linux environments, with some exceptions. In a Linux environment, gdal needs to be installed separately and the appropriate 'gdalpath' needs to be set in the necessary scripts. For some functions, e.g. `shapefiletoraster()` in `utils > rasterops.py` and associated scripts (`resultsanalysis > netGWpumpingcompile.py`), the gdal system call has to be enabled/installed specifically to run them in a Linux environment. Note that all scripts, except those in the `resultsanalysis` module, have been implemented/checked in both Windows and Linux environments (using a conda environment). In addition, the ALE plot generation in the `m01_peff_model_monthly.py` and `m02_peff_frac_model_water_yr.py` scripts does not respond (keeps running indefinitely) in a Linux environment (probably due to a scikit-explain versioning issue); therefore, set `skipplotale = True` when running the monthly and water year models in a Linux environment.
The authors recommend exercising discretion when setting up the environment and running the scripts.
conda environment: A conda environment, set up using Anaconda with Python 3.9, has been used to implement this repository. The libraries required to run this repository are: dask, dask-geopandas, earthengine-api, fastparquet, rasterio, gdal, shapely, geopandas, numpy, pandas, scikit-learn, lightgbm, scikit-explain, matplotlib, seaborn.
Note that running the .ipynb scripts requires installing JupyterLab within the conda environment.
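Before running the driver scripts, it can help to confirm that the listed libraries are importable in the active conda environment. The snippet below is a small sketch for that purpose and is not part of the repository. The helper name `find_missing` is hypothetical, and the import names in the mapping are assumptions, since a package's import name can differ from its install name (for example, earthengine-api is imported as `ee`).

```python
from importlib.util import find_spec

# Package name (as listed above) -> assumed import name.
REQUIRED_MODULES = {
    "dask": "dask",
    "dask-geopandas": "dask_geopandas",
    "earthengine-api": "ee",
    "fastparquet": "fastparquet",
    "rasterio": "rasterio",
    "gdal": "osgeo",
    "shapely": "shapely",
    "geopandas": "geopandas",
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "lightgbm": "lightgbm",
    "scikit-explain": "skexplain",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import name cannot be located."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = find_missing()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only locates each module without importing it, so it does not detect version conflicts such as the scikit-explain issue noted for Linux.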
Google Earth Engine authentication
This project relies on the Google Earth Engine (GEE) Python API for downloading (and reducing) some of the predictor datasets from the GEE
data repository. After completing step 3, run earthengine authenticate. The installation and authentication guide
for the earth-engine Python API is available here. The Google Cloud CLI tools
may be required for this GEE authentication step. Refer to the installation docs here. You also have to create a gcloud project to use the GEE API.
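As an alternative to the command-line step, authentication can also be triggered from Python. The sketch below uses the `ee.Initialize` and `ee.Authenticate` calls from the earthengine-api package. The function name `initialize_gee`, the project ID shown in the usage comment, and the optional `ee_module` argument (included only so the logic can be exercised without real credentials) are illustrative assumptions and do not come from this repository.

```python
def initialize_gee(project_id, ee_module=None):
    """Initialize Earth Engine, authenticating first if initialization fails.

    project_id is the ID of your own Google Cloud project.
    ee_module is a hypothetical hook for substituting a stand-in object;
    leave it as None to use the real earthengine-api package.
    """
    if ee_module is None:
        import ee as ee_module  # requires earthengine-api to be installed
    try:
        ee_module.Initialize(project=project_id)
    except Exception:
        # No usable credentials yet: run the interactive authentication
        # flow, then try to initialize again.
        ee_module.Authenticate()
        ee_module.Initialize(project=project_id)


# Usage (replace the placeholder with your own project ID):
# initialize_gee("my-gcloud-project-id")
```

Catching a broad exception keeps the sketch short; a real script may prefer to handle credential errors separately from other initialization failures.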
Data availability
The monthly effective precipitation estimates for all months from 2000 to 2020 (up to September 2020) are available to download through this GEE script. Non-GEE users can access the dataset from this HydroShare repo.
Owner
- Name: Md Fahim Hasan
- Login: mdfahimhasan
- Kind: user
- Location: Fort Collins
- Company: Colorado State University
- Website: https://www.linkedin.com/in/md-fahim-hasan/
- Twitter: hasanfahim004
- Repositories: 1
- Profile: https://github.com/mdfahimhasan
Citation (CITATION.cff)
cff-version: 1.1.0
message: If you use this software, please cite it as below.
title: mdfahimhasan/WestUS_Peff: Effective precipitation model-Western United States
version: Peff_v1
date-released: 2025-04-04
doi: 10.5281/zenodo.15151229
authors:
- family-names: Hasan
given-names: Md Fahim
- family-names: Smith
given-names: Ryan
- family-names: Majumdar
given-names: Sayantan
- family-names: Huntington
given-names: Justin
- family-names: Neto
given-names: Antonio Alves Meira
GitHub Events
Total
- Release event: 2
- Push event: 22
- Create event: 1
Last Year
- Release event: 2
- Push event: 22
- Create event: 1