Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.8%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: Jonnathan-Landi
  • Language: R
  • Default Branch: main
  • Size: 1.47 MB
Statistics
  • Stars: 3
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog

README.md

RFplus: A Novel Machine Learning Approach for Merging Multi-Satellite Precipitation Products and Ground Observations

R-CMD-check CRAN status CRAN RStudio mirror downloads Codecov test coverage

Overview

The RFplus package implements a novel spatial extrapolation and bias correction framework, integrating Random Forest (RF) and Quantile Mapping (QM) in a multi-stage process to improve the accuracy of satellite precipitation estimates. The methodology consists of three key stages:

  1. Spatial Extrapolation of Precipitation: The first stage employs a Random Forest model to extrapolate the spatial distribution of precipitation. The model is trained using in-situ measurements as the response variable and a diverse set of satellite precipitation products and environmental covariates as predictors. This approach enables the generation of an initial precipitation field that extends observed precipitation patterns across unmonitored regions with high spatial flexibility, allowing applications at different temporal scales (e.g., daily, monthly, or annual).

  2. Residual Correction through a Secondary RF Model: To enhance predictive accuracy, a second Random Forest model is trained to estimate residual errors from the initial predictions. The residuals are defined as the difference between observed and modeled precipitation values at station locations. By modeling these residuals as a function of the same covariates used in the first stage, systematic biases are identified and corrected iteratively. The corrected precipitation estimate is obtained by summing the residual predictions to the initial RF-based precipitation estimates, leading to a refined precipitation product with reduced bias and improved spatial coherence.

  3. Bias Adjustment via Non-Parametric Quantile Mapping (QM): In the third stage, a nonparametric quantile mapping (QM) is applied to adapt the distribution of each time series to the in situ observations of the nearest station. The QM correction will be applied to those pixels that meet the proximity criterion, which states that only pixels within a predefined radius of influence (e.g., ≤15 km) are QM corrected.

The RFplus package is designed to be highly adaptable and can be utilized across various satellite precipitation products and geographic regions. Although initially developed for precipitation bias correction, its methodology is applicable to other environmental variables such as temperature, wind speed, and soil moisture. This versatility makes RFplus a powerful tool for enhancing the accuracy of remote sensing-based estimations across diverse environmental conditions.

Key Requirements for RFplus

To ensure the optimal performance of RFplus, the following requirements must be met:

  1. Data Quality: In situ station data must undergo rigorous quality control and homogenization before use.
  2. Stations should have less than 10% missing data to avoid skewing the quantile correction.
  3. Input Data Consistency:
    • Covariates must share the same spatial extent and coordinate reference system (CRS).
-   All covariates should have the same number of layers.
  1. Mandatory Use of DEM:
    • A DEM is required as an input parameter.
-   The DEM should have a single layer, which will be replicated to match the number of layers in other covariates.
  1. Covariate Structure:
    • Covariates should be provided as a list of raster layers.
-   The classes of all covariates must be consistent.

Installation

RFplus is available on CRAN, so, to get the current version you can use:

r install.packages("RFplus")

If you want to test the development version, you can do it from GitHub:

r if (!require(devtools)) install.packages("devtools") library(devtools) install_github("Jonnathan-Landi/RFplus")

Example using RFplus

Example of a RFplus run using the example data.

``` r library(RFplus) library(terra) library(data.table)

Load the data

data("BDInsitu", package = "RFplus") data("CordsInsitu", package = "RFplus")

Load the covariates

Covariates <- list( MSWEP = terra::rast(system.file("extdata/MSWEP.nc", package = "RFplus")), CHIRPS = terra::rast(system.file("extdata/CHIRPS.nc", package = "RFplus")), DEM = terra::rast(system.file("extdata/DEM.nc", package = "RFplus")) )

Apply the RFplus bias correction model

model = RFplus(BDInsitu, CordsInsitu, Covariates, nround = 1, wet.day = 0.1, ntree = 2000, seed = 123, training = 0.8, Rainthreshold = 0.1, method = "RQUANT", ratio = 5, savemodel = FALSE, namesave = NULL )

Visualize the results

Precipitation results within the study area

modelo_rainfall = model$Ensamble

Validation statistic results

metrics = model$Validation

Note: In the above example we used 80% of the data for training and 20% for # model validation.

```

Limitations and Future Research

Currently, the RFplus algorithm does not support parallel mode execution, which means that, when working with large volumes of data, the processing time can be long. Research is currently underway on how to integrate parallel execution capability to improve efficiency and reduce computation time in future uses of the algorithm.

Owner

  • Name: Jonnathan
  • Login: Jonnathan-Landi
  • Kind: user
  • Location: Cuenca, Ecuador
  • Company: EnviroLand-Tech

I am an environmental engineer, with a background in data science, hydroinformatics, hydrology, etc.

GitHub Events

Total
  • Release event: 2
  • Watch event: 3
  • Public event: 1
  • Push event: 39
  • Create event: 2
Last Year
  • Release event: 2
  • Watch event: 3
  • Public event: 1
  • Push event: 39
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads:
    • cran 210 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
cran.r-project.org: RFplus

Machine Learning for Merging Satellite and Ground Precipitation Data

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 210 Last month
Rankings
Dependent packages count: 27.3%
Dependent repos count: 33.6%
Average: 49.2%
Downloads: 86.8%
Maintainers (1)
Last synced: 10 months ago