SDMtune

Performs variable selection and model tuning for Species Distribution Models (SDMs). It also provides several utilities to display results.

https://github.com/consbiol-unibern/sdmtune

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (22.3%) to scientific vocabulary

Keywords

hyperparameter-tuning species-distribution-modelling variable-selection
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 27
  • Watchers: 4
  • Forks: 10
  • Open Issues: 15
  • Releases: 16
Topics
hyperparameter-tuning species-distribution-modelling variable-selection
Created over 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct

README.Rmd

---
output:
  github_document
bibliography: ./vignettes/SDMtune.bib
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "docs/reference/figures/README-"
)
```

# SDMtune 

[![R build status](https://github.com/ConsBiol-unibern/SDMtune/workflows/R-CMD-check/badge.svg)](https://github.com/ConsBiol-unibern/SDMtune/actions)
[![CRAN Status](https://www.r-pkg.org/badges/version-last-release/SDMtune)](https://cran.r-project.org/package=SDMtune)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/grand-total/SDMtune)](https://www.r-pkg.org/pkg/SDMtune)
[![Coverage status](https://codecov.io/gh/ConsBiol-unibern/SDMtune/branch/master/graph/badge.svg)](https://app.codecov.io/github/ConsBiol-unibern/SDMtune?branch=master)


**SDMtune** provides a user-friendly framework for training and evaluating species distribution models (SDMs). The package implements functions for data-driven variable selection and model tuning and includes numerous utilities to display the results. All the functions used to select variables or to tune model hyperparameters display an interactive real-time chart in the RStudio viewer pane during their execution.
Visit the [package website](https://consbiol-unibern.github.io/SDMtune/) and learn how to use **SDMtune** starting from the first article [Prepare data for the analysis](https://consbiol-unibern.github.io/SDMtune/articles/prepare-data.html).

## Installation

You can install the latest release version from CRAN:

```{r cran-installation, eval = FALSE}
install.packages("SDMtune")
```

Or the development version from GitHub:

```{r gh-installation, eval = FALSE}
devtools::install_github("ConsBiol-unibern/SDMtune")
```

## Hyperparameters tuning & real-time charts

**SDMtune** implements three functions for hyperparameter tuning:

* `gridSearch`: runs all the possible combinations of predefined hyperparameter values;
* `randomSearch`: randomly selects a fraction of the possible combinations of predefined hyperparameter values;
* `optimizeModel`: uses a *genetic algorithm* that aims to optimize the given evaluation metric by combining the predefined hyperparameter values.

When the number of hyperparameter combinations is high, training all the defined models can take a very long time. The function `optimizeModel` offers an alternative that reduces computation time by using a *genetic algorithm*. It searches for the best combination of hyperparameters and reaches an optimal or near-optimal solution in less time than `gridSearch`. The following code shows an example using a simulated dataset. First, a model is trained with default hyperparameter values using the **Maxnet** algorithm implemented in the `maxnet` package. Then both `gridSearch` and `optimizeModel` are executed to compare execution time and model performance, evaluated with the AUC metric. If the following code is not clear, please check the articles on the [website](https://consbiol-unibern.github.io/SDMtune/).

```{r example, eval=FALSE}
library(SDMtune)

# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- terra::rast(files)

# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords,
                   env = predictors, categorical = "biome")

# Split presence locations in training (80%) and testing (20%) datasets
datasets <- trainValTest(data, test = 0.2, only_presence = TRUE, seed = 25)
train <- datasets[[1]]
test <- datasets[[2]]

# Train a Maxnet model
model <- train(method = "Maxnet", data = train)

# Define the hyperparameters to test
h <- list(reg = seq(0.1, 3, 0.1), fc = c("lq", "lh", "lqp", "lqph", "lqpht"))

# Test all the possible combinations with gridSearch
gs <- gridSearch(model, hypers = h, metric = "auc", test = test)
head(gs@results[order(-gs@results$test_AUC), ])  # Best combinations

# Use the genetic algorithm instead with optimizeModel
om <- optimizeModel(model, hypers = h, metric = "auc", test = test, seed = 4)
head(om@results)  # Best combinations
```
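
To get a feel for why an exhaustive search becomes expensive, the size of the search space defined by `h` above can be counted with base R alone. The sketch below is illustrative only: it does not call **SDMtune**, and the random subset merely mimics the idea behind `randomSearch`, not its internal implementation. The subset size of 20 and the seed are arbitrary assumptions.

```r
# Same hyperparameter values as in the example above
h <- list(reg = seq(0.1, 3, 0.1), fc = c("lq", "lh", "lqp", "lqph", "lqpht"))

# Every combination that an exhaustive grid search would have to train
grid <- expand.grid(h, stringsAsFactors = FALSE)
n_combinations <- nrow(grid)  # 30 reg values x 5 feature class sets
n_combinations

# Illustration only: draw a random subset of the grid (size is an assumption)
set.seed(25)
subset_idx <- sample(n_combinations, size = 20)
random_subset <- grid[subset_idx, ]
head(random_subset)
```

Here the grid contains 150 combinations, which is still manageable, but each additional hyperparameter multiplies this number. That is the situation in which `randomSearch` or `optimizeModel` become attractive.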

During the execution of the tuning and variable selection functions, real-time charts showing training and validation metrics are displayed in the RStudio viewer pane (below is a screencast of the previously executed `optimizeModel` function).

## Speed test

In the following example we train a **Maxent** model:

```{r train Maxent, eval=FALSE}
# Train a Maxent model
sdmtune_model <- train(method = "Maxent", data = data)
```

We compare the execution time of the `predict` function between **SDMtune**, which uses its own algorithm, and **dismo** [@Hijmans2017], which calls the MaxEnt Java software [@Phillips2006]. We first convert the object `sdmtune_model` into an object that is accepted by **dismo**:

```{r sdmtune2maxent, eval=FALSE}
maxent_model <- SDMmodel2MaxEnt(sdmtune_model)
```

Next is a function used below to test whether the results are equal, with a tolerance of `1e-7`:

```{r check, eval=FALSE}
my_check <- function(values) {
  return(all.equal(values[[1]], values[[2]], tolerance = 1e-7))
}
```

Now we test the execution time using the **microbenchmark** package:

```{r bench, eval=FALSE}
bench <- microbenchmark::microbenchmark(
  SDMtune = predict(sdmtune_model, data = data, type = "cloglog"),
  dismo = predict(maxent_model, data@data),
  check = my_check
)
```

and plot the output:

```{r plot bench, eval=FALSE}
library(ggplot2)
ggplot(bench, aes(x = expr, y = time/1000000, fill = expr)) +
  geom_boxplot() +
  labs(fill = "", x = "Package", y = "time (milliseconds)") +
  theme_minimal()
```

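
The `check` function used in the benchmark above relies on `all.equal`, which returns `TRUE` when two objects agree within the tolerance and a character description of the difference otherwise. The base R sketch below shows that behaviour on made-up numbers (the values are assumptions chosen for illustration, not predictions from the models):

```r
# Same helper as in the benchmark above
my_check <- function(values) {
  return(all.equal(values[[1]], values[[2]], tolerance = 1e-7))
}

# Made-up values: a relative difference of about 1e-9 is within the tolerance
within_tol <- my_check(list(c(0.2, 0.5), c(0.2, 0.5) * (1 + 1e-9)))
isTRUE(within_tol)

# A relative difference of about 1e-5 exceeds it, so a message is returned
outside_tol <- my_check(list(c(0.2, 0.5), c(0.2, 0.5) * (1 + 1e-5)))
is.character(outside_tol)
```

The tolerance of `1e-7` is slightly looser than the `all.equal` default for numeric values (about `1.5e-8`), which leaves room for small floating point differences between the two implementations.
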
## Set working environment

To train a **Maxent** model using the Java implementation, make sure that:

* the **Java JDK** software is installed
* the package **rJava** is installed

You can check the version of MaxEnt used by `dismo` with the following command:

```{r maxent version, eval=FALSE}
dismo::maxent()
```

The MaxEnt `jar` file used by `dismo` is located in the folder returned by the following command:

```{r dismo folder, eval=FALSE}
system.file(package="dismo")
```

In case you want to upgrade to a newer version of MaxEnt (if available), download the file **maxent.jar** [here](https://biodiversityinformatics.amnh.org/open_source/maxent/) and replace the file already present in the previous folder.

The function `checkMaxentInstallation` checks that Java JDK and rJava are installed, and that the file maxent.jar is in the correct folder.

```{r check Maxent installation, eval=FALSE}
checkMaxentInstallation()
```

If everything is correctly configured for `dismo`, the command `dismo::maxent()` will return the new MaxEnt version.

## Code of conduct

Please note that this project follows a [Contributor Code of Conduct](https://consbiol-unibern.github.io/SDMtune/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.

### References

GitHub Events

Total
  • Issues event: 2
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 16
  • Pull request event: 3
Last Year
  • Issues event: 2
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 16
  • Pull request event: 3

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 1,405
  • Total Committers: 3
  • Avg Commits per committer: 468.333
  • Development Distribution Score (DDS): 0.349
Past Year
  • Commits: 161
  • Committers: 1
  • Avg Commits per committer: 161.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
sgvignali v****0@g****m 915
sgvignali 2****i 480
Vignali v****i@c****h 10
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 41
  • Total pull requests: 5
  • Average time to close issues: 2 months
  • Average time to close pull requests: 10 days
  • Total issue authors: 29
  • Total pull request authors: 4
  • Average comments per issue: 3.05
  • Average comments per pull request: 0.4
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 20 days
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.67
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • veroandreo (5)
  • DevinALyons (3)
  • nhill917 (2)
  • ptitle (2)
  • ManuelSpinola (2)
  • PetiteTong (2)
  • owenssam1 (2)
  • rogerio-bio (2)
  • obrienjm25 (1)
  • Zeroo11 (1)
  • Rogerio-7 (1)
  • ameliabridges (1)
  • anackr (1)
  • GolaDataExplorer (1)
  • sgvignali (1)
Pull Request Authors
  • lidefi87 (2)
  • teunbrand (2)
  • sgvignali (1)
  • SethMusker (1)
Top Labels
Issue Labels
bug (25) new feature (11)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 549 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 19
  • Total maintainers: 1
cran.r-project.org: SDMtune

Species Distribution Model Selection

  • Versions: 19
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 549 Last month
Rankings
Forks count: 10.9%
Stargazers count: 11.9%
Downloads: 17.7%
Average: 18.5%
Dependent repos count: 24.3%
Dependent packages count: 27.9%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.6.0 depends
  • Rcpp >= 1.0.1 imports
  • dismo >= 1.3 imports
  • gbm >= 2.1.5 imports
  • ggplot2 >= 3.3.1 imports
  • jsonlite >= 1.6 imports
  • maxnet >= 0.1.4 imports
  • methods * imports
  • nnet >= 7.3 imports
  • progress >= 1.2.2 imports
  • randomForest >= 4.6 imports
  • raster >= 2.9 imports
  • rlang >= 0.4.5 imports
  • rstudioapi >= 0.10 imports
  • stringr >= 1.4.0 imports
  • whisker >= 0.3 imports
  • cli >= 1.1.0 suggests
  • covr * suggests
  • crayon >= 1.3.4 suggests
  • htmltools >= 0.3.6 suggests
  • kableExtra >= 1.1.0 suggests
  • knitr >= 1.23 suggests
  • maps >= 3.3.0 suggests
  • pkgdown >= 1.5.0 suggests
  • plotROC >= 2.2.1 suggests
  • rJava >= 0.9 suggests
  • rasterVis >= 0.50 suggests
  • reshape2 >= 1.4.3 suggests
  • rgdal >= 1.4 suggests
  • rmarkdown >= 2.7 suggests
  • scales >= 1.0.0 suggests
  • testthat >= 3.0.4 suggests
  • zeallot >= 0.1.0 suggests
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.4.1 composite
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v4 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite