surveyvoi

Survey value of information

https://github.com/prioritizr/surveyvoi

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 2 DOI reference(s) in README
○
Academic publication links
✓
Committers with academic emails
2 of 3 committers (66.7%) from academic institutions
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (13.2%) to scientific vocabulary

Keywords from Contributors

geos

Last synced: 7 months ago · JSON representation

Repository

Survey value of information

Basic Info

Host: GitHub
Owner: prioritizr
Language: R
Default Branch: master
Homepage: https://prioritizr.github.io/surveyvoi
Size: 22.8 MB

Statistics

Stars: 5
Watchers: 1
Forks: 1
Open Issues: 0
Releases: 2

Created about 6 years ago · Last pushed 12 months ago

Metadata Files

Readme Changelog

README.Rmd

---
output:
  rmarkdown::github_document:
    html_preview: no
---



# surveyvoi: Survey Value of Information

[![lifecycle](https://img.shields.io/badge/Lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![R-CMD-check-Ubuntu](https://img.shields.io/github/actions/workflow/status/prioritizr/surveyvoi/R-CMD-check-ubuntu.yaml?branch=master&label=Ubuntu)](https://github.com/prioritizr/surveyvoi/actions)
[![R-CMD-check-Windows](https://img.shields.io/github/actions/workflow/status/prioritizr/surveyvoi/R-CMD-check-windows.yaml?branch=master&label=Windows)](https://github.com/prioritizr/surveyvoi/actions)
[![R-CMD-check-macOS](https://img.shields.io/github/actions/workflow/status/prioritizr/surveyvoi/R-CMD-check-macos.yaml?branch=master&label=macOS)](https://github.com/prioritizr/surveyvoi/actions)
[![R-CMD-check-fedora](https://img.shields.io/github/actions/workflow/status/prioritizr/surveyvoi/R-CMD-check-fedora.yaml?branch=master&label=Fedora)](https://github.com/prioritizr/surveyvoi/actions)
[![Documentation](https://img.shields.io/github/actions/workflow/status/prioritizr/surveyvoi/documentation.yaml?branch=master&label=Documentation)](https://github.com/prioritizr/surveyvoi/actions)
[![Coverage Status](https://img.shields.io/codecov/c/github/prioritizr/surveyvoi?label=Coverage)](https://app.codecov.io/gh/prioritizr/surveyvoi/branch/master)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/surveyvoi)](https://CRAN.R-project.org/package=surveyvoi)

```{r "knitr_config", include = FALSE}
knitr::opts_chunk$set(fig.path = "man/figures/README-", fig.align = "center")
```

```{r "initialization", include = FALSE}
devtools::load_all()
h = 3.5
w = 5.0
ow = "350"
knitr::opts_chunk$set(
  fig.height = h,
  fig.width = w,
  out.width = ow,
  fig.path = "man/figures/README-",
  fig.align = "center"
)
```

The _surveyvoi_ package is a decision support tool for prioritizing sites for ecological surveys based on their potential to improve plans for conserving biodiversity (e.g. plans for establishing protected areas). Given a set of sites that could potentially be acquired for conservation management -- wherein some sites have previously been surveyed and other sites have not -- this package provides functionality to generate and evaluate plans for additional surveys. Specifically, plans for ecological surveys can be generated using various conventional approaches (e.g. maximizing expected species richness, geographic coverage, diversity of sampled environmental conditions) and by maximizing value of information. After generating plans for surveys, they can also be evaluated using value of information analysis. Please note that several functions depend on the Gurobi optimization software (available from https://www.gurobi.com). Additionally, the JAGS software (available from https://mcmc-jags.sourceforge.io/) is required to fit hierarchical generalized linear models.

## Installation

The latest official version can be installed from the [Comprehensive R Archive Network (CRAN)](https://cran.r-project.org/) using the following _R_ code.

```{r, eval = FALSE}
install.packages("surveyvoi", repos = "https://cran.rstudio.com/")
```

Alternatively, the latest development version can be installed from [GitHub](https://github.com/prioritizr/surveyvoi) using the following code. Please note that while developmental versions may contain additional features not present in the official version, they may also contain coding errors.

```{r, eval = FALSE}
if (!require(remotes)) install.packages("remotes")
remotes::install_github("prioritizr/surveyvoi")
```

#### Windows

The [Rtools](https://cran.r-project.org/bin/windows/Rtools/) software needs to be installed to install the _surveyvoi R_ package from source. This software provides system requirements from [rwinlib](https://github.com/rwinlib/).

#### Ubuntu

The `gmp`, `fftw3`, `mpfr`, and `symphony` libraries need to be installed to install the _surveyvoi R_ package. Although the `fftw3` and `symphony` libraries are not used directly, they are needed to successfully install dependencies. For recent versions of Ubuntu (18.04 and later), these libraries are available through official repositories. They can be installed using the following system commands:

```
apt-get -y update
apt-get install -y libgmp3-dev libfftw3-dev libmpfr-dev coinor-libsymphony-dev
```

#### Linux

For Unix-alikes, `gmp` (>= 4.2.3), `mpfr` (>= 3.0.0), `fftw3` (>= 3.3), and `symphony` (>= 5.6.16) are required.

#### macOS

The `gmp`, `fftw`, `mpfr`, and `symphony` libraries are required. Although the `fftw3` and `symphony` libraries are not used directly, they are needed to successfully install dependencies. The easiest way to install these libraries is using [HomeBrew](https://brew.sh/). After installing HomeBrew, these libraries can be installed using the following commands in the system terminal:

```
brew tap coin-or-tools/coinor
brew install symphony
brew install pkg-config
brew install gmp
brew install fftw
brew install mpfr
```

## Citation

Please cite the _surveyvoi R_ package when using it in publications. To cite the package, please use:

> Hanson, JO, McCune JL, Chadès I, Proctor CA, Hudgins EJ, & Bennett JR (2023) Optimizing ecological surveys for conservation. Journal of Applied Ecology, 60: 41--51. Available at .

## Usage

Here we provide a short example showing how to use the _surveyvoi R_ package to prioritize funds for ecological surveys. In this example, we will generate plans for conducting ecological surveys (termed "survey schemes") using simulated data for six sites and three conservation features (e.g. bird species). To start off, we will set the seed for the random number generator for reproducibility and load some R packages.

```{r, message = FALSE}
set.seed(502)      # set RNG for reproducibility
library(surveyvoi) # package for value of information analysis
library(dplyr)     # package for preparing data
library(tidyr)     # package for preparing data
library(ggplot2)   # package for plotting data
```

Now we will load some datasets that are distributed with the package. First, we will load the `sim_sites` object. This spatially explicit dataset (i.e. `sf` object) contains information on the sites within our study area. Critically, it contains (i) sites that have already been surveyed, (ii) candidate sites for additional surveys, (iii) sites that have already been protected, and (iv) candidate sites that could be protected in the future. Each row corresponds to a different site, and each column describes different properties associated with each site. In this table, the `"management_cost"` column indicates the cost of protecting each site; `"survey_cost"` column indicates the cost of conducting an ecological survey within each site; and `"e1"` and `"e2"` columns contain environmental data for each site (not used in this example). The remaining columns describe the existing survey data and the spatial distribution of the features across the sites. The `"n1"`, `"n2"`, and `"n3"` columns indicate the number of surveys conducted within each site that looked for each of the three features (respectively); and `"f1"`, `"f2"`, and `"f3"` columns describe the proportion of surveys within each site that looked for each feature where the feature was detected (respectively). For example, if `"n1"` has a value of 2 and `"f1"` has a value of 0.5 for a given site, then the feature `"f1"` was detected in only one of the two surveys conducted in this site that looked for the feature. Finally, the `"p1"`, `"p2"`, and `"p3"` columns contain modeled probability estimates of each species being present in each site (see `fit_hglm_occupancy_models()` and `fit_xgb_occupancy_models()` to generate such estimates for your own data).


```{r "load_site_data"}
# load data
data(sim_sites)

# print table
print(sim_sites, width = Inf)

```{r "management_cost_plot"}
# plot cost of protecting each site
ggplot(sim_sites) +
geom_sf(aes(color = management_cost), size = 4) +
ggtitle("management_cost") +
theme(legend.title = element_blank(), text = element_text(size = 16))
```

```{r "survey_cost_plot"}
# plot cost of conducting an additional survey in each site
# note that these costs are much lower than the protection costs
ggplot(sim_sites) +
geom_sf(aes(color = survey_cost), size = 4) +
ggtitle("survey_cost") +
theme(legend.title = element_blank(), text = element_text(size = 16))
```

```{r "n_plot", fig.height = h, fig.width = w * 2.4, out.width = "800"}
# plot survey data
## n1, n2, n3: number of surveys in each site that looked for each feature
sim_sites %>%
select(n1, n2, n3) %>%
gather(name, value, -geometry) %>%
ggplot() +
geom_sf(aes(color = value), size = 4) +
facet_wrap(~name, nrow = 1) +
theme(text = element_text(size = 16))
```

```{r "f_plot", fig.height = h, fig.width = w * 2.4, out.width = "800"}
# plot survey results
## f1, f2, f3: proportion of surveys in each site that looked for each feature
##             that detected the feature
sim_sites %>%
select(f1, f2, f3) %>%
gather(name, value, -geometry) %>%
ggplot() +
geom_sf(aes(color = value), size = 4) +
facet_wrap(~name, nrow = 1) +
scale_color_continuous(limits = c(0, 1)) +
theme(text = element_text(size = 16))
```

```{r "p_plot", fig.height = h, fig.width = w * 2.4, out.width = "800"}
# plot modeled probability of occupancy data
sim_sites %>%
select(p1, p2, p3) %>%
gather(name, value, -geometry) %>%
ggplot() +
geom_sf(aes(color = value), size = 4) +
facet_wrap(~name, nrow = 1) +
scale_color_continuous(limits = c(0, 1)) +
theme(text = element_text(size = 16))
```

Next, we will load the `sim_features` object. This table contains information on the conservation features (e.g. species). Specifically, each row corresponds to a different feature, and each column contains information associated with the features. In this table, the `"name"` column contains the name of each feature; `"survey"` column indicates whether future surveys would look for this species; `"survey_sensitivity"` and `"survey_specificity"` columns denote the sensitivity (true positive rate) and specificity (true negative rate) for the survey methodology for correctly detecting the feature;  `"model_sensitivity"` and `"model_specificity"` columns denote the sensitivity (true positive rate) and specificity (true negative rate) for the species distribution models fitted for each feature; and `"target"` column denotes the required number of protected sites for each feature (termed "representation target", each feature has a target of `r sim_features$target[1]` site).

```{r "load_feature_data"}
# load data
data(sim_features)

# print table
print(sim_features, width = Inf)
```

After loading the data, we will now generate an optimized ecological survey scheme. To achieve this, we will use `approx_optimal_survey_scheme()` function. This function uses a greedy heuristic algorithm to maximize value of information. Although other functions can return solutions that are guaranteed to be optimal (i.e. `optimal_survey_scheme()`), they can take a very long time to complete because they use a brute-force approach. This function also uses an approximation routine to reduce computational burden.

To perform the optimization, we will set a total budget for (i) protecting sites and (ii) surveying sites. Although you might be hesitant to specify a budget, please recall that you would make very different plans depending on available funds. For instance, if you have near infinite funds then you wouldn't bother conducting any surveys and simply protect everything. Similarly, if you had very limited funds, then you wouldn't survey any sites to ensure that at least one site could be protected. Generally, conservation planning problems occur somewhere between these two extremes---but the optimization process can't take that into account if you don't specify a budget. For brevity, here we will set the total budget as 90% of the total costs for protecting sites.

```{r "generate_optimal_survey_scheme", message = FALSE, results = FALSE}
# calculate budget
budget <- sum(0.4 * sim_sites$management_cost)

# generate optimized survey scheme
opt_scheme <-
  approx_optimal_survey_scheme(
    site_data = sim_sites,
    feature_data = sim_features,
    site_detection_columns = c("f1", "f2", "f3"),
    site_n_surveys_columns = c("n1", "n2", "n3"),
    site_probability_columns = c("p1", "p2", "p3"),
    site_management_cost_column = "management_cost",
    site_survey_cost_column = "survey_cost",
    feature_survey_column = "survey",
    feature_survey_sensitivity_column = "survey_sensitivity",
    feature_survey_specificity_column = "survey_specificity",
    feature_model_sensitivity_column = "model_sensitivity",
    feature_model_specificity_column = "model_specificity",
    feature_target_column = "target",
    total_budget = budget,
    survey_budget = budget,
    verbose = TRUE
  )
```

```{r "validation", include = FALSE}
# ensure that at least one site is selected
assertthat::assert_that(
  sum(c(opt_scheme[1, ])) >= 1,
  msg = "no sites selected, not a compelling example"
)
```

```{r "survey_scheme_plot"}
# the opt_scheme object is a matrix that contains the survey schemes
# each column corresponds to a different site,
# and each row corresponds to a different solution
# in the event that there are multiple near-optimal survey schemes, then this
# matrix will have multiple rows
print(str(opt_scheme))

# let's add the first solution (row) in opt_scheme to the site data to plot it
sim_sites$scheme <- c(opt_scheme[1, ])

# plot scheme
# TRUE = selected for an additional ecological survey
# FALSE = not selected
ggplot(sim_sites) +
geom_sf(aes(color = scheme), size = 4) +
ggtitle("scheme") +
theme(text = element_text(size = 16))
```

This has just been a taster of the _surveyvoi R_ package. In addition to this functionality, it can be used to evaluate survey schemes using value of information analysis. Furthermore, it can be used to generate survey schemes using conventional approaches (e.g. sampling environmental gradients, and selecting places with highly uncertain information). For more information, see the [package vignette](https://prioritizr.github.io/surveyvoi/articles/surveyvoi.html).

## Getting help

If you have any questions about using the _surveyvoi R_ package or suggestions for improving it, please [file an issue at the package's online code repository](https://github.com/prioritizr/surveyvoi/issues/new).

Owner

Name: prioritizr
Login: prioritizr
Kind: organization

Repositories: 23
Profile: https://github.com/prioritizr

GitHub Events

Total

Watch event: 1
Delete event: 1
Issue comment event: 1
Push event: 5
Pull request event: 2
Create event: 1

Last Year

Watch event: 1
Delete event: 1
Issue comment event: 1
Push event: 5
Pull request event: 2
Create event: 1

Committers

Last synced: 12 months ago

All Time

Total Commits: 295
Total Committers: 3
Avg Commits per committer: 98.333
Development Distribution Score (DDS): 0.01

Past Year

Commits: 8
Committers: 1
Avg Commits per committer: 8.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
jeffreyhanson	j**n@u**u	292
Emma Hudgins	e**s@m**a	2
Jeroen Ooms	j**s@g**m	1

Committer Domains (Top 20 + Academic)

mail.mcgill.ca: 1 uqconnect.edu.au: 1

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 33
Total pull requests: 22
Average time to close issues: about 2 months
Average time to close pull requests: 16 days
Total issue authors: 1
Total pull request authors: 2
Average comments per issue: 0.27
Average comments per pull request: 0.64
Merged pull requests: 20
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 4
Average time to close issues: N/A
Average time to close pull requests: 1 day
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.75
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

jeffreyhanson (33)

Pull Request Authors

jeffreyhanson (26)
jeroen (2)

Top Labels

Issue Labels

enhancement (5) wontfix (2) documentation (1)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 654 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 6
Total maintainers: 1

cran.r-project.org: surveyvoi

Survey Value of Information

Homepage: https://prioritizr.github.io/surveyvoi/
Documentation: http://cran.r-project.org/web/packages/surveyvoi/surveyvoi.pdf
License: GPL-3
Latest release: 1.1.1
published 12 months ago

Versions: 6
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 654 Last month

Rankings

Forks count: 21.9%

Stargazers count: 28.5%

Dependent packages count: 29.8%

Average: 30.0%

Downloads: 34.5%

Dependent repos count: 35.5%

Maintainers (1)

jeffrey.hanson@uqconnect.edu.au

Last synced: 7 months ago

Dependencies

DESCRIPTION cran

Matrix * depends
R >= 4.0.0 depends
nloptr >= 1.2.2.2 depends
sf >= 0.8.0 depends
Rcpp >= 0.12.19 imports
RcppAlgos >= 2.3.6 imports
Rsymphony >= 0.1 imports
assertthat >= 0.2.0 imports
doParallel >= 1.0.15 imports
dplyr >= 0.8.3 imports
groupdata2 >= 1.3.0 imports
methods * imports
parallel * imports
plyr >= 1.8.4 imports
progress >= 1.2.2 imports
scales >= 1.0.0 imports
stats * imports
tibble >= 2.1.3 imports
utils * imports
vegan >= 2.5 imports
withr >= 2.1.2 imports
xgboost >= 1.5.2.1 imports
PoissonBinomial >= 1.1.1 suggests
Rmpfr >= 0.8 suggests
ggplot2 >= 3.2.1 suggests
gridExtra >= 2.3 suggests
gurobi >= 8.1.0 suggests
knitr >= 1.20 suggests
rmarkdown >= 1.10 suggests
roxygen2 >= 6.1.0 suggests
runjags >= 2.0.4.6 suggests
testthat >= 3.1.2 suggests
tidyr >= 1.0.0 suggests
viridis >= 0.5.1 suggests

.github/workflows/R-CMD-check-fedora.yaml actions

actions/checkout v2 composite

.github/workflows/R-CMD-check-macos.yaml actions

actions/cache v2 composite
actions/checkout v2 composite
actions/upload-artifact main composite
r-lib/actions/setup-pandoc v1 composite
r-lib/actions/setup-r v1 composite

.github/workflows/R-CMD-check-ubuntu.yaml actions

actions/cache v2 composite
actions/checkout v2 composite
actions/upload-artifact main composite
r-lib/actions/setup-pandoc v1 composite
r-lib/actions/setup-r v1 composite

.github/workflows/R-CMD-check-windows.yaml actions

actions/cache v2 composite
actions/checkout v2 composite
actions/upload-artifact main composite
r-lib/actions/setup-pandoc v1 composite
r-lib/actions/setup-r v1 composite

.github/workflows/documentation.yaml actions

actions/cache v2 composite
actions/checkout v2 composite
r-lib/actions/setup-pandoc v1 composite
r-lib/actions/setup-r v1 composite

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

surveyvoi

Science Score: 49.0%

Keywords from Contributors

Repository

Basic Info

Statistics

Metadata Files

README.Rmd

Owner

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Packages

cran.r-project.org: surveyvoi

Rankings

Maintainers (1)

Dependencies