forestecology

forestecology R package of Methods and Data for Forest Ecology Model Fitting and Assessment

https://github.com/rudeboybert/forestecology

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 2 DOI reference(s) in README
✓
Academic publication links
Links to: zenodo.org
✓
Committers with academic emails
1 of 3 committers (33.3%) from academic institutions
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (20.1%) to scientific vocabulary

Keywords

forestecology tidyverse

Last synced: 9 months ago · JSON representation ·

Repository

forestecology R package of Methods and Data for Forest Ecology Model Fitting and Assessment

Basic Info

Host: GitHub
Owner: rudeboybert
License: other
Language: TeX
Default Branch: master
Homepage: https://rudeboybert.github.io/forestecology/
Size: 67.6 MB

Statistics

Stars: 12
Watchers: 4
Forks: 2
Open Issues: 4
Releases: 4

Topics

forestecology tidyverse

Created almost 8 years ago · Last pushed 10 months ago

Metadata Files

Readme Changelog License Code of conduct Citation

README.Rmd

---
output: github_document
---



```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  fig.width = 16 / 2,
  fig.height = 9 / 2,
  message = FALSE,
  warning = FALSE,
  cache = FALSE
)
set.seed(76)
```

# forestecology 

[![DOI](https://zenodo.org/badge/144327964.svg)](https://zenodo.org/badge/latestdoi/144327964)
[![Build Status](https://github.com/rudeboybert/forestecology/workflows/R-CMD-check/badge.svg)](https://github.com/rudeboybert/forestecology/actions)
[![Codecov test coverage](https://codecov.io/gh/rudeboybert/forestecology/branch/master/graph/badge.svg)](https://codecov.io/gh/rudeboybert/forestecology?branch=master)
[![CRAN status](https://www.r-pkg.org/badges/version/forestecology)](https://cran.r-project.org/package=forestecology)


## Installation

You can install the released version of forestecology from [CRAN](https://CRAN.R-project.org) with: 

```{r, eval = FALSE}
install.packages("forestecology")
```

And the development version from [GitHub](https://github.com/) with:

```{r, eval = FALSE}
# install.packages("remotes")
remotes::install_github("rudeboybert/forestecology")
```

This package is designed to work for spatially mapped, repeat censused forests plots. The package has commands to fit models of tree growth based on neighborhood competition which can be used to estimate species-specific competition coefficients. The model fits can then be evaluated using a spatial cross-validation scheme to detect possible overfitting. Additionally, these models can test whether the species identity of competitors matters using a permutation test-style shuffling of competitor identity (under the null hypothesis) and subsequently evaluating if model performance changes. 

See Kim et al. (2021) [The `forestecology` R package for fitting and assessing neighborhood models of the effect of interspecific competition on the growth of trees](http://doi.org/10.1002/ece3.8129) for a full description; the source code for this paper can be found in the `paper/` folder. 


## Example analysis

We present an example analysis using toy data pre-loaded into the package where we will:

1. Compute growth of trees based on census data
1. Add spatial information
1. Identify all focal and corresponding competitor trees
1. Fit model and make predictions
1. Run spatial cross-validation

```{r}
library(tidyverse)
library(forestecology)
library(patchwork)
library(blockCV)

# Resolve conflicting functions
filter <- dplyr::filter
```

### Compute growth of trees based on census data

The starting point of our analysis are data from two repeat censuses `census_1_ex` and `census_2_ex`. For example, consider the forest census data in `census_1_ex`.

```{r}
census_1_ex
```

We convert the `census_1_ex` data frame to an object of type `sf` and then plot using `geom_sf()`. 

```{r}
ggplot() +
  geom_sf(
    data = census_1_ex %>% sf::st_as_sf(coords = c("gx", "gy")),
    aes(col = sp, size = dbh)
  )
```

We first combine data from two repeat censuses into a single `growth` data frame that has the average annual growth of all trees alive at both censuses that aren't resprouts at the second census per Allen and Kim (2020).

```{r}
growth_ex <-
  compute_growth(
    census_1 = census_1_ex %>% 
      mutate(sp = to_any_case(sp)),
    census_2 = census_2_ex %>% 
      filter(!str_detect(codes, "R")) %>% 
      mutate(sp = to_any_case(sp)),
    id = "ID"
  ) %>% 
  # Compute basal area:
  mutate(basal_area = 0.0001 * pi * (dbh1 / 2)^2)
```

### Add spatial information

Our growth model assumes that two individual trees compete if they are less than a pre-specified distance `comp_dist` apart. Furthermore, we define a buffer region of size `comp_dist` from the boundary of the study region.

```{r}
# Set competitor distance
comp_dist <- 1

# Add buffer variable to growth data frame
growth_ex <- growth_ex %>%
  add_buffer_variable(size = comp_dist, region = study_region_ex)

# Optional: Create sf representation of buffer region
buffer_region <- study_region_ex %>% 
  compute_buffer_region(size = comp_dist)
```

In the visualization below, the solid line represents the boundary of the study region while the dashed line delimits the buffer region within. All trees outside this buffer region (in red) will be our "focal" trees of interest in our model since we have complete competitor information on all of them. All trees inside this buffer region (in blue) will only be considered as "competitor" trees to "focal" trees.

```{r}
base_plot <- ggplot() +
  geom_sf(data = study_region_ex, fill = "transparent") +
  geom_sf(data = buffer_region, fill = "transparent", linetype = "dashed")

base_plot + 
  geom_sf(data = growth_ex, aes(col = buffer), size = 2)
```

Next we add information pertaining to our spatial cross-validation scheme. We first manually define the spatial blocks that will act as our cross-validation folds and convert them to an `sf` object using the `sf_polygon()` function from the `sfheaders` package.

```{r}
fold1 <- rbind(c(0, 0), c(5, 0), c(5, 5), c(0, 5), c(0, 0))
fold2 <- rbind(c(5, 0), c(10, 0), c(10, 5), c(5, 5), c(5, 0))

blocks_ex <- bind_rows(
  sf_polygon(fold1),
  sf_polygon(fold2)
) %>%
  mutate(folds = c(1, 2) %>% factor())
```

Next we assign each tree to the correct folds using the `foldID` variable of the output returned by the `spatialBlock()` function from the [`blockCV`](https://github.com/rvalavi/blockCV) package.

```{r}
SpatialBlock_ex <- blockCV::spatialBlock(
  speciesData = growth_ex, k = 2, selection = "systematic", blocks = blocks_ex,
  showBlocks = FALSE, verbose = FALSE
)

growth_ex <- growth_ex %>%
  mutate(foldID = SpatialBlock_ex$foldID %>% factor())
```

In the visualization below, the spatial blocks that act as our cross-validation folds are delineated in orange. The shape of each point indicates which fold each tree has been assigned to. 

```{r}
base_plot + 
  geom_sf(data = growth_ex, aes(col = buffer, shape = foldID), size = 2) +
  geom_sf(data = blocks_ex, fill = "transparent", col = "orange")
```

### Compute focal versus competitor tree information

Based on our `growth` data frame, we now explicitly define all "focal" trees and their respective "competitor" trees in a `focal_vs_comp` data frame. This data frame has rows corresponding to each focal tree, and all information about its competitors are saved in the list-column variable `comp`. We implemented this nested format using `nest()` in order to minimize redundancy, given that the same tree can act as a competitor multiple times. 

```{r}
focal_vs_comp_ex <- growth_ex %>%
  create_focal_vs_comp(comp_dist, blocks = blocks_ex, id = "ID", comp_x_var = "basal_area")
focal_vs_comp_ex
```

Using `unnest()` we can fully expand the competitor information saved in the `focal_vs_comp` data frame. For example, the tree with `focal_ID` equal to 2 located at (1.5, 2.5) has two competitors within `comp_dist` distance from it.  

```{r}
focal_vs_comp_ex %>% 
  unnest(cols = "comp")
```

### Fit model and make predictions

We then fit our competitor growth model as specified in Allen and Kim (2020).

```{r}
comp_bayes_lm_ex <- focal_vs_comp_ex %>%
  comp_bayes_lm(prior_param = NULL)
```

The resulting output is an `comp_bayes_lm` object containing the posterior distribution of all linear regression parameters, the intercept, the slope for dbh for each species, and a matrix of all species pairs competitive effects on growth. The S3 object class is associated with several methods.

```{r}
# Print
comp_bayes_lm_ex

# Posterior distributions (plots combined with patchwork pkg)
p1 <- autoplot(comp_bayes_lm_ex, type = "intercepts")
p2 <- autoplot(comp_bayes_lm_ex, type = "dbh_slopes")
p3 <- autoplot(comp_bayes_lm_ex, type = "competition")
(p1 | p2) / p3
```

Furthermore, we can apply a `predict()` method to the resulting `comp_bayes_lm` object to obtain fitted/predicted values of this model. We append these `growth_hat` values to our `focal_vs_comp` data frame.

```{r}
focal_vs_comp_ex <- focal_vs_comp_ex %>%
  mutate(growth_hat = predict(comp_bayes_lm_ex, newdata = focal_vs_comp_ex))
focal_vs_comp_ex
```

We then compute the root mean squared error (RMSE) of the observed versus fitted growths as a measure of our model's fit.

```{r}
focal_vs_comp_ex %>%
  rmse(truth = growth, estimate = growth_hat) %>%
  pull(.estimate)
```

### Run spatial cross-validation

Whereas in our example above we fit our model to the entirety of the data and then generate fitted/predicted growths on this same data, we now apply the same model with spatial cross-validation. All the trees in a given fold will be given a turn as the "test" data while the trees in all remaining folds will be the "training" data. We then fit the model to the training data, but compute fitted/predicted growths for the separate and independent data. 

```{r}
focal_vs_comp_ex <- focal_vs_comp_ex %>%
  run_cv(comp_dist = comp_dist, blocks = blocks_ex)
```

Note the increase in RMSE, reflecting the fact that our original estimate of model error was overly optimistic as it did not account for spatial autocorrelation.

```{r}
focal_vs_comp_ex %>%
  rmse(truth = growth, estimate = growth_hat) %>%
  pull(.estimate)
```

Owner

Name: Albert Y. Kim
Login: rudeboybert
Kind: user
Location: Northampton MA, USA
Company: @SmithCollege-SDS

Website: http://rudeboybert.rbind.io/
Twitter: rudeboybert
Repositories: 39
Profile: https://github.com/rudeboybert

Assistant Professor of Statistical & Data Sciences, Smith College

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Kim
    given-names: Albert Y.
    orcid: https://orcid.org/0000-0001-7824-306X
  - family-names: Allen
    given-names: David N.
    orcid: https://orcid.org/0000-0002-0712-9603
  - family-names: Couch
    given-names: Simon P.
    orcid: https://orcid.org/0000-0001-5676-5107
title: "The forestecology R package for fitting and assessing neighborhood models of the effect of interspecific competition on the growth of trees"
version: 0.2.0
doi: 10.1002/ece3.8129
date-released: 2021-10-01

GitHub Events

Total

Release event: 1
Watch event: 2
Push event: 1
Create event: 1

Last Year

Release event: 1
Watch event: 2
Push event: 1
Create event: 1

Committers

Last synced: about 1 year ago

All Time

Total Commits: 480
Total Committers: 3
Avg Commits per committer: 160.0
Development Distribution Score (DDS): 0.244

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
rudeboybert	a**m@g**m	363
Allen	d**n@m**u	90
Simon P. Couch	s**h@g**m	27

Committer Domains (Top 20 + Academic)

middlebury.edu: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 34
Total pull requests: 58
Average time to close issues: 3 months
Average time to close pull requests: 12 days
Total issue authors: 4
Total pull request authors: 3
Average comments per issue: 1.21
Average comments per pull request: 0.4
Merged pull requests: 55
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

dallenmidd (16)
rudeboybert (15)
pvgansbe (1)
PietroCiccale (1)

Pull Request Authors

rudeboybert (31)
dallenmidd (17)
simonpcouch (10)

Top Labels

Issue Labels

enhancement (1)

Pull Request Labels

Dependencies

DESCRIPTION cran

R >= 3.2.4 depends
blockCV * imports
dplyr * imports
forcats * imports
ggplot2 * imports
ggridges * imports
glue * imports
magrittr * imports
mvnfast * imports
patchwork * imports
purrr * imports
rlang * imports
sf * imports
sfheaders * imports
snakecase * imports
stringr * imports
tibble * imports
tidyr * imports
yardstick * imports
covr * suggests
knitr * suggests
rmarkdown * suggests
testthat >= 3.0.0 suggests

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

forestecology

Science Score: 77.0%

Keywords

Repository

Basic Info

Statistics

Topics

Metadata Files

README.Rmd

Owner

Citation (CITATION.cff)

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Dependencies