rr2

rr2: An R package to calculate $R^2$s for regression models - Published in JOSS (2018)

https://github.com/arives/rr2

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence
Last synced: 4 months ago · JSON representation

Repository

R2 for correlated data; see https://academic.oup.com/sysbio/advance-article/doi/10.1093/sysbio/syy060/5098616

Basic Info
  • Host: GitHub
  • Owner: arives
  • License: gpl-3.0
  • Language: R
  • Default Branch: master
  • Homepage:
  • Size: 1.69 MB
Statistics
  • Stars: 18
  • Watchers: 3
  • Forks: 7
  • Open Issues: 2
  • Releases: 1
Created over 8 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog License

README.Rmd

---
output:
  md_document:
    variant: markdown_github
---



```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

[![DOI](https://joss.theoj.org/papers/10.21105/joss.01028/status.svg/)](https://doi.org/10.21105/joss.01028) [![CRAN status](https://www.r-pkg.org/badges/version/rr2)](https://cran.r-project.org/package=rr2/) [![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/rr2)](https://www.r-pkg.org/pkg/rr2) [![CRAN RStudio mirror
downloads](https://cranlogs.r-pkg.org/badges/grand-total/rr2?color=green)](https://www.r-pkg.org/pkg/rr2)

# Goal

This package provides three R^2^s for statistical models with correlated errors including classes: 'lmerMod' (LMM), 'glmerMod' (GLMM), 'phylolm' (Phylogenetic GLS), and 'binaryPGLMM/phyloglm/communityPGLMM' (Phylogenetic Logistic Regression). Detailed technical descriptions can be found in [Ives 2018](https://doi.org/10.1093/sysbio/syy060).

# Installation

This package can be installed with:

```{r, eval=FALSE}
install.packages("rr2")

# or install the latest version
# install.packages("devtools")
devtools::install_github("arives/rr2")
```

# Package structure

This package has three main functions: `R2.resid()`, `R2.lik()`, and `R2.pred()`. You can use them individually in the form of, e.g., `R2.resid(mod, mod.r)` where `mod` is the full model and `mod.r` is the reduced model for partial R2s. If you do not include the reduced model `mod.r`, then the appropriate model with just the intercept is used to give the total R^2^. When using `R2.resid` and `R2.pred` with PGLS, you need to include the phylo object containing a phylogenetic tree, e.g., `R2.resid(mod, mod.r, phy = phy)`. 

You can calculate all three R^2^s at the same time with `R2(mod, mod.r)`. You can also specify which R^2^(s) to calculate within this function by turning off unwanted methods, e.g., `R2(mod, mod.r, resid = FALSE)` or `R2(mod, mod.r, pred = FALSE)`. 

This package also has some helper functions such as `inv.logit()`, `partialR2()`, and `partialR2adj()`.

| Models                           | Available.R2s              |
| :------------------------------- | :------------------------- |
| LM                               | partialR2, partialR2adj    |
| LM                               | R2.pred, R2.resid, R2.lik  |
| GLM                              | R2.pred, R2.resid, R2.lik  |
| LMM:  lmerMod                    | R2.pred, R2.resid, R2.lik  |
| GLMM: glmerMod                   | R2.pred, R2.resid, R2.lik  |
| PGLS: phylolm                    | R2.pred, R2.resid, R2.lik  |
| PGLMM: binaryPGLMM               | R2.pred, R2.resid, ------- |
| PGLMM: phyloglm                  | -------, --------, R2.lik  |
| PGLMM: communityPGLMM (gaussian) | R2.pred, --------, R2.lik  |
| PGLMM: communityPGLMM (binomial) | R2.pred, --------, ------- |

# Usage: calculating R^2^s for regression models

First, let's simulate data that will be used to fit various models.

```{r data-sim}
# data 
set.seed(123)
p1 <- 10; nsample <- 10; n <- p1 * nsample
d <- data.frame(x1 = rnorm(n = n), 
                x2 = rnorm(n = n), 
                u1 = rep(1:p1, each = nsample), 
                u2 = rep(1:p1, times = nsample))
d$u1 <- as.factor(d$u1); d$u2 <- as.factor(d$u2)

# LMM: y with random intercept
b1 <- 1; b2 <- -1; sd1 <- 1.5
d$y_re_intercept <- b1 * d$x1 + b2 * d$x2 + 
  rep(rnorm(n = p1, sd = sd1), each = nsample) +  # random intercept u1
  rep(rnorm(n = p1, sd = sd1), times = nsample) + # random intercept u2
  rnorm(n = n)

# LMM: y with random slope
b1 <- 0; sd1 <- 1; sd.x1 <- 2
d$y_re_slope <- b1 * d$x1 + 
  rep(rnorm(n = p1, sd = sd1), each = nsample) + # random intercept u1
  d$x1 * rep(rnorm(n = p1, sd = sd.x1), times = nsample) + # random slope u1
  rnorm(n = n)

# GLMM
b1 <- 1; sd1 <- 1.5
prob <- rr2::inv.logit(b1 * d$x1 + rep(rnorm(n = p1, sd = sd1), each = nsample)) 
# random intercept u1
d$y_binary <- rbinom(n = n, size = 1, prob = prob)

# PGLS
b1 <- 1.5; signal <- 0.7
phy <- ape::compute.brlen(ape::rtree(n = n), method = "Grafen", power = 1)
phy.x <- ape::compute.brlen(phy, method = "Grafen", power = .0001)
x_trait <- ape::rTraitCont(phy.x, model = "BM", sigma = 1)
e <- signal^0.5 * ape::rTraitCont(phy, model = "BM", sigma = 1) + (1-signal)^0.5 * rnorm(n=n)
d$x_trait <- x_trait[match(names(e), names(x_trait))]
d$y_pgls <- b1 * x_trait + e
rownames(d) <- phy$tip.label	

# Phylogenetic Logistic Regression
b1 <- 1.5; signal <- 2
e <- signal * ape::rTraitCont(phy, model = "BM", sigma = 1)
e <- e[match(phy$tip.label, names(e))]
d$y_phy_binary <- rbinom(n = n, size = 1, prob = rr2::inv.logit(b1 * d$x1 + e))

head(d)
```

Then, let's fit some models and calculate their R^2^s.

## LM

```{r lm}
library(rr2)
z.f.lm <- lm(y_re_intercept ~ x1 + x2, data = d)
z.x.lm <- lm(y_re_intercept ~ x1, data = d)
z.0.lm <- lm(y_re_intercept ~ 1, data = d)

R2(mod = z.f.lm, mod.r = z.x.lm)
partialR2(mod = z.f.lm, mod.r = z.x.lm)
partialR2adj(mod = z.f.lm, mod.r = z.x.lm)
```


## LMM

```{r lmm}
z.f.lmm <- lme4::lmer(y_re_intercept ~ x1 + x2 + (1 | u1) + (1 | u2), data = d, REML = F)
z.x.lmm <- lme4::lmer(y_re_intercept ~ x1 + (1 | u1) + (1 | u2), data = d, REML = F)
z.v.lmm <- lme4::lmer(y_re_intercept ~ 1 + (1 | u2), data = d, REML = F)
z.0.lmm <- lm(y_re_intercept ~ 1, data = d)

R2(mod = z.f.lmm, mod.r = z.x.lmm)
R2(mod = z.f.lmm, mod.r = z.v.lmm)
R2(mod = z.f.lmm, mod.r = z.0.lmm)
R2(mod = z.f.lmm) # if omit mod.r, default will be the simplest model, such as z.0.lmm here.
```

## GLMM

```{r glmm}
z.f.glmm <- lme4::glmer(y_binary ~ x1 + (1 | u1), data = d, family = "binomial")
z.x.glmm <- lme4::glmer(y_binary ~ 1 + (1 | u1), data = d, family = "binomial")
z.v.glmm <- glm(y_binary ~ x1, data = d, family = "binomial")

R2(mod = z.f.glmm, mod.r = z.x.glmm)
R2(mod = z.f.glmm, mod.r = z.v.glmm)
R2(mod = z.f.glmm)
# specify sigma2_d for R2.resid()
R2.resid(mod = z.f.glmm, mod.r = z.v.glmm, sigma2_d = "s2w")
R2.resid(mod = z.f.glmm, mod.r = z.v.glmm, sigma2_d = "NS") 
R2.resid(mod = z.f.glmm, mod.r = z.v.glmm, sigma2_d = "rNS")
```

## PGLS 

```{r pgls}
z.f.pgls <- phylolm::phylolm(y_pgls ~ x_trait, phy = phy, data = d, model = "lambda")
z.v.lm <- lm(y_pgls ~ x_trait, data = d)

# phy is needed for phylogenetic models' R2.resid and R2.pred
R2(mod = z.f.pgls, mod.r = z.v.lm, phy = phy)
R2(mod = z.f.pgls, phy = phy)

# This also works for models fit with nlme::gls()
z.f.gls <- nlme::gls(y_pgls ~ x_trait, data = d, correlation = ape::corPagel(1, phy), method = "ML")
z.x.gls <- nlme::gls(y_pgls ~ 1, data = d, correlation = ape::corPagel(1, phy), method = "ML")
R2(mod = z.f.gls, mod.r = z.v.lm)
R2(mod = z.f.gls)
```

## Phylogenetic Logistic Regression

*Note*: we modified `ape::binaryPGLMM` to return necessary components for `rr2::R2()`.

```{r plog}
z.f.plog <- rr2::binaryPGLMM(y_phy_binary ~ x1, data = d, phy = phy)
z.x.plog <- rr2::binaryPGLMM(y_phy_binary ~ 1, data = d, phy = phy)
z.v.plog <- glm(y_phy_binary ~ x1, data = d, family = "binomial")

# R2.lik can't be used with binaryPGLMM because it is not a ML method
R2(mod = z.f.plog, mod.r = z.x.plog)
R2(mod = z.f.plog)

z.f.plog2 <- phylolm::phyloglm(y_phy_binary ~ x1, data = d, start.alpha = 1, phy = phy)
z.x.plog2 <- phylolm::phyloglm(y_phy_binary ~ 1, data = d, phy = phy, 
                               start.alpha = min(20, z.f.plog2$alpha))
z.v.plog2 <- glm(y_phy_binary ~ x1, data = d, family = "binomial")

# R2.resid and R2.pred do not apply for phyloglm
R2(z.f.plog2, z.x.plog2) 
# alternate
R2.lik(z.f.plog2, z.x.plog2)
```

# Contributions of predictors

We can use `rr2::R2()` to calculate partial R^2^s and compare contributions of different predictors. Here is an example using `phylolm::phyloglm()`. The same comparisons can be also applied to other types of models.

```{r comparePredictors}
z.f <- phylolm::phyloglm(y_phy_binary ~ x1 + x2, data = d, start.alpha = 1, phy = phy)
z.r1 <- phylolm::phyloglm(y_phy_binary ~ x1, data = d, start.alpha = 1, phy = phy)
z.r2 <- phylolm::phyloglm(y_phy_binary ~ x2, data = d, start.alpha = 1, phy = phy)
# total R2
R2(z.f)
# contribution of x1
R2(z.f, z.r2)
# contribution of x2
R2(z.f, z.r1)
```

It is also possible to estimate the "contribution" of correlation structrues in the model. For the above example, we can replace the phylogeny with a star phylogeny and then compare the R^2^s of the two models.

```{r}
# see the first chunk R code for the build of phy.x, a star phylogeny
z.r3 <- phylolm::phyloglm(y_phy_binary ~ x1 + x2, data = d, start.alpha = 1, phy = phy.x)
R2(z.f, z.r3)
```


# Citation

Please cite the following papers if you find this package useful:

> - [Anthony R. Ives. 2018. R2s for Correlated Data: Phylogenetic Models, LMMs, and GLMMs. Systematic Biology, Volume 68, Issue 2, March 2019, Pages 234-251.](https://doi.org/10.1093/sysbio/syy060)  
> - [Anthony R. Ives and Daijiang Li (2018). rr2: An R package to calculate R^2s for regression models. The
  Journal of Open Source Software, 3(30), 1028.](https://doi.org/10.21105/joss.01028)

# Contributing

Contributions are welcome. You can either provide comments and feedback by filing an issue on Github [here](https://github.com/arives/rr2/issues) or making pull requests. It may be easier if you first open 
an issue outlining what you will do in the pull request.

Questions about the package can also be posted as issues on Github.

Owner

  • Name: Anthony R. Ives
  • Login: arives
  • Kind: user
  • Company: UW-Madison

JOSS Publication

rr2: An R package to calculate $R^2$s for regression models
Published
October 31, 2018
Volume 3, Issue 30, Page 1028
Authors
Anthony R. Ives ORCID
Department of Integrative Biology, UW-Madison, Madison, WI 53706
Daijiang Li ORCID
Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, FL 32611
Editor
Karthik Ram ORCID
Tags
$R^2$ GLMM phylogenetic regression models with correlated data

Papers & Mentions

Total mentions: 4

Rubisco Adaptation Is More Limited by Phylogenetic Constraint Than by Catalytic Trade-off
Last synced: 3 months ago
<tt>partR2</tt>: partitioning R<sup>2</sup> in generalized linear mixed models
Last synced: 3 months ago
Evolution of winter molting strategies in European and North American migratory passerines
Last synced: 3 months ago
Adaptations to light contribute to the ecological niches and evolution of the terrestrial avifauna
Last synced: 3 months ago

GitHub Events

Total
Last Year

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 233
  • Total Committers: 3
  • Avg Commits per committer: 77.667
  • Development Distribution Score (DDS): 0.515
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
arives a****s 113
Daijiang Li d****1@u****u 85
daijiang d****e@g****m 35
Committer Domains (Top 20 + Academic)
ufl.edu: 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 16
  • Total pull requests: 2
  • Average time to close issues: 8 months
  • Average time to close pull requests: 12 months
  • Total issue authors: 8
  • Total pull request authors: 2
  • Average comments per issue: 3.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sastoudt (7)
  • arives (2)
  • daijiang (2)
  • Flaiba (1)
  • AEgit (1)
  • atanzap (1)
  • jydu (1)
  • matthieu-haudiquet (1)
Pull Request Authors
  • daijiang (1)
  • jydu (1)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 8,841 last-month
  • Total docker downloads: 363
  • Total dependent packages: 2
  • Total dependent repositories: 5
  • Total versions: 5
  • Total maintainers: 1
cran.r-project.org: rr2

R2s for Regression Models

  • Versions: 5
  • Dependent Packages: 2
  • Dependent Repositories: 5
  • Downloads: 8,841 Last month
  • Docker Downloads: 363
Rankings
Downloads: 7.2%
Forks count: 9.6%
Average: 12.3%
Dependent repos count: 13.0%
Stargazers count: 13.3%
Dependent packages count: 13.7%
Docker downloads count: 16.9%
Maintainers (1)
Last synced: 4 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.0 depends
  • stats * depends
  • Matrix * imports
  • ape * imports
  • lme4 * imports
  • nlme * imports
  • phylolm >= 2.6.2 imports
  • phyr >= 1.0.3 imports
  • utils * imports
  • mvtnorm * suggests
  • testthat * suggests