Methods and Algorithms for Correlation Analysis in R - Published in JOSS (2020)

https://github.com/easystats/correlation

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: frontiersin.org, joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bayesian bayesian-correlations biserial cor correlation correlation-analysis correlations easystats gamma gaussian-graphical-models hacktoberfest matrix multilevel-correlations outliers partial partial-correlations r regression robust spearman

Keywords from Contributors

standardization predict bayes-factors bayesfactor bayesian-framework r2 mixed-models loo aic credible-interval

Scientific Fields

Engineering Computer Science - 60% confidence
Last synced: 4 months ago

Repository

Methods for Correlation Analysis

Basic Info
Statistics
  • Stars: 442
  • Watchers: 15
  • Forks: 57
  • Open Issues: 64
  • Releases: 16
Created almost 7 years ago · Last pushed 5 months ago
Metadata Files
Readme Changelog Contributing Funding License Code of conduct Support

README.Rmd

---
output: github_document
---

# correlation 

```{r README-1, warning=FALSE, message=FALSE, echo=FALSE}
library(ggplot2)
library(poorman)
library(correlation)

options(digits = 2)

knitr::opts_chunk$set(
  collapse = TRUE,
  dpi = 300,
  message = FALSE,
  warning = FALSE,
  fig.path = "man/figures/"
)
```

[![DOI](https://joss.theoj.org/papers/10.21105/joss.02306/status.svg)](https://doi.org/10.21105/joss.02306)  [![downloads](https://cranlogs.r-pkg.org/badges/correlation)](https://cran.r-project.org/package=correlation)
[![total](https://cranlogs.r-pkg.org/badges/grand-total/correlation)](https://cranlogs.r-pkg.org/)

`correlation` is an [**easystats**](https://github.com/easystats/easystats) package focused on correlation analysis. It's lightweight, easy to use, and allows for the computation of many different kinds of correlations, such as **partial** correlations, **Bayesian** correlations, **multilevel** correlations, **polychoric** correlations, **biweight**, **percentage bend** or **Shepherd's Pi** correlations (types of robust correlation), **distance** correlation (a type of non-linear correlation) and more, also allowing for combinations between them (for instance, *Bayesian partial multilevel correlation*).

# Citation

You can cite the package as follows:

Makowski, D., Ben-Shachar, M. S., Patil, I., \& Lüdecke, D. (2020). Methods and algorithms for correlation analysis in R. _Journal of Open Source Software_,
*5*(51), 2306. https://doi.org/10.21105/joss.02306

Makowski, D., Wiernik, B. M., Patil, I., Lüdecke, D., \& Ben-Shachar, M. S. (2022). *correlation*: Methods for correlation analysis [R package]. https://CRAN.R-project.org/package=correlation (Original work published 2020)

# Installation

[![CRAN](https://www.r-pkg.org/badges/version/correlation)](https://cran.r-project.org/package=correlation) [![correlation status badge](https://easystats.r-universe.dev/badges/correlation)](https://easystats.r-universe.dev) [![codecov](https://codecov.io/gh/easystats/correlation/branch/main/graph/badge.svg)](https://app.codecov.io/gh/easystats/correlation)

The *correlation* package is available on CRAN, while its latest development version is available on R-universe (from _rOpenSci_).

Type | Source | Command
---|---|---
Release | CRAN | `install.packages("correlation")`
Development | R-universe | `install.packages("correlation", repos = "https://easystats.r-universe.dev")`

Once you have installed the package, you can load it using:

```{r, eval=FALSE}
library("correlation")
```

> **Tip**
>
> Instead of `library(correlation)`, use `library(easystats)`.
> This will make all features of the easystats ecosystem available.
>
> To stay updated, use `easystats::install_latest()`.

# Documentation

[![Documentation](https://img.shields.io/badge/documentation-correlation-orange.svg?colorB=E91E63)](https://easystats.github.io/correlation/)
[![Blog](https://img.shields.io/badge/blog-easystats-orange.svg?colorB=FF9800)](https://easystats.github.io/blog/posts/)
[![Features](https://img.shields.io/badge/features-correlation-orange.svg?colorB=2196F3)](https://easystats.github.io/correlation/reference/index.html)


Check out the package [website](https://easystats.github.io/correlation/) for documentation.

# Features

The *correlation* package can compute many different types of correlation,
including:

✅ **Pearson's correlation**
✅ **Spearman's rank correlation**
✅ **Kendall's rank correlation**
✅ **Biweight midcorrelation**
✅ **Distance correlation**
✅ **Percentage bend correlation**
✅ **Shepherd's Pi correlation**
✅ **Blomqvist’s coefficient**
✅ **Hoeffding’s D**
✅ **Gamma correlation**
✅ **Gaussian rank correlation**
✅ **Point-Biserial and biserial correlation**
✅ **Winsorized correlation**
✅ **Polychoric correlation**
✅ **Tetrachoric correlation**
✅ **Multilevel correlation**
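
As a rough illustration of how a few of the coefficients listed above relate to one another, the sketch below uses only base R's `stats::cor()` rather than the *correlation* package itself. It shows that Spearman's coefficient is Pearson's coefficient computed on ranks. The object names (`x`, `y`, `spearman_by_rank`) are chosen for illustration only.

```r
# Base-R sketch (not the package's implementation): Pearson vs. rank-based coefficients
x <- iris$Sepal.Length
y <- iris$Petal.Length

pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")
kendall  <- cor(x, y, method = "kendall")

# Spearman's rho is Pearson's r applied to the (average) ranks of the data
spearman_by_rank <- cor(rank(x), rank(y))

# Compare the two routes to Spearman's rho
all.equal(spearman, spearman_by_rank)
```

The rank-based coefficients depend only on the ordering of the observations, which is why they are less sensitive to outliers and to monotonic but non-linear relationships.
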

An overview and description of these correlation types is [**available here**](https://easystats.github.io/correlation/articles/types.html). Moreover, many of these correlation types are available as **partial** or within a **Bayesian** framework.

# Examples

The main function is [`correlation()`](https://easystats.github.io/correlation/reference/correlation.html), which builds on top of [`cor_test()`](https://easystats.github.io/correlation/reference/cor_test.html) and comes with a number of possible options.

## Correlation details and matrix

```{r README-4}
results <- correlation(iris)
results
```

The output is not a square matrix, but a **(tidy) dataframe with one correlation test per row**. One can also obtain a **matrix** using:

```{r README-5}
summary(results)
```

Note that one can also obtain the full, **square** and redundant matrix using:

```{r README-6}
summary(results, redundant = TRUE)
```

```{r README-7}
library(see)

results %>%
  summary(redundant = TRUE) %>%
  plot()
```

## Correlation tests

The `cor_test()` function, for pairwise correlations, is also very convenient for making quick scatter plots.

```{r README-corr}
plot(cor_test(iris, "Sepal.Width", "Sepal.Length"))
```

## Grouped dataframes

The `correlation()` function also supports **stratified correlations**, all within the *tidyverse* workflow!

```{r README-8}
iris %>%
  select(Species, Sepal.Length, Sepal.Width, Petal.Width) %>%
  group_by(Species) %>%
  correlation()
```

## Bayesian Correlations

It is very easy to switch to a **Bayesian framework**.

```{r README-9}
correlation(iris, bayesian = TRUE)
```

## Tetrachoric, Polychoric, Biserial, Biweight...

The `correlation` package also supports different types of methods, which can deal with correlations **between factors**!

```{r README-10}
correlation(iris, include_factors = TRUE, method = "auto")
```

## Partial Correlations

It also supports **partial correlations** (as well as Bayesian partial correlations).

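
To make concrete what a partial correlation measures, here is a minimal base-R sketch. It is not the package's own implementation. It assumes the adjustment set is simply the remaining numeric columns of `iris`, and the object names (`num`, `res_x`, `res_y`, `prec`) are illustrative. Two routes give the same number: correlating regression residuals, and rescaling the inverse of the correlation matrix.

```r
# Numeric columns only; Species (a factor) is left out of this sketch
num <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]

# Route 1: regress both variables on the remaining ones, then correlate the residuals
res_x <- resid(lm(Sepal.Length ~ Petal.Length + Petal.Width, data = num))
res_y <- resid(lm(Sepal.Width ~ Petal.Length + Petal.Width, data = num))
partial_resid <- cor(res_x, res_y)

# Route 2: rescale the off-diagonal entry of the inverse correlation matrix
# (rows/columns 1 and 2 are Sepal.Length and Sepal.Width)
prec <- solve(cor(num))
partial_prec <- -prec[1, 2] / sqrt(prec[1, 1] * prec[2, 2])

# Both routes should agree up to numerical tolerance
all.equal(partial_resid, partial_prec)
```

With the package, this kind of adjustment is requested through the `partial` argument:
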
```{r README-11}
iris %>%
  correlation(partial = TRUE) %>%
  summary()
```

## Gaussian Graphical Models (GGMs)

Such partial correlations can also be represented as **Gaussian Graphical Models** (GGM), an increasingly popular tool in psychology. A GGM traditionally includes a set of variables depicted as circles ("nodes"), and a set of lines that visualize relationships between them, whose thickness represents the strength of association (see [Bhushan et al., 2019](https://www.frontiersin.org/articles/10.3389/fpsyg.2019.01050/full)).

```{r README-12}
library(see) # for plotting
library(ggraph) # needs to be loaded

plot(correlation(mtcars, partial = TRUE)) +
  scale_edge_color_continuous(low = "#000004FF", high = "#FCFDBFFF")
```

## Multilevel Correlations

It also provides some cutting-edge methods, such as Multilevel (partial) correlations. These are partial correlations based on linear mixed-effects models that include the factors as **random effects**. They can be seen as correlations *adjusted* for some group (*hierarchical*) variability.

```{r README-13}
iris %>%
  correlation(partial = TRUE, multilevel = TRUE) %>%
  summary()
```

However, if the `partial` argument is set to `FALSE`, the function will try to **convert** the partial coefficients **back** into full (regular) correlations:

```{r README-14}
iris %>%
  correlation(partial = FALSE, multilevel = TRUE) %>%
  summary()
```

# Contributing and Support

In case you want to file an issue or contribute in another way to the package, please follow [this guide](https://easystats.github.io/correlation/CONTRIBUTING.html). For questions about the functionality, you may either contact us via email or file an issue.

# Code of Conduct

Please note that this project is released with a [Contributor Code of Conduct](https://easystats.github.io/correlation/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms.

Owner

  • Name: easystats
  • Login: easystats
  • Kind: organization
  • Location: worldwide

Make R stats easy!

JOSS Publication

Methods and Algorithms for Correlation Analysis in R
Published
July 16, 2020
Volume 5, Issue 51, Page 2306
Authors
Dominique Makowski
Nanyang Technological University, Singapore
Mattan S. Ben-Shachar
Ben-Gurion University of the Negev, Israel
Indrajeet Patil
Max Planck Institute for Human Development, Germany
Daniel Lüdecke
University Medical Center Hamburg-Eppendorf, Germany
Editor
Mikkel Meyer Andersen
Tags
Correlation Easystats

Papers & Mentions

Total mentions: 5

miRTissue ce: extending miRTissue web service with the analysis of ceRNA-ceRNA interactions
Last synced: 2 months ago
Support Values for Genome Phylogenies
Last synced: 2 months ago
Bacillus amyloliquefaciens, Bacillus velezensis, and Bacillus siamensis Form an “Operational Group B. amyloliquefaciens” within the B. subtilis Species Complex
Last synced: 2 months ago
Molecular characterization of mitochondrial Amerindian haplogroups and the amelogenin gene in human ancient DNA from three archaeological sites in Lambayeque - Peru
Last synced: 2 months ago
Dynamics of chromatin accessibility and gene regulation by MADS-domain transcription factors in flower development
Last synced: 2 months ago

GitHub Events

Total
  • Create event: 15
  • Issues event: 10
  • Release event: 2
  • Watch event: 11
  • Delete event: 14
  • Issue comment event: 50
  • Push event: 93
  • Pull request review comment event: 9
  • Pull request review event: 16
  • Pull request event: 31
  • Fork event: 1
Last Year
  • Create event: 15
  • Issues event: 10
  • Release event: 2
  • Watch event: 11
  • Delete event: 14
  • Issue comment event: 50
  • Push event: 93
  • Pull request review comment event: 9
  • Pull request review event: 16
  • Pull request event: 31
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 690
  • Total Committers: 11
  • Avg Commits per committer: 62.727
  • Development Distribution Score (DDS): 0.667
Past Year
  • Commits: 66
  • Committers: 8
  • Avg Commits per committer: 8.25
  • Development Distribution Score (DDS): 0.379
Top Committers
Name Email Commits
DominiqueMakowski d****9@g****m 230
Indrajeet Patil p****e@g****m 206
Daniel m****l@d****e 184
Brenton M. Wiernik b****k 38
mattansb 3****b 11
Etienne Bacher 5****r 8
github-actions[bot] 4****] 7
olivroy 5****y 2
Rémi Thériault 1****c 2
houyun h****g@1****m 1
Jan Marvin Garbuszus j****s@r****e 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 90
  • Total pull requests: 108
  • Average time to close issues: 5 months
  • Average time to close pull requests: 16 days
  • Total issue authors: 39
  • Total pull request authors: 12
  • Average comments per issue: 2.48
  • Average comments per pull request: 1.23
  • Merged pull requests: 69
  • Bot issues: 0
  • Bot pull requests: 44
Past Year
  • Issues: 8
  • Pull requests: 41
  • Average time to close issues: 2 months
  • Average time to close pull requests: 15 days
  • Issue authors: 7
  • Pull request authors: 6
  • Average comments per issue: 0.75
  • Average comments per pull request: 1.12
  • Merged pull requests: 22
  • Bot issues: 0
  • Bot pull requests: 21
Top Authors
Issue Authors
  • IndrajeetPatil (25)
  • mattansb (11)
  • bwiernik (8)
  • DominiqueMakowski (4)
  • strengejacke (3)
  • andreifoldes (2)
  • shirdekel (2)
  • jromanowska (2)
  • lammertvos (2)
  • TarandeepKang (2)
  • friendly (1)
  • jschoeneberger (1)
  • emstruong (1)
  • rebekkasl (1)
  • araikes (1)
Pull Request Authors
  • github-actions[bot] (44)
  • strengejacke (17)
  • IndrajeetPatil (13)
  • bwiernik (9)
  • etiennebacher (8)
  • DominiqueMakowski (6)
  • olivroy (4)
  • mattansb (3)
  • JanMarvin (2)
  • vincentarelbundock (1)
  • jmgirard (1)
  • rempsyc (1)
Top Labels
Issue Labels
enhancement :boom: (19) bug :bug: (13) docs :books: (10) high priority :running_man: (5) feature idea :fire: (4) consistency :green_apple: :apple: (3) low priority :sleeping: (3) what's your opinion :revolving_hearts: (3) 3 investigators :grey_question::question: (2) question :question: (2) invalid :x: (1) wontfix :no_entry_sign: (1) duplicate :two_men_holding_hands: (1)
Pull Request Labels
auto-update (44) hacktoberfest-accepted (3) feature idea :fire: (1)

Packages

  • Total packages: 3
  • Total downloads:
    • cran 21,191 last-month
  • Total docker downloads: 3,052
  • Total dependent packages: 14
    (may contain duplicates)
  • Total dependent repositories: 25
    (may contain duplicates)
  • Total versions: 45
  • Total maintainers: 1
cran.r-project.org: correlation

Methods for Correlation Analysis

  • Versions: 19
  • Dependent Packages: 12
  • Dependent Repositories: 25
  • Downloads: 21,191 Last month
  • Docker Downloads: 3,052
Rankings
Stargazers count: 1.0%
Forks count: 1.4%
Downloads: 3.3%
Average: 4.0%
Dependent repos count: 5.5%
Dependent packages count: 5.6%
Docker downloads count: 7.4%
Maintainers (1)
Last synced: 4 months ago
proxy.golang.org: github.com/easystats/correlation
  • Versions: 13
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 4 months ago
conda-forge.org: r-correlation
  • Versions: 13
  • Dependent Packages: 2
  • Dependent Repositories: 0
Rankings
Stargazers count: 18.5%
Dependent packages count: 19.5%
Forks count: 23.1%
Average: 23.8%
Dependent repos count: 34.0%
Last synced: 4 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.6 depends
  • bayestestR >= 0.13.0 imports
  • datasets * imports
  • datawizard >= 0.6.5 imports
  • insight >= 0.19.0 imports
  • parameters >= 0.20.2 imports
  • stats * imports
  • BayesFactor * suggests
  • Hmisc * suggests
  • WRS2 * suggests
  • energy * suggests
  • ggplot2 * suggests
  • ggraph * suggests
  • gt * suggests
  • knitr * suggests
  • lme4 * suggests
  • mbend * suggests
  • polycor * suggests
  • poorman * suggests
  • ppcor * suggests
  • psych * suggests
  • rmarkdown * suggests
  • rmcorr * suggests
  • see * suggests
  • testthat >= 3.1.0 suggests
  • tidygraph * suggests
  • wdm * suggests