matrixCorr

Scalable computation of correlation matrices using optimized C++ routines

https://github.com/prof-thiagooliveira/matrixcorr

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.9%) to scientific vocabulary

Keywords

correlation correlation-analysis correlation-coefficient cpp
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Prof-ThiagoOliveira
  • License: other
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 626 KB
Statistics
  • Stars: 1
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Topics
correlation correlation-analysis correlation-coefficient cpp
Created 6 months ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Codemeta

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# matrixCorr


[![R-CMD-check.yaml](https://github.com/Prof-ThiagoOliveira/matrixCorr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/Prof-ThiagoOliveira/matrixCorr/actions/workflows/R-CMD-check.yaml)
[![test-coverage.yaml](https://github.com/Prof-ThiagoOliveira/matrixCorr/actions/workflows/test-coverage.yaml/badge.svg)](https://github.com/Prof-ThiagoOliveira/matrixCorr/actions/workflows/test-coverage.yaml)


`matrixCorr` computes correlation and related association matrices from small to
high-dimensional data using simple, consistent functions and sensible defaults.
It includes shrinkage and robust options for noisy or **p ≥ n** settings, plus
convenient print/plot methods. Performance-critical paths are implemented in
C++ with BLAS/OpenMP and memory-aware symmetric updates. The API accepts base
matrices and data frames and returns standard R objects via a consistent S3
interface.

Supported measures include Pearson, Spearman, Kendall, distance correlation,
partial correlation, and robust biweight mid-correlation; agreement tools cover
Bland–Altman (two-method and repeated-measures) and Lin’s concordance
correlation coefficient (including repeated-measures LMM/REML extensions).

## Features

- High-performance C++ backend using `Rcpp`
- General correlations such as `pearson_corr()`, `spearman_rho()`, `kendall_tau()`
- Robust correlation metrics (`biweight_mid_corr()`)
- Distance correlation (`distance_corr()`)
- Partial correlation (`partial_correlation()`)
- Shrinkage for $p \gg n$ (`schafer_corr()`)
- Agreement metrics
    * Bland–Altman (two-method `bland_altman()` and repeated-measures `bland_altman_repeated()`)
    * Lin’s concordance correlation coefficient (pairwise `ccc()`, repeated-measures LMM/REML `ccc_lmm_reml()`, and non-parametric `ccc_pairwise_u_stat()`)

## Installation

```r
# Install from GitHub
# install.packages("devtools")
devtools::install_github("Prof-ThiagoOliveira/matrixCorr")
```

## Example

### Correlation matrices (Pearson, Spearman, Kendall)

```r
library(matrixCorr)

set.seed(1)
X <- as.data.frame(matrix(rnorm(300 * 6), ncol = 6))
names(X) <- paste0("V", 1:6)

R_pear <- pearson_corr(X)
R_spr  <- spearman_rho(X)
R_ken  <- kendall_tau(X)

print(R_pear, digits = 2)
plot(R_spr)   # heatmap
```
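
As a rough cross-check of what these matrices contain, the sketch below uses base R only. It does not call `matrixCorr`, the helper name `pearson_crossprod` is hypothetical, and it assumes complete numeric data with no constant columns. It shows Pearson as a single cross-product of standardized columns and Spearman as Pearson applied to column-wise ranks; the package's C++ routines may be organized differently.

```r
# Base-R sketch: Pearson via one cross-product, Spearman via ranks.
# pearson_crossprod is a hypothetical helper, not part of matrixCorr.
pearson_crossprod <- function(X) {
  X <- as.matrix(X)
  Z <- scale(X)                    # center and scale each column (sd uses n - 1)
  crossprod(Z) / (nrow(X) - 1)     # t(Z) %*% Z / (n - 1)
}

set.seed(1)
X0 <- matrix(rnorm(300 * 6), ncol = 6,
             dimnames = list(NULL, paste0("V", 1:6)))

R_p <- pearson_crossprod(X0)
R_s <- pearson_crossprod(apply(X0, 2, rank))  # Spearman = Pearson on ranks

# Both should agree with stats::cor() up to floating-point error
print(all.equal(R_p, cor(X0), check.attributes = FALSE))
print(all.equal(R_s, cor(X0, method = "spearman"), check.attributes = FALSE))
```

Writing the computation as one cross-product is what lets a BLAS library do most of the work, which is the general idea behind the performance claims above.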

### Robust correlation (biweight mid-correlation)

```r
set.seed(2)
Y <- X
# inject outliers into eight randomly chosen rows of V1
idx <- sample.int(nrow(Y), 8)
Y$V1[idx] <- Y$V1[idx] + 8

R_bicor <- biweight_mid_corr(Y)
print(R_bicor, digits = 2)
```
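
For readers who want to see what a biweight mid-correlation does, here is a minimal base-R sketch of the textbook definition for two vectors. The function name `bicor_sketch` and the tuning constant of 9 are assumptions for illustration; `biweight_mid_corr()` may use different defaults, NA handling, or fallbacks when the MAD is zero.

```r
# Textbook biweight mid-correlation for two numeric vectors.
# Assumes no missing values and a non-zero MAD in both inputs.
bicor_sketch <- function(x, y, tune = 9) {
  bw <- function(v) {
    d <- v - median(v)
    u <- d / (tune * mad(v, constant = 1))  # raw MAD, no normal-consistency factor
    w <- (1 - u^2)^2 * (abs(u) < 1)         # biweight weights, zero beyond the cutoff
    a <- d * w
    a / sqrt(sum(a^2))                      # normalize to unit length
  }
  sum(bw(x) * bw(y))
}

set.seed(2)
x <- rnorm(300)
y <- 0.6 * x + rnorm(300, sd = 0.8)
y_out <- y
y_out[1:8] <- y_out[1:8] + 8                # contaminate eight observations

print(c(pearson_clean        = cor(x, y),
        pearson_contaminated = cor(x, y_out),
        bicor_contaminated   = bicor_sketch(x, y_out)))
```

Observations far from the median receive weight zero, so a handful of gross outliers has limited influence on the estimate.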

### High-dimensional shrinkage correlation ($p \gg n$)

```r
set.seed(3)
n <- 60; p <- 200
Xd <- matrix(rnorm(n * p), n, p)
colnames(Xd) <- paste0("G", seq_len(p))

R_shr <- schafer_corr(Xd)
print(R_shr, digits = 2, max_rows = 6, max_cols = 6)
```
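
The idea behind shrinkage can be sketched in base R. The code below follows one common form of the Schäfer and Strimmer (2005) estimator, which shrinks the off-diagonal sample correlations toward zero with an analytically estimated intensity `lambda`. The helper name `shrink_cor_sketch` is hypothetical, and it is an assumption that `schafer_corr()` uses this exact target and variance estimate.

```r
# Sketch of correlation shrinkage toward the identity matrix.
# Assumes complete numeric data with no constant columns.
shrink_cor_sketch <- function(X) {
  n <- nrow(X)
  Z <- scale(X)                               # standardized columns
  wbar <- crossprod(Z) / n                    # mean of z_ki * z_kj over k
  R <- wbar * n / (n - 1)                     # sample correlation matrix
  # Estimated variance of each r_ij: n / (n - 1)^3 * sum_k (w_kij - wbar_ij)^2
  var_r <- (crossprod(Z^2) - n * wbar^2) * n / (n - 1)^3
  off <- row(R) != col(R)
  lambda <- sum(var_r[off]) / sum(R[off]^2)
  lambda <- max(0, min(1, lambda))            # clamp to [0, 1]
  R_shr <- (1 - lambda) * R
  diag(R_shr) <- 1
  list(lambda = lambda, R = R_shr)
}

set.seed(3)
Xs <- matrix(rnorm(60 * 200), 60, 200)        # n = 60, p = 200
fit <- shrink_cor_sketch(Xs)
ev <- eigen(fit$R, symmetric = TRUE, only.values = TRUE)$values
print(fit$lambda)
print(range(ev))
```

With more variables than observations the raw sample correlation matrix is singular; any `lambda > 0` lifts the smallest eigenvalue above zero, so the shrunken matrix can be inverted.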

### Partial correlation matrix

```r
R_part <- partial_correlation(X)
print(R_part, digits = 2)
```
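
A partial correlation matrix can be read off the inverse covariance (precision) matrix. The base-R sketch below illustrates that identity and cross-checks one entry against the residual-regression definition. The helper name `partial_from_precision` is hypothetical, the sketch requires more observations than variables, and `partial_correlation()` may add regularization or other safeguards not shown here.

```r
# Partial correlations from the precision matrix:
#   rho_ij|rest = -Theta_ij / sqrt(Theta_ii * Theta_jj)
partial_from_precision <- function(X) {
  Theta <- solve(cov(X))                  # needs n > p and a full-rank covariance
  P <- -Theta / sqrt(outer(diag(Theta), diag(Theta)))
  diag(P) <- 1
  P
}

set.seed(7)
Xp <- matrix(rnorm(300 * 4), ncol = 4)
Xp[, 1] <- Xp[, 1] + 0.5 * Xp[, 3]        # columns 1 and 2 both depend on column 3
Xp[, 2] <- Xp[, 2] + 0.7 * Xp[, 3]
P <- partial_from_precision(Xp)

# Cross-check one entry: correlate residuals after regressing out columns 3 and 4
r1 <- resid(lm(Xp[, 1] ~ Xp[, 3:4]))
r2 <- resid(lm(Xp[, 2] ~ Xp[, 3:4]))
print(c(from_precision = P[1, 2], from_residuals = cor(r1, r2)))
```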

### Distance correlation matrix

```r
R_dcor <- distance_corr(X)
print(R_dcor, digits = 2)
```
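
Distance correlation can detect non-linear dependence that Pearson misses. The base-R sketch below implements the classical sample version from double-centered distance matrices for two vectors. The name `dcor_sketch` is hypothetical, and it is an assumption that this biased (V-statistic) form matches what `distance_corr()` computes; the package may use an unbiased variant.

```r
# Sample distance correlation for two numeric vectors (V-statistic form).
dcor_sketch <- function(x, y) {
  dcenter <- function(v) {
    D <- as.matrix(dist(v))               # pairwise absolute differences
    D - outer(rowMeans(D), colMeans(D), "+") + mean(D)  # double centering
  }
  A <- dcenter(x)
  B <- dcenter(y)
  dcov2 <- mean(A * B)                    # squared distance covariance
  denom <- sqrt(mean(A * A) * mean(B * B))
  if (denom > 0) sqrt(max(dcov2, 0) / denom) else 0
}

set.seed(8)
x <- rnorm(200)
y_quad <- x^2                             # dependent on x, but not linearly

print(c(pearson  = cor(x, y_quad),
        distance = dcor_sketch(x, y_quad)))
```

This version stores two n-by-n distance matrices, so it is only practical for moderate sample sizes.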

## Agreement analyses

### Two-method Bland–Altman

```r
set.seed(4)
x <- rnorm(120, 100, 10)
y <- x + 0.5 + rnorm(120, 0, 8)

ba <- bland_altman(x, y)
print(ba)
plot(ba)
```
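
The quantities behind a two-method Bland–Altman analysis are simple to compute by hand. The base-R sketch below derives the bias and approximate 95% limits of agreement from the paired differences. The sign convention (first method minus second) and the normal quantile multiplier are assumptions; `bland_altman()` may use a different convention or add confidence intervals for the limits.

```r
# Bias and 95% limits of agreement from paired differences.
set.seed(4)
m1 <- rnorm(120, 100, 10)                 # method 1
m2 <- m1 + 0.5 + rnorm(120, 0, 8)         # method 2: small offset plus noise

d    <- m1 - m2                           # assumed convention: method 1 minus method 2
bias <- mean(d)
loa  <- bias + c(-1, 1) * qnorm(0.975) * sd(d)

print(c(bias = bias, lower = loa[1], upper = loa[2]))
```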

### Repeated-measures Bland–Altman (pairwise matrix)

```r
set.seed(5)
S <- 20; Tm <- 6
subj  <- rep(seq_len(S), each = Tm)
time  <- rep(seq_len(Tm), times = S)

true  <- rnorm(S, 50, 6)[subj] + (time - mean(time)) * 0.4
mA    <- true + rnorm(length(true), 0, 2)
mB    <- true + 1.0 + rnorm(length(true), 0, 2.2)
mC    <- 0.95 * true + rnorm(length(true), 0, 2.5)

dat <- rbind(
  data.frame(y = mA, subject = subj, method = "A", time = time),
  data.frame(y = mB, subject = subj, method = "B", time = time),
  data.frame(y = mC, subject = subj, method = "C", time = time)
)
dat$method <- factor(dat$method, levels = c("A","B","C"))

ba_rep <- bland_altman_repeated(
  data = dat, response = "y", subject = "subject",
  method = "method", time = "time",
  include_slope = FALSE, use_ar1 = FALSE
)
summary(ba_rep)
# plot(ba_rep)  # faceted BA scatter by pair
```

### Two-method Lin's concordance correlation

```r
# Lin's CCC for x vs y, reusing x and y from the two-method
# Bland–Altman example (with CI + heatmap)
cc2 <- ccc(cbind(x = x, y = y), ci = TRUE)
print(cc2)
summary(cc2)
plot(cc2, title = "Lin's CCC (two methods)")
```
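
For reference, Lin's (1989) coefficient for two vectors is $\rho_c = 2 s_{xy} / (s_x^2 + s_y^2 + (\bar{x} - \bar{y})^2)$. A minimal base-R sketch follows. The name `ccc_sketch` is hypothetical, and the use of 1/n moments is an assumption that may differ from what `ccc()` does internally.

```r
# Lin's concordance correlation coefficient for two numeric vectors,
# using 1/n moment estimates.
ccc_sketch <- function(x, y) {
  sxy <- mean((x - mean(x)) * (y - mean(y)))
  sx2 <- mean((x - mean(x))^2)
  sy2 <- mean((y - mean(y))^2)
  2 * sxy / (sx2 + sy2 + (mean(x) - mean(y))^2)
}

set.seed(4)
a <- rnorm(120, 100, 10)
b <- a + 0.5 + rnorm(120, 0, 8)

print(c(pearson = cor(a, b), ccc = ccc_sketch(a, b)))
```

Unlike Pearson's correlation, the coefficient is penalized for location and scale differences between the two methods, so its absolute value never exceeds that of the Pearson correlation.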

### Lin’s concordance correlation coefficient (repeated measures: U-statistic and LMM/REML)

```r
set.seed(6)
S <- 30; Tm <- 8
id     <- factor(rep(seq_len(S), each = 2 * Tm))
method <- factor(rep(rep(c("A","B"), each = Tm), times = S))
time   <- rep(rep(seq_len(Tm), times = 2), times = S)

u  <- rnorm(S, 0, 0.8)[as.integer(id)]
g  <- rnorm(S * Tm, 0, 0.5)
g  <- g[ (as.integer(id) - 1L) * Tm + as.integer(time) ]
y  <- (method == "B") * 0.3 + u + g + rnorm(length(id), 0, 0.7)

dat_ccc <- data.frame(y, id, method, time)

# Using the non-parametric (U-statistic) approach
ccc_rep_u <- ccc_pairwise_u_stat(
  data = dat_ccc, response = "y", method = "method", time = "time",
  ci = TRUE
)
print(ccc_rep_u)
summary(ccc_rep_u)
plot(ccc_rep_u, title = "Repeated-measures CCC (U-statistic)")

# Using the LMM/REML approach
fit_ccc <- ccc_lmm_reml(dat_ccc, response = "y", rind = "id",
                        method = "method", time = "time", ci = TRUE)
summary(fit_ccc)  # overall CCC, variance components, SEs/CI
```

## Contributing

Issues and pull requests are welcome. Please see `CONTRIBUTING.md` for
guidelines and `cran-comments.md`/`DESCRIPTION` for package metadata.

## License

MIT © [Thiago de Paula Oliveira](https://orcid.org/0000-0002-4555-2584)

See inst/LICENSE for the full MIT license text.

Owner

  • Name: Thiago de Paula Oliveira
  • Login: Prof-ThiagoOliveira
  • Kind: user
  • Location: Edinburgh, Scotland
  • Company: AbacusBio

Dr Thiago de Paula Oliveira is a Research Biostatistician at AbacusBio.

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "matrixCorr",
  "description": "Compute correlation and other association matrices from small to high-dimensional datasets with relatively simple functions and sensible defaults. Includes options for shrinkage and robustness to improve results in noisy or high-dimensional settings (p >= n), plus convenient print/plot methods for inspection. Implemented with optimised C++ backends using BLAS/OpenMP and memory-aware symmetric updates. Works with base matrices and data frames, returning standard R objects via a consistent S3 interface. Useful across genomics, agriculture, and machine-learning workflows. Supports Pearson, Spearman, Kendall, distance correlation, partial correlation, and robust biweight mid-correlation; Bland–Altman analyses and Lin's concordance correlation coefficient (including repeated-measures extensions). Methods based on Ledoit and Wolf (2004) <doi:10.1016/S0047-259X(03)00096-4>; Schäfer and Strimmer (2005) <doi:10.2202/1544-6115.1175>; Lin (1989) <doi:10.2307/2532051>.",
  "name": "matrixCorr: Collection of Correlation and Association Estimators",
  "codeRepository": "https://github.com/Prof-ThiagoOliveira/matrixCorr",
  "issueTracker": "https://github.com/Prof-ThiagoOliveira/matrixCorr/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.4.3",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.4.3 (2025-02-28 ucrt)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Comprehensive R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Thiago de Paula",
      "familyName": "Oliveira",
      "email": "thiago.paula.oliveira@gmail.com",
      "@id": "https://orcid.org/0000-0002-4555-2584"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Thiago de Paula",
      "familyName": "Oliveira",
      "email": "thiago.paula.oliveira@gmail.com",
      "@id": "https://orcid.org/0000-0002-4555-2584"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "MASS",
      "name": "MASS",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=MASS"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "viridisLite",
      "name": "viridisLite",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=viridisLite"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "Rcpp",
      "name": "Rcpp",
      "version": ">= 1.1.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Rcpp"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "ggplot2",
      "name": "ggplot2",
      "version": ">= 3.5.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggplot2"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "Matrix",
      "name": "Matrix",
      "version": ">= 1.7.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Matrix"
    },
    "SystemRequirements": null
  },
  "fileSize": "55696.68KB",
  "readme": "https://github.com/Prof-ThiagoOliveira/matrixCorr/blob/main/README.md",
  "contIntegration": [
    "https://github.com/Prof-ThiagoOliveira/matrixCorr/actions/workflows/R-CMD-check.yaml",
    "https://app.codecov.io/gh/Prof-ThiagoOliveira/kendall_tau_rank_corr"
  ],
  "keywords": [
    "correlation",
    "correlation-analysis",
    "correlation-coefficient",
    "cpp"
  ]
}

GitHub Events

Total
  • Push event: 55
Last Year
  • Push event: 55

Packages

  • Total packages: 1
  • Total downloads:
    • cran 58 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: matrixCorr

Collection of Correlation and Association Estimators

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 58 Last month
Rankings
Dependent packages count: 25.7%
Dependent repos count: 31.5%
Average: 47.5%
Downloads: 85.4%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v4 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
  • codecov/codecov-action v5 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • Matrix >= 1.7.2 imports
  • Rcpp >= 1.1.0 imports
  • ggplot2 >= 3.5.2 imports
  • MASS * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat * suggests
  • viridisLite * suggests