HDTD

Characterization of intra-individual variability using physiologically relevant measurements provides important insights into fundamental biological questions ranging from cell type identity to tumor development. For each individual, the data measurements can be written as a matrix with the different subsamples of the individual recorded in the columns and the different phenotypic units recorded in the rows. Datasets of this type are called high-dimensional transposable data. The HDTD package provides functions for conducting statistical inference for the mean relationship between the row and column variables and for the covariance structure within and between the row and column variables.

https://github.com/anestistouloumis/hdtd

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: wiley.com
  • Committers with academic emails
    4 of 12 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.6%) to scientific vocabulary

Keywords

bioconductor-package high-dimensional statistics

Keywords from Contributors

tracking particles microscopy fluorescence core-package rhdf5 hdf5 bioconductor bioinformatics metabolomics
Last synced: 6 months ago · JSON representation

Repository

Characterization of intra-individual variability using physiologically relevant measurements provides important insights into fundamental biological questions ranging from cell type identity to tumor development. For each individual, the data measurements can be written as a matrix with the different subsamples of the individual recorded in the columns and the different phenotypic units recorded in the rows. Datasets of this type are called high-dimensional transposable data. The HDTD package provides functions for conducting statistical inference for the mean relationship between the row and column variables and for the covariance structure within and between the row and column variables.

Basic Info
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
bioconductor-package high-dimensional statistics
Created over 8 years ago · Last pushed over 4 years ago
Metadata Files
Readme

README.Rmd

---
output: github_document
references:
- id: Touloumis2016
  title: "HDTD: Analyzing Multi-tissue Gene Expression Data"
  author:
  - family: Touloumis
    given: Anestis
  - family: Marioni 
    given: John C.
  - family: Tavaré
    given: Simon  
  container-title: Bioinformatics
  volume: 32
  URL: 'https://doi.org/10.1093/bioinformatics/btw224'
  issue: 14
  page: 2193--2195
  type: article-journal
  issued:
    year: 2016
- id: Touloumis2015
  title: "Testing the Mean Matrix in High-Dimensional Transposable Data"
  author:
  - family: Touloumis
    given: Anestis
  - family: Tavaré
    given: Simon
  - family: Marioni 
    given: John C. 
  container-title: Biometrics
  volume: 71
  URL: 'http://onlinelibrary.wiley.com/doi/10.1111/biom.12054/full'
  issue: 1
  page: 157--166
  type: article-journal
  issued:
    year: 2015
- id: Touloumis2013
  title: "Hypothesis Testing for the Covariance Matrix in High-Dimensional Transposable Data with Kronecker Product Dependence Structure"
  author:
  - family: Touloumis
    given: Anestis
  - family: Marioni 
    given: John C.
  - family: Tavaré
    given: Simon  
  container-title: Statistica Sinica
  volume: 31
  URL: 'http://www3.stat.sinica.edu.tw/statistica/J31N3/J31N309/J31N309.html'
  issue: 3
  page: 1309--1329
  type: article-journal
  issued:
    year: 2021 
csl: Biometrics.csl  
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  tidy = TRUE,
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# HDTD: Analyzing High-Dimensional Transposable Data

[![Travis-CI Build Status](https://travis-ci.org/AnestisTouloumis/HDTD.svg?branch=master)](https://travis-ci.org/AnestisTouloumis/HDTD)
[![Project Status: Active The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) 


## Installation

You can install the release version of `HDTD`:

```{r eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("HDTD")
```

The source code for the release version of `HDTD` is available on Bioconductor at:

- http://bioconductor.org/packages/release/bioc/html/HDTD.html

Or you can install the development version of `HDTD`:

```{r eval=FALSE}
if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
devtools::install_github("AnestisTouloumis/HDTD")
```

To use `HDTD`, you should load the package as follows:

```{r}
library("HDTD")
```

## Usage

This package offers functions to estimate and test the matrix parameters of transposable data in high-dimensional settings. The term _transposable data_ refers to datasets that are structured in a matrix form such that both the rows and columns correspond to variables of interest and dependencies are expected to occur _among_ rows, _among_ columns and _between_ rows and columns. For example, consider microarray studies in genetics where multiple RNA samples across different tissues are available per subject. In this case, a data matrix can be created with row variables the genes, column variables the tissues and measurements the corresponding expression levels. We expect dependencies to occur among genes, among tissues and between genes and tissues. For more examples of transposable data see references in @Touloumis2013, @Touloumis2015 and @Touloumis2016.


There are four core functions:

- `meanmat.hat` to estimate the mean matrix of the transposable data,
- `meanmat.ts` to test the overall mean of the row (column) variables across groups of column (row) variables,
- `covmat.hat` to estimate the row and column covariance matrix,
- `covmat.ts` to test the sphericity, identity and diagonality hypothesis test for the row/column covariance matrix.


There are also three utility functions:

- `transposedata` for interchanging the role of rows and columns,
- `centerdata` for centering the transposable data around their mean matrix,
- `orderdata` for rearranging the order of the row and/or column variables.

## Example
We replicate the analysis that can be found in the vignette based on the mouse dataset 
```{r}
data(VEGFmouse)
```
This dataset contains expression levels for $40$ mice. For each mouse, the expression levels of $46$ genes (rows) that belong to the vascular endothelial growth factor signalling pathway were measured across $9$ tissues (adrenal gland, cerebrum, hippocampus, kidney, lung, muscle, spinal cord, spleen and thymus) that are displayed in the columns.

One can estimate the mean relationship of the gene expression levels across the $9$ tissues
```{r}
sample_mean <- meanmat.hat(datamat = VEGFmouse,N=40)
sample_mean
```
and test whether the overall gene expression is constant across the $9$ tissues:
```{r}
tissue_mean_test <- meanmat.ts(datamat = VEGFmouse,N=40,group.sizes=9)
tissue_mean_test
```
In this case, the overall gene expression is not conserved. 

To analyze the gene-wise and tissue-wise dependence structure, one needs to estimate the two covariance matrices:
```{r}
est_cov_mat <- covmat.hat(datamat=VEGFmouse,N=40)
est_cov_mat
```
Finally, the package allows users to perform hypothesis tests for the covariance matrix of the genes
```{r}
genes_cov_test <- covmat.ts(VEGFmouse,N=40)
genes_cov_test
```
and of the tissues:
```{r}
tissues_cov_test <- covmat.ts(VEGFmouse,N=40,voi="columns")
tissues_cov_test
```
At a $5\%$ significance level, it appears that the genes are correlated but we do not have enough evidence to reject the hypothesis that the tissues are uncorrelated. 

## Getting help
The statistical methods implemented in `HDTD` are described in @Touloumis2013, @Touloumis2015 and @Touloumis2016. Detailed examples of `HDTD` can be found in @Touloumis2016 or in the vignette: 

```{r, eval=FALSE}
browseVignettes("HDTD")
```


## How to cite

```{r, echo=FALSE, comment=""}
print(citation("HDTD"), bibtex = TRUE)
```

# References

Owner

  • Name: Anestis Touloumis
  • Login: AnestisTouloumis
  • Kind: user
  • Location: Brighton, UK
  • Company: University of Brighton

GitHub Events

Total
Last Year

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 103
  • Total Committers: 12
  • Avg Commits per committer: 8.583
  • Development Distribution Score (DDS): 0.515
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Anestis Touloumis A****s@b****k 50
Anestis Touloumis A****s@c****k 19
Nitesh Turaga n****a@g****m 10
Dan Tenenbaum d****a@f****g 8
Herve Pages h****s@f****g 4
Anestis Touloumis a****9@b****k 3
Hervé Pagès h****s@f****g 2
vobencha v****n@r****g 2
vobencha v****a@g****m 2
AnestisTouloumis a****s@b****k 1
LiNk-NY m****9@g****m 1
Marc Carlson m****n@f****g 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • bioconductor 18,188 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 1
bioconductor.org: HDTD

Statistical Inference about the Mean Matrix and the Covariance Matrices in High-Dimensional Transposable Data (HDTD)

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 18,188 Total
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 16.9%
Downloads: 50.7%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 4.1 depends
  • Rcpp >= 1.0.1 imports
  • stats * imports
  • knitr * suggests
  • rmarkdown * suggests