TDAstats

TDAstats: R pipeline for computing persistent homology in topological data analysis - Published in JOSS (2018)

https://github.com/rrrlw/tdastats

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 13 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: arxiv.org, medrxiv.org, sciencedirect.com, springer.com, wiley.com, mdpi.com, joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

cran data-science ggplot2 homology homology-calculations homology-computation joss persistent-homology pipeline r r-package r-packages ripser tda topological-data-analysis topology topology-visualization visualization

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence
Engineering Computer Science - 40% confidence
Last synced: 6 months ago

Repository

R pipeline for computing persistent homology in topological data analysis. See https://doi.org/10.21105/joss.00860 for more details.

Basic Info
Statistics
  • Stars: 41
  • Watchers: 6
  • Forks: 9
  • Open Issues: 4
  • Releases: 4
Created almost 8 years ago · Last pushed about 4 years ago
Metadata Files
Readme Changelog Contributing License Code of conduct

README.md

TDAstats: topological data analysis in R

License: GPL v3

Overview

TDAstats is an R pipeline for computing persistent homology in topological data analysis.

Installation

To install TDAstats, run the following R code:

```r
# install from CRAN
install.packages("TDAstats")

# install development version from GitHub
devtools::install_github("rrrlw/TDAstats")

# install development version with vignettes/tutorials
devtools::install_github("rrrlw/TDAstats", build_vignettes = TRUE)
```
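After installing, it can help to confirm that the package is visible to R and to see which vignettes were built. The following is a minimal sketch using only base R utilities; the variable name `tdastats_installed` is introduced here for illustration.

```r
# Check whether TDAstats is installed without attaching it.
tdastats_installed <- requireNamespace("TDAstats", quietly = TRUE)

if (tdastats_installed) {
  # Report the installed version.
  message("TDAstats version: ",
          as.character(utils::packageVersion("TDAstats")))

  # List any vignettes that were built and installed with the package.
  vigs <- utils::vignette(package = "TDAstats")$results
  print(vigs[, c("Item", "Title"), drop = FALSE])
} else {
  message("TDAstats is not installed; run install.packages(\"TDAstats\") first.")
}
```

If the vignette listing is empty for a GitHub install, reinstalling with `build_vignettes = TRUE` is the likely remedy.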

Sample code

The following sample code loads two synthetic datasets bundled with the package, then calculates and visualizes their persistent homology to showcase the use of TDAstats.

```r
# load TDAstats
library("TDAstats")

# load sample datasets
data("unif2d")
data("circle2d")

# calculate persistent homology for both datasets
unif.phom <- calculate_homology(unif2d, dim = 1)
circ.phom <- calculate_homology(circle2d, dim = 1)

# visualize first dataset as persistence diagram
plot_persist(unif.phom)

# visualize second dataset as topological barcode
plot_barcode(circ.phom)
```

A more detailed tutorial can be found in the package vignettes or at this Gist.
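As a complement to the sample code above, the sketch below looks inside the object returned by `calculate_homology` and then customizes a plot. It assumes the result is a three-column matrix with columns named `dimension`, `birth`, and `death`, as the package documentation describes; the names `persistence` and `loops` are introduced here for illustration.

```r
library("TDAstats")
library("ggplot2")

data("circle2d")

# Each row is one topological feature (assumed columns: dimension, birth, death).
circ.phom <- calculate_homology(circle2d, dim = 1)
head(circ.phom)

# Persistence (death minus birth) measures how long each feature survives.
persistence <- circ.phom[, "death"] - circ.phom[, "birth"]

# Keep only the 1-dimensional features (loops) and show the most persistent one.
loops <- circ.phom[circ.phom[, "dimension"] == 1, , drop = FALSE]
loops[which.max(loops[, "death"] - loops[, "birth"]), ]

# The plotting functions return ggplot objects, so ggplot2 layers can be added.
plot_barcode(circ.phom) +
  ggtitle("Topological barcode: circle2d") +
  xlab("Vietoris-Rips diameter") +
  theme_minimal()
```

For points sampled from a circle, one loop with noticeably larger persistence than the rest would be the expected signature; short-lived features are usually treated as noise.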

Functionality

TDAstats has 3 primary goals:

  1. Calculation of persistent homology: the C++ Ripser project is a lightweight library for calculating persistent homology that outpaces all of its competitors. Given the importance of computational efficiency, TDAstats naturally uses Ripser behind the scenes for homology calculations, using the Rcpp package to integrate the C++ code into an R pipeline (Ripser for R).

  2. Statistical inference of persistent homology: persistent homology can be used in hypothesis testing to compare the topological structure of two point clouds. TDAstats uses a permutation test in conjunction with the Wasserstein metric for nonparametric statistical inference.

  3. Visualization of persistent homology: persistent homology is visualized using two types of plots - persistence diagrams and topological barcodes. TDAstats provides implementations of both plot types using the ggplot2 framework. Having ggplot2 underlying the plots confers many advantages to the user, including generation of publication-quality plots and customization using the ggplot object returned by TDAstats.
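The statistical inference goal listed above can be sketched with the package's `permutation_test` function. This is an illustration under stated assumptions rather than a recommended analysis: the iteration count is kept small for speed, and the result is assumed to be a list with one element per homology dimension, each carrying `wasserstein`, `permvals`, and `pvalue` entries as shown in the package vignette.

```r
library("TDAstats")

data("unif2d")
data("circle2d")

# Fix the random seed so the permutation shuffles are reproducible.
set.seed(1)

# Compare the two point clouds. More iterations give a finer-grained p-value
# at the cost of run time, since each iteration recomputes persistent homology.
perm <- permutation_test(unif2d, circle2d, iterations = 25)

# Assumption: element i of the result corresponds to homology dimension i - 1.
for (i in seq_along(perm)) {
  cat("dimension", i - 1,
      "| Wasserstein:", perm[[i]]$wasserstein,
      "| p-value:", perm[[i]]$pvalue, "\n")
}

# The null distribution for dimension 0 can be inspected directly.
hist(perm[[1]]$permvals,
     main = "Permuted Wasserstein values (dimension 0)",
     xlab = "Wasserstein distance")
abline(v = perm[[1]]$wasserstein, lty = 2)
```

A small p-value in a given dimension suggests the two point clouds differ in that dimension's topological structure; with only 25 iterations, the p-value is coarse, so a real analysis would use many more.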

Contribute

To contribute to TDAstats, you can create issues for any bugs/suggestions on the issues page. You can also fork the TDAstats repository and create pull requests to add features you think will be useful for users.

Citation

If you use TDAstats, please consider citing the following (based on use):

  • General use of TDAstats: Wadhwa RR, Williamson DFK, Dhawan A, Scott JG. TDAstats: R pipeline for computing persistent homology in topological data analysis. Journal of Open Source Software. 2018; 3(28): 860. doi: 10.21105/joss.00860
  • TDAstats to calculate persistent homology (Ripser): Bauer U. Ripser: Efficient computation of Vietoris-Rips persistence barcodes. 2019; arXiv: 1908.02518.
  • TDAstats to perform statistical test: Robinson A, Turner K. Hypothesis testing for topological data analysis. J Appl Comput Topol. 2017; 1: 241.

Real-world applications, use cases, and mentions

  • Stenseke J. Persistent homology and the shape of evolutionary games. Journal of Theoretical Biology. 2021; 531: 110903. Link to paper.
  • Torres-Espin A, Haefeli J, Ehsanian R, et al. Topological network analysis of patient similarity for precision management of acute blood pressure in spinal cord injury. eLife. 2021; 10: e68015. Link to paper.
  • Somasundaram E, Litzler A, Wadhwa R, Owen S, Scott J. Persistent homology of tumor CT scans is associated with survival in lung cancer. Medical Physics. 2021; 48(11): 7043-7051. Link to paper and preprint.
  • Richardson M, Verma R, Singhania A, Tabone O, Das M, Rodrigue M, Leissner P, Woltmann G, Cooper A, O'Garra A, Haldar P. Blood transcriptional phenotypes of progressive latent M. tuberculosis infection inform novel signatures that improve prediction of tuberculosis risk. Cell Reports Medicine. 2021. Link to paper.
  • Perez-Moraga R, Fores-Martos J, Suay-Garcia B, Duval J-L, Falco A, Climent J. A COVID-19 Drug Repurposing Strategy through Quantitative Homological Similarities Using a Topological Data Analysis-Based Framework. Pharmaceutics. 2021; 13(4): 488. Link to paper.
  • Kandanaarachchi S, Hyndman RJ. Leave-one-out kernel density estimates for outlier detection. Monash University. 2021. Link to paper.
  • Somasundaram EV, Brown SE, Litzler A, Scott JG, Wadhwa RR. Benchmarking R packages for calculation of persistent homology. R Journal. 2021; 13(1): 184-193. Link to paper.
  • Brochard A, Blaszczyszyn B, Mallat S, Zhang S. Particle gradient descent model for point process generation. 2020. arXiv:2010.14928. Link to preprint.
  • Nguyen DQN, Xing L, Lin L. Community detection, pattern recognition, and hypergraph-based learning: approaches using metric geometry and persistent homology. 2020. arXiv:2010.00435. Link to preprint.
  • Pinto GVF. Motivic constructions on graphs and networks with stability results. Doctoral Thesis: Universidade Estadual Paulista Rio Claro & Ohio State University. 2020. Link to thesis.
  • Gommel M. A Machine Learning Exploration of Topological Data Analysis Applied to Low and High Dimensional fMRI Data. Doctoral Thesis: University of Iowa. 2019. doi: 10.17077/etd.005247. Link to thesis.
  • Mémoli F, Singhal K. A Primer on Persistent Homology of Finite Metric Spaces. Bulletin of Mathematical Biology. 2019; 81(7): 2074. Links to paper and preprint.
  • Srinivasan R, Chander A. Understanding Bias in Datasets using Topological Data Analysis. Fujitsu Laboratories of America. 2019. Link.
  • Kough D, Neuzil M, Simpson C, Glover R. Analyzing State of the Union Addresses using Topology. University of St. Thomas. 2019. Link.
  • Rickert J. A Mathematician's Perspective on Topological Data Analysis and R. 2018. Link.
  • Blog post on Data Management
  • Analyzing finance data
  • R package for visualizing persistent homology

Owner

  • Name: Raoul
  • Login: rrrlw
  • Kind: user
  • Location: Cleveland, Ohio

Medicine, mathematics, programming.

JOSS Publication

TDAstats: R pipeline for computing persistent homology in topological data analysis
Published
August 08, 2018
Volume 3, Issue 28, Page 860
Authors
Raoul R. Wadhwa ORCID
Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
Drew F.K. Williamson ORCID
Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
Andrew Dhawan ORCID
Neurological Institute, Cleveland Clinic Foundation, Cleveland, OH 44195, USA
Jacob G. Scott ORCID
Department of Translational Hematology and Oncology Research, Cleveland Clinic Foundation, Cleveland, OH 44195, USA
Editor
Thomas J. Leeper ORCID
Tags
topological data analysis persistent homology Vietoris-Rips complex statistical resampling

Papers & Mentions

Total mentions: 2

A COVID-19 Drug Repurposing Strategy through Quantitative Homological Similarities Using a Topological Data Analysis-Based Framework
Last synced: 4 months ago
TDAstats: R pipeline for computing persistent homology in topological data analysis
Last synced: 4 months ago

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 195
  • Total Committers: 4
  • Avg Commits per committer: 48.75
  • Development Distribution Score (DDS): 0.056
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Raoul Wadhwa r****a@g****m 184
peekxc m****k@g****m 8
Shota Ochi s****0@g****m 2
Thomas J. Leeper t****r@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 19
  • Total pull requests: 7
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 8
  • Total pull request authors: 5
  • Average comments per issue: 2.32
  • Average comments per pull request: 2.14
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rrrlw (7)
  • corybrunson (6)
  • ShotaOchi (1)
  • peekxc (1)
  • lindasheila (1)
  • mlnjsh (1)
  • SPRADA1 (1)
  • sarahsamorodnitsky (1)
Pull Request Authors
  • peekxc (2)
  • rrrlw (2)
  • leeper (1)
  • ShotaOchi (1)
  • shaelebrown (1)
Top Labels
Issue Labels
enhancement (9) bug (4) documentation (2)
Pull Request Labels

Packages

  • Total packages: 3
  • Total downloads:
    • cran 433 last-month
  • Total dependent packages: 4
    (may contain duplicates)
  • Total dependent repositories: 4
    (may contain duplicates)
  • Total versions: 14
  • Total maintainers: 1
proxy.golang.org: github.com/rrrlw/tdastats
  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.5%
Dependent repos count: 5.7%
Last synced: 6 months ago
proxy.golang.org: github.com/rrrlw/TDAstats
  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.5%
Dependent repos count: 5.7%
Last synced: 6 months ago
cran.r-project.org: TDAstats

Pipeline for Topological Data Analysis

  • Versions: 4
  • Dependent Packages: 4
  • Dependent Repositories: 4
  • Downloads: 433 Last month
Rankings
Forks count: 7.9%
Stargazers count: 8.8%
Dependent packages count: 9.3%
Average: 11.1%
Dependent repos count: 14.6%
Downloads: 15.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.3 depends
  • Rcpp >= 0.12.15 imports
  • ggplot2 >= 2.2.1 imports
  • covr * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 2.0.0 suggests