MapperAlgo

Mapper Algorithm

https://github.com/kennywang112/mapperalgo

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Mapper Algorithm

Basic Info
  • Host: GitHub
  • Owner: kennywang112
  • License: other
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 976 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 5
Created almost 2 years ago · Last pushed 10 months ago
Metadata Files
Readme License

README.md

Topological Data Analysis: Mapper Algorithm


Playground & Document

For a more detailed explanation of this package, see this document, which I keep updating to make the source code easier to understand. You can also try the playground I built to get familiar with the algorithm.
I've written some articles on Medium, which you can find here, to help you get familiar with topological data analysis. I'll be continuously updating my work, and I welcome any feedback!

This package is based on the TDAmapper package by Paul Pearson. You can view the original package here. Since the original package hasn't been updated in over seven years, this version focuses on optimization. By incorporating vectorized computation into the Mapper algorithm, this package aims to significantly improve its performance.

Get started quickly

Mapper steps, as visualized by Skaf et al.

Mapper is basically a three-step process:

1. Cover: This step splits the data into overlapping intervals and creates a cover for the data.

2. Cluster: This step clusters the data points in each interval the cover creates.

3. Simplicial Complex: This step combines the results of the two steps above. Each cluster becomes a node, and nodes whose clusters share data points are connected, which creates a simplicial complex.

You can learn more about the basics here: Chazal, F., & Michel, B. (2021). An introduction to topological data analysis: fundamental and practical aspects for data scientists. Frontiers in Artificial Intelligence, 4, 667963.
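To make the three steps concrete, here is a minimal sketch in base R. It is not the package's implementation: the toy data, the filter (the x coordinate), the cut height and all variable names are assumptions chosen for illustration.

```r
# Minimal Mapper sketch in base R (illustrative only, not the package's code).
set.seed(1)

# Toy data: 200 points on a noisy circle; the filter is the x coordinate.
theta <- runif(200, 0, 2 * pi)
pts <- cbind(x = cos(theta), y = sin(theta)) +
  matrix(rnorm(400, sd = 0.05), ncol = 2)
filter <- pts[, "x"]

# 1. Cover: split the filter range into overlapping intervals.
n_intervals <- 4
overlap <- 0.3  # fraction of each interval shared with its neighbour
rng <- range(filter)
len <- diff(rng) / (n_intervals - (n_intervals - 1) * overlap)
starts <- rng[1] + (seq_len(n_intervals) - 1) * len * (1 - overlap)
ends <- starts + len
ends[n_intervals] <- rng[2]  # guard against floating-point drift

# 2. Cluster: single-linkage hierarchical clustering inside each interval.
nodes <- list()
for (i in seq_len(n_intervals)) {
  idx <- which(filter >= starts[i] & filter <= ends[i])
  if (length(idx) == 0) next
  if (length(idx) == 1) {
    labels <- 1L
  } else {
    tree <- hclust(dist(pts[idx, , drop = FALSE]), method = "single")
    labels <- cutree(tree, h = 0.5)  # cut height is an arbitrary choice
  }
  for (k in unique(labels)) {
    nodes[[length(nodes) + 1]] <- idx[labels == k]  # one node per cluster
  }
}

# 3. Simplicial complex (1-skeleton): connect nodes that share points.
n_nodes <- length(nodes)
adjacency <- matrix(0L, n_nodes, n_nodes)
for (a in seq_len(n_nodes)) {
  for (b in seq_len(n_nodes)) {
    if (a != b && length(intersect(nodes[[a]], nodes[[b]])) > 0) {
      adjacency[a, b] <- 1L
    }
  }
}
```

Each element of `nodes` holds the row indices of one cluster, and `adjacency` records which clusters overlap. Only edges are built here; higher-dimensional simplices are left out to keep the sketch short.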

Besides the steps above, the package contains the following code:

  1. Mapper.R: Combines the three steps above.
  2. ConvertLevelset.R: Converts a flat index to a multi-index, or vice versa.
  3. EdgeVertices.R: Finds the nodes for plotting; it is not part of the Mapper algorithm itself.
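As an illustration of what converting between a flat index and a multi-index means, here is a small sketch. The function names `flat_to_multi` and `multi_to_flat` are hypothetical and are not the package's API; the sketch also assumes 1-based indices with the first dimension varying fastest, as in R arrays, which may differ from the convention used in ConvertLevelset.R.

```r
# Hypothetical helpers (names are illustrative, not the package's API).
# Indices are 1-based and the first dimension varies fastest, as in R arrays.

flat_to_multi <- function(flat, dims) {
  multi <- integer(length(dims))
  rem <- flat - 1L
  for (d in seq_along(dims)) {
    multi[d] <- rem %% dims[d] + 1L  # position along dimension d
    rem <- rem %/% dims[d]           # carry the rest to the next dimension
  }
  multi
}

multi_to_flat <- function(multi, dims) {
  # Stride of dimension d is the product of all earlier dimension sizes.
  strides <- cumprod(c(1L, dims[-length(dims)]))
  sum((multi - 1L) * strides) + 1L
}

dims <- c(4L, 3L)  # e.g. 4 intervals on one filter axis, 3 on another
multi <- flat_to_multi(7L, dims)
flat <- multi_to_flat(multi, dims)
```

With two filter dimensions, each cell of the cover is addressed either by one number (the flat index) or by one interval number per dimension (the multi-index); the two helpers round-trip between those forms.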

Goals and Updates

Main Goals

  1. Computational Optimization: The current version speeds up computations by 100 times compared to the original code, and can run faster still by using the num_cores parameter.

  2. Expanded Clustering Methods: Clustering is a crucial component of the Mapper algorithm. In addition to hierarchical clustering, other methods (K-means, DBSCAN, PAM) have been added to this project.
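The idea of swapping the clustering method can be sketched as a small dispatcher. This is only an illustration under stated assumptions: `cluster_points` and its `params` argument are hypothetical names, not the package's internals, and only hierarchical clustering and K-means are shown so the sketch runs with base R alone (DBSCAN and PAM need additional packages).

```r
# Hypothetical dispatcher (illustrative only, not the package's code).
cluster_points <- function(points,
                           method = c("hierarchical", "kmeans"),
                           params = list()) {
  method <- match.arg(method)
  switch(method,
    hierarchical = {
      k <- if (is.null(params$k)) 2L else params$k
      tree <- hclust(dist(points), method = "single")
      cutree(tree, k = k)
    },
    kmeans = {
      centers <- if (is.null(params$centers)) 2L else params$centers
      # nstart > 1 makes the result less sensitive to the random start
      unname(kmeans(points, centers = centers, nstart = 5)$cluster)
    }
  )
}

# Two well-separated blobs, 20 points each.
set.seed(42)
blob_a <- matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 2)
blob_b <- matrix(rnorm(40, mean = 5, sd = 0.1), ncol = 2)
points <- rbind(blob_a, blob_b)

labels_h <- cluster_points(points, "hierarchical", list(k = 2))
labels_k <- cluster_points(points, "kmeans", list(centers = 2))
table(labels_h, labels_k)
```

Both methods return one integer label per row, so the rest of a Mapper pipeline can stay the same whichever method is chosen.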

Example

```r
Mapper <- MapperAlgo(
  filter_values = circle_data[,2:3],
  intervals = 4,
  percent_overlap = 30,
  methods = "dbscan",
  method_params = list(eps = 0.3, minPts = 5),
  cover_type = 'extension',
  num_cores = 12
)
MapperPlotter(Mapper, circle_data$circle, circle_data, type = "forceNetwork")
```
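The example assumes that `circle_data` already exists. One possible way to build a data set of that shape is sketched below; the layout (a label column named `circle` followed by two coordinate columns) is inferred from how the example indexes the data, and the helper `make_ring`, the two radii and the noise level are assumptions made for illustration.

```r
# Assumed layout: column 1 = label, columns 2:3 = coordinates (x, y).
set.seed(123)
n <- 300

# Hypothetical helper: n points on a circle of the given radius, plus noise.
make_ring <- function(n, radius, noise_sd) {
  theta <- runif(n, 0, 2 * pi)
  data.frame(
    x = radius * cos(theta) + rnorm(n, sd = noise_sd),
    y = radius * sin(theta) + rnorm(n, sd = noise_sd)
  )
}

inner <- make_ring(n, radius = 1, noise_sd = 0.05)
outer <- make_ring(n, radius = 2, noise_sd = 0.05)

circle_data <- rbind(
  data.frame(circle = "inner", inner),
  data.frame(circle = "outer", outer)
)
head(circle_data)
```

With this layout, `circle_data[,2:3]` selects the coordinates used as filter values and `circle_data$circle` gives the label used in the plot.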

Circle
Figure 1
CircleMapper
Figure 2

Computation Performance

Figures 3 and 4 illustrate the impact of parallel computing introduced in Version 1.0.2 using the MNIST dataset.
Figure 3 visualizes the time taken for different sample sizes when reducing the input to two dimensions using PCA, demonstrating how parallel computing accelerates computation. Figure 4 keeps the sample size fixed while incrementally increasing the number of dimensions in each iteration. It clearly shows that the number of features used in filter functions significantly affects computing time.
You can find the code in Performance.R.
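A timing experiment of this kind can be sketched as follows. This is not the content of Performance.R: random numbers stand in for MNIST, single-linkage clustering stands in for the Mapper call, and `time_for_n`, `n_features` and the sample sizes are assumptions chosen so the sketch runs quickly with base R.

```r
# Illustrative timing harness (not the code from Performance.R).
set.seed(7)
n_features <- 50

time_for_n <- function(n) {
  x <- matrix(rnorm(n * n_features), nrow = n)  # random stand-in for image data
  reduced <- prcomp(x)$x[, 1:2]                 # first two principal components
  # Stand-in workload: cluster the reduced points and time only that step.
  system.time(hclust(dist(reduced), method = "single"))[["elapsed"]]
}

sample_sizes <- c(200, 400, 800)
timings <- data.frame(
  n = sample_sizes,
  seconds = vapply(sample_sizes, time_for_n, numeric(1))
)
print(timings)
```

Replacing the stand-in workload with a real Mapper call, and repeating the run for different `num_cores` values, would give the kind of comparison the figures describe.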

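Figure 3: Computation time for different sample sizes (input reduced to two dimensions using PCA)
Figure 4: Computation time for an increasing number of dimensions (fixed sample size)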

Owner

  • Name: Wang chi-chien
  • Login: kennywang112
  • Kind: user

GitHub Events

Total
  • Release event: 2
  • Delete event: 1
  • Push event: 16
  • Create event: 2
Last Year
  • Release event: 2
  • Delete event: 1
  • Push event: 16
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads:
    • cran 238 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 1
cran.r-project.org: MapperAlgo

Topological Data Analysis: Mapper Algorithm

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 238 Last month
Rankings
Dependent packages count: 28.1%
Dependent repos count: 34.7%
Average: 49.8%
Downloads: 86.6%
Maintainers (1)
Last synced: 10 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.1.2 depends
  • fastcluster * suggests
  • igraph * suggests
  • testthat >= 3.0.0 suggests