sigminer

🌲 An easy-to-use and scalable toolkit for genomic alteration signature (a.k.a. mutational signature) analysis and visualization in R https://shixiangwang.github.io/sigminer/reference/index.html

https://github.com/shixiangwang/sigminer

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    1 of 7 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.0%) to scientific vocabulary

Keywords

bayesian-nmf bioinformatics cancer-research cnv copynumber-signatures cosmic-signatures dbs easy-to-use indel mutational-signatures nmf nmf-extraction r sbs signature-extraction somatic-mutations somatic-variants visualization
Last synced: 6 months ago · JSON representation

Repository

Basic Info
Statistics
  • Stars: 154
  • Watchers: 2
  • Forks: 19
  • Open Issues: 12
  • Releases: 35
Created about 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  warning = FALSE
)
```

# Sigminer: Mutational Signature Analysis and Visualization in R

[![CRAN status](https://www.r-pkg.org/badges/version/sigminer)](https://cran.r-project.org/package=sigminer) 
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html) 
[![R-CMD-check](https://github.com/ShixiangWang/sigminer/workflows/R-CMD-check/badge.svg)](https://github.com/ShixiangWang/sigminer/actions)
[![](https://cranlogs.r-pkg.org/badges/grand-total/sigminer?color=orange)](https://cran.r-project.org/package=sigminer) [![Closed issues](https://img.shields.io/github/issues-closed/ShixiangWang/sigminer.svg)](https://github.com/ShixiangWang/sigminer/issues?q=is%3Aissue+is%3Aclosed)
![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square)
[![Anaconda-Server Badge](https://anaconda.org/bioconda/r-sigminer/badges/downloads.svg)](https://anaconda.org/bioconda/r-sigminer)
[![check in Biotreasury](https://img.shields.io/badge/Biotreasury-collected-brightgreen)](https://biotreasury.rjmart.cn/#/tool?id=10043) 

## :bar_chart: Overview

The cancer genome is shaped by various mutational processes over its lifetime. These processes stem from exogenous and cell-intrinsic DNA damage and from error-prone DNA replication, and they leave behind characteristic mutational spectra, termed **mutational signatures**. This package, **sigminer**, helps users extract, analyze and visualize signatures from genome alteration records, thus providing new insight into cancer research.

For a pipeline tool, please see its co-evolutionary CLI, [sigflow](https://github.com/ShixiangWang/sigflow).

**SBS signatures**:

```{r, fig.width=12, fig.height=5, echo=FALSE, message=FALSE}
library(sigminer)
load(system.file("extdata", "toy_mutational_signature.RData",
  package = "sigminer", mustWork = TRUE
))
# Show signature profile
p1 <- show_sig_profile(sig2, mode = "SBS", style = "cosmic", x_label_angle = 90)
p1

```

**Copy number signatures**:

```{r fig.width=12, fig.height=5, echo=FALSE}
load(system.file("extdata", "toy_copynumber_signature_by_W.RData",
  package = "sigminer", mustWork = TRUE
))
# Show signature profile
p2 <- show_sig_profile(sig,
  style = "cosmic",
  mode = "copynumber",
  method = "W",
  normalize = "feature"
)
p2
```

```{r fig.width=12, fig.height=5, echo=FALSE}
show_sig_profile(get_sig_db("CNS_TCGA")$db[, 1:3], style = "cosmic", 
                 mode = "copynumber", method = "S", check_sig_names = FALSE)
```

**DBS signatures**:

```{r fig.width=12, fig.height=5, echo=FALSE}
DBS = system.file("extdata", "DBS_signatures.rds",
                        package = "sigminer", mustWork = TRUE)
DBS = readRDS(DBS)

# Show signature profile
p3 <- show_sig_profile(DBS$db[, 1:3] %>% as.matrix(), mode = "DBS", style = "cosmic", check_sig_names = FALSE)
p3
```

**INDEL (i.e. ID) signatures**:

```{r fig.width=12, fig.height=5, echo=FALSE}
ID = system.file("extdata", "ID_signatures.rds",
                        package = "sigminer", mustWork = TRUE)
ID = readRDS(ID)

# Show signature profile
p4 <- show_sig_profile(ID$db[, 4:6] %>% as.matrix(), mode = "ID", style = "cosmic", check_sig_names = FALSE)
p4
```

**Genome rearrangement signatures**:

```{r fig.width=10, fig.height=5, echo=FALSE, message=FALSE}
p5 <- show_cosmic_sig_profile(sig_index = c("R1", "R2", "R3"), sig_db = "RS_Nik_lab", style = "cosmic", show_index = FALSE)
p5
```


### :airplane: Features

-   supports a standard *de novo* pipeline for identification of **5** types of signatures: copy number, SBS, DBS, INDEL and RS (genome rearrangement signature).
-   supports quantifying exposure for one sample based on *known signatures*.
-   supports association and group analysis and visualization for signatures.
-   supports two types of signature exposures: relative exposure (relative contribution of signatures in each sample) and absolute exposure (estimated variation records of signatures in each sample).
-   supports basic summary and visualization for profiles of mutation (powered by **maftools**) and copy number.
-   supports parallel computation via the R packages **foreach**, **future** and **NMF**.
-   efficient code powered by the R packages **data.table** and **tidyverse**.
-   elegant plots powered by the R packages **ggplot2**, **ggpubr**, **cowplot** and **patchwork**.
-   well tested with the R package **testthat** and documented with the R packages **roxygen2**, **roxytest**, **pkgdown**, etc., for both reliable and reproducible research.
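
The difference between absolute and relative exposure, and the idea of quantifying exposure from known signatures, can be sketched with a few toy matrices. This is a minimal illustration, not the package's internal method: the values in `W` and `H` are made up, and the last step only runs if **sigminer** and **quadprog** are installed, in which case `sig_fit()` with `method = "QP"` is asked to recover the exposures from the catalogue `V`.

```r
# Toy signature matrix: 3 mutation components x 2 signatures (made-up values).
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
colnames(W) <- c("sig1", "sig2")
W <- apply(W, 2, function(x) x / sum(x)) # each signature sums to 1

# Toy absolute exposures: 2 signatures x 4 samples (made-up values).
H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)
rownames(H) <- colnames(W)
colnames(H) <- paste0("samp", 1:4)

# Sample catalogue implied by the signatures and their exposures.
V <- W %*% H

# Relative exposure: each sample's exposures rescaled to sum to 1.
rel_expo <- sweep(H, 2, colSums(H), "/")
print(round(rel_expo, 3))

# Optional: fit exposures to the known signatures with sigminer.
if (requireNamespace("sigminer", quietly = TRUE) &&
  requireNamespace("quadprog", quietly = TRUE)) {
  suppressPackageStartupMessages(library(sigminer))
  H_fit <- sig_fit(V, W, method = "QP")
  print(H_fit)
}
```

Because every column of `W` sums to 1, the column sums of `V` equal the column sums of `H`, so the absolute exposure of a sample adds up to its total number of records.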


## :arrow_double_down: Installation

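You can install the stable release of **sigminer** from CRAN. The commands below go through **BiocManager**, so that Bioconductor dependencies such as **maftools** are resolved as well: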

```{r, eval=FALSE}
install.packages("BiocManager")
BiocManager::install("sigminer", dependencies = TRUE)
```

You can install the development version of **sigminer** from GitHub with:

```{r, eval=FALSE}
remotes::install_github("ShixiangWang/sigminer", dependencies = TRUE)
# For Chinese users, run 
remotes::install_git("https://gitee.com/ShixiangWang/sigminer", dependencies = TRUE)
```

You can also install **sigminer** from the conda `bioconda` channel with:

```sh
# Please note the version number of the bioconda release

# You can first create a dedicated environment with
# conda create -n sigminer
# conda activate sigminer
conda install -c bioconda -c conda-forge r-sigminer
```

## :beginner: Usage

Complete documentation of **sigminer** can be read online. All functions are organized and documented at <https://shixiangwang.github.io/sigminer/reference/index.html>. For usage of a specific function `fun`, run `?fun` in your R console to see its documentation.
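
As a starting point, a *de novo* extraction on the toy data shipped with the package might look like the sketch below. It rests on several assumptions: the example file `toy_copynumber_tally_W.RData` and the object `cn_tally_W` it loads are named from memory of the package's function examples, and `n_sig = 2` and `nrun = 1` are arbitrary values chosen only to keep the run short. The block is skipped if **sigminer** is not installed.

```r
res <- NULL

if (requireNamespace("sigminer", quietly = TRUE)) {
  suppressPackageStartupMessages(library(sigminer))

  # Load a toy copy number tally (assumed file and object names).
  load(system.file("extdata", "toy_copynumber_tally_W.RData",
    package = "sigminer", mustWork = TRUE
  ))

  # Extract 2 signatures with a single NMF run (demo settings only).
  res <- sig_extract(cn_tally_W$nmf_matrix, n_sig = 2, nrun = 1)

  # Plot the extracted signature profiles.
  p <- show_sig_profile(res,
    mode = "copynumber",
    method = "W",
    normalize = "feature",
    style = "cosmic"
  )
  print(p)

  # Inspect the per-sample signature exposures.
  print(head(get_sig_exposure(res)))
}
```

For real data, more NMF runs and a principled choice of the signature number are needed; see the function documentation for the available options.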

## :question: QA

### How to install the `copynumber` package

Some extra features provided by **sigminer** require the **copynumber** package. Because **copynumber** was removed from Bioconductor, I had to remove it from the dependencies in v2.2.0. You can install the package from my GitHub fork (`ShixiangWang/copynumber`), which is generally recommended because I have added some features, although other forks of this package exist on GitHub.

```r
remotes::install_github("ShixiangWang/copynumber")
```
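
Since **copynumber** is now an optional dependency, a script can check for it before relying on those extra features. This is a minimal sketch using base R only; the variable name `has_copynumber` is just for illustration.

```r
# TRUE if the optional copynumber package can be loaded, FALSE otherwise.
has_copynumber <- requireNamespace("copynumber", quietly = TRUE)

if (!has_copynumber) {
  message(
    "Package 'copynumber' is not installed; install it with ",
    "remotes::install_github(\"ShixiangWang/copynumber\")."
  )
}
```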

## :paperclip: Citation

If you use **sigminer** in academic work, please cite at least one of the following papers.

------------------------------------------------------------------------

-   ***Wang S, Li H, Song M, Tao Z, Wu T, He Z, et al. (2021) Copy number signature analysis tool and its application in prostate cancer reveals distinct mutational processes and clinical outcomes. PLoS Genet 17(5): e1009557.*** 
-   ***Wang, S., Tao, Z., Wu, T., & Liu, X. S. (2021). Sigflow: an automated and comprehensive pipeline for cancer genome mutational signature analysis. Bioinformatics, 37(11), 1590-1592***. 
-   ***Ziyu Tao, Shixiang Wang, Chenxu Wu, Tao Wu, Xiangyu Zhao, Wei Ning, Guangshuai Wang, Jinyu Wang, Jing Chen, Kaixuan Diao, Fuxiang Chen, Xue-Song Liu, The repertoire of copy number alteration signatures in human cancer, Briefings in Bioinformatics, 2023, bbad053***. 

------------------------------------------------------------------------

## :arrow_down: Download Stats

```{r, echo=FALSE, warning=FALSE, message=FALSE, dpi=300}
library("ggplot2")
library("cranlogs")
library("dplyr")
library("spiralize")
library("ComplexHeatmap")
library("lubridate")

df <- cran_downloads("sigminer", from = "2019-07-01", to = "2025-08-22")

if (!is.null(df)) {
  day_diff = as.double(df$date[nrow(df)] - df$date[1], "days")
  year_mean = tapply(df$count, lubridate::year(df$date), function(x) mean(x[x > 0]))
  
  df$diff = log2(df$count/year_mean[as.character(lubridate::year(df$date))])
  df$diff[is.infinite(df$diff)] = 0
  q = quantile(abs(df$diff), 0.99)  # adjust outliers
  df$diff[df$diff > q] = q
  df$diff[df$diff < -q] = -q
  
  spiral_initialize_by_time(xlim = range(df[, 1]), padding = unit(2, "cm"), verbose = FALSE)
  
  spiral_track(height = 0.8)
  spiral_horizon(df$date, df$diff, use_bars = TRUE)
  
  spiral_highlight("start", "2019-12-31", type = "line", gp = gpar(col = 1))
  spiral_highlight("2020-01-01", "2020-12-31", type = "line", gp = gpar(col = 2))
  spiral_highlight("2021-01-01", "2021-12-31", type = "line", gp = gpar(col = 3))
  spiral_highlight("2022-01-01", "2022-12-31", type = "line", gp = gpar(col = 4))
  spiral_highlight("2023-01-01", "2023-12-31", type = "line", gp = gpar(col = 5))
  spiral_highlight("2024-01-01", "2024-12-31", type = "line", gp = gpar(col = 6))
  spiral_highlight("2025-01-01", "end",        type = "line", gp = gpar(col = 7))
  
  s = current_spiral()
  d = seq(15, 360, by = 30) %% 360
  for(i in seq_along(d)) {
    foo = polar_to_cartesian(d[i]/180*pi, (s$max_radius + 1)*1.05)
    grid.text(month.name[i], x = foo[1, 1], y = foo[1, 2], default.unit = "native",
              rot = ifelse(d[i] > 0 & d[i] < 180, d[i] - 90, d[i] + 90), gp = gpar(fontsize = 10))
  }
  
  lgd = packLegend(
    Legend(title = "Difference to\nyearly average", at = c("higher", "lower"),
           legend_gp = gpar(fill = c("#D73027", "#313695"))),
    Legend(title = "Year", type = "lines", at = 2019:2025,
           legend_gp = gpar(col = 1:7))
  )
  draw(lgd, x = unit(1, "npc") + unit(10, "mm"), just = "left")
}
```

## :page_with_curl: References

Please properly cite the following references when you use any of the corresponding features. The references are also listed in the function documentation. Many thanks to these works; **sigminer** could not have been created without these giants.

1.  Mayakonda, Anand, et al. "Maftools: efficient and comprehensive analysis of somatic variants in cancer." Genome research 28.11 (2018): 1747-1756.
2.  Gaujoux, Renaud, and Cathal Seoighe. "A Flexible R Package for Nonnegative Matrix Factorization." BMC Bioinformatics 11, no. 1 (December 2010).
3.  H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
4.  Kim, Jaegil, et al. "Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors." Nature genetics 48.6 (2016): 600.
5.  Alexandrov, Ludmil B., et al. "Deciphering signatures of mutational processes operative in human cancer." Cell reports 3.1 (2013): 246-259.
6.  Degasperi, Andrea, et al. "A practical framework and online tool for mutational signature analyses show intertissue variation and driver dependencies." Nature cancer 1.2 (2020): 249-263.
7.  Alexandrov, Ludmil B., et al. "The repertoire of mutational signatures in human cancer." Nature 578.7793 (2020): 94-101.
8.  Macintyre, Geoff, et al. "Copy number signatures and mutational processes in ovarian carcinoma." Nature genetics 50.9 (2018): 1262.
9.  Tan, Vincent YF, and Cédric Févotte. "Automatic relevance determination in nonnegative matrix factorization with the β-divergence." IEEE Transactions on Pattern Analysis and Machine Intelligence 35.7 (2012): 1592-1605.
10. Bergstrom EN, Huang MN, Mahto U, Barnes M, Stratton MR, Rozen SG, Alexandrov LB: SigProfilerMatrixGenerator: a tool for visualizing and exploring patterns of small mutational events. BMC Genomics 2019, 20:685 

## :page_facing_up: LICENSE

The software is made available for non-commercial research purposes only under the [MIT](https://github.com/ShixiangWang/sigminer/blob/master/LICENSE.md) license. However, notwithstanding any provision of the MIT License, the software currently may not be used for commercial purposes without explicit written permission after contacting the patents' authors.

Related patents:

- **CN202011516653.7** `https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/127042`

MIT © 2019-Present Shixiang Wang, Xue-Song Liu

MIT © 2018 Anand Mayakonda

------------------------------------------------------------------------

Sigminer v1-v2 are supported by [**Cancer Biology Group**](https://github.com/XSLiuLab) **\@ShanghaiTech**


![Alt](https://repobeats.axiom.co/api/embed/7cd2cf8a196dde9d8d1e13c9b23bc2f157d8254e.svg "Repobeats analytics image")

Owner

  • Name: Shixiang Wang (王诗翔)
  • Login: ShixiangWang
  • Kind: user
  • Location: Guangzhou, China
  • Company: SYSUCC

The way of pure simplicity is to guard the spirit alone; guard it without losing it, and become one with the spirit.

GitHub Events

Total
  • Issues event: 4
  • Watch event: 10
  • Issue comment event: 4
  • Push event: 1
  • Fork event: 1
Last Year
  • Issues event: 4
  • Watch event: 10
  • Issue comment event: 4
  • Push event: 1
  • Fork event: 1

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 1,148
  • Total Committers: 7
  • Avg Commits per committer: 164.0
  • Development Distribution Score (DDS): 0.028
Past Year
  • Commits: 4
  • Committers: 2
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.25
Top Committers
Name Email Commits
ShixiangWang w****x@s****n 1,116
taoziyu97 b****i@s****m 16
lihm1 4****1 9
wutao 1****5@q****m 4
selkamand s****d@g****m 1
Qi ZHAO z****i@s****n 1
Shixiang Wang (王诗翔) w****g@i****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 113
  • Total pull requests: 18
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 23 hours
  • Total issue authors: 38
  • Total pull request authors: 5
  • Average comments per issue: 3.85
  • Average comments per pull request: 0.56
  • Merged pull requests: 17
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 0
  • Average time to close issues: 2 days
  • Average time to close pull requests: N/A
  • Issue authors: 4
  • Pull request authors: 0
  • Average comments per issue: 1.2
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ShixiangWang (40)
  • xiw588 (12)
  • lincj1994 (7)
  • selkamand (7)
  • wangshun1121 (4)
  • RuimeiL (3)
  • joonan30 (3)
  • rajeha (3)
  • jrcodina (2)
  • KirsieMin (2)
  • shenhaizhongdechanrao (2)
  • tangwei1129 (2)
  • mhjiang97 (1)
  • ShepherdZh (1)
  • songeric1107 (1)
Pull Request Authors
  • ShixiangWang (9)
  • taoziyu97 (4)
  • lihm1 (3)
  • selkamand (1)
  • likelet (1)
Top Labels
Issue Labels
enhancement (18) bug (8) good first issue (7) question (5) help wanted (3) QA (3) doc (3) 存档 (1) bioconda (1) Plan (1)

Packages

  • Total packages: 3
  • Total downloads:
    • cran 740 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 124
  • Total maintainers: 1
proxy.golang.org: github.com/shixiangwang/sigminer
  • Versions: 45
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
proxy.golang.org: github.com/ShixiangWang/sigminer
  • Versions: 45
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
cran.r-project.org: sigminer

Extract, Analyze and Visualize Mutational Signatures for Genomic Variations

  • Versions: 34
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 740 Last month
Rankings
Stargazers count: 3.3%
Forks count: 4.4%
Average: 15.4%
Downloads: 17.0%
Dependent repos count: 23.8%
Dependent packages count: 28.7%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5 depends
  • NMF * imports
  • Rcpp * imports
  • cli >= 2.0.0 imports
  • cowplot * imports
  • data.table * imports
  • dplyr * imports
  • furrr >= 0.2.0 imports
  • future * imports
  • ggplot2 >= 3.3.0 imports
  • ggpubr * imports
  • maftools * imports
  • magrittr * imports
  • methods * imports
  • purrr * imports
  • rlang >= 0.1.2 imports
  • stats * imports
  • tidyr * imports
  • BSgenome * suggests
  • BSgenome.Hsapiens.UCSC.hg19 * suggests
  • Biobase * suggests
  • Biostrings * suggests
  • GenSA * suggests
  • GenomicRanges * suggests
  • IRanges * suggests
  • R.utils * suggests
  • RColorBrewer * suggests
  • UCSCXenaTools * suggests
  • circlize * suggests
  • cluster * suggests
  • copynumber * suggests
  • covr * suggests
  • digest * suggests
  • ggalluvial * suggests
  • ggcorrplot * suggests
  • ggfittext * suggests
  • ggplotify * suggests
  • ggrepel * suggests
  • knitr * suggests
  • lpSolve * suggests
  • markdown * suggests
  • matrixStats * suggests
  • nnls * suggests
  • parallel * suggests
  • patchwork * suggests
  • pheatmap * suggests
  • quadprog * suggests
  • reticulate * suggests
  • rmarkdown * suggests
  • roxygen2 * suggests
  • scales * suggests
  • synchronicity * suggests
  • testthat * suggests
  • tibble * suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite