https://github.com/carmonalab/genenmf

Methods to discover gene programs on single-cell data

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: biorxiv.org, nature.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: carmonalab
  • Language: R
  • Default Branch: master
  • Size: 636 KB
Statistics
  • Stars: 142
  • Watchers: 2
  • Forks: 8
  • Open Issues: 9
  • Releases: 4
Created about 2 years ago · Last pushed 8 months ago
Metadata Files
Readme

README.md

GeneNMF: unsupervised discovery of recurrent gene programs in multi-sample single-cell omics data

Non-negative matrix factorization (NMF) is a method for the analysis of high-dimensional data that extracts sparse and meaningful features from a set of non-negative data vectors. It is well suited for decomposing scRNA-seq data, effectively reducing large, complex matrices (on the order of $10^4$ genes by $10^5$ cells) into a few interpretable gene programs. It has been used in particular to extract recurrent gene programs in cancer cells (see e.g. Barkley et al. (2022) and Gavish et al. (2023)), which are otherwise difficult to integrate and analyse jointly across samples.
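To make the decomposition concrete, here is a minimal base-R sketch of NMF using the classic multiplicative update rules for the Frobenius loss. It is a toy illustration only: the function `toy_nmf` and the simulated matrix are hypothetical and are not part of GeneNMF, which relies on RcppML for the actual factorization.

```r
# Toy NMF via multiplicative updates (Frobenius loss).
# Illustration only; this is NOT how GeneNMF computes the factorization.
toy_nmf <- function(V, k, n_iter = 200, eps = 1e-9, seed = 42) {
  set.seed(seed)
  n <- nrow(V)
  m <- ncol(V)
  W <- matrix(runif(n * k), nrow = n, ncol = k)  # genes x programs
  H <- matrix(runif(k * m), nrow = k, ncol = m)  # programs x cells
  err <- numeric(n_iter)
  for (i in seq_len(n_iter)) {
    # Element-wise updates keep W and H non-negative
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
    err[i] <- sqrt(sum((V - W %*% H)^2))  # reconstruction error
  }
  list(W = W, H = H, error = err)
}

# Simulated non-negative "expression" matrix: 50 genes x 30 cells, rank 3
set.seed(1)
V <- matrix(runif(50 * 3), 50, 3) %*% matrix(runif(3 * 30), 3, 30)
fit <- toy_nmf(V, k = 3)
```

Each column of `fit$W` holds the gene weights of one program, and each row of `fit$H` holds the activity of that program across cells.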

GeneNMF is a package that implements methods for matrix factorization and gene program discovery for single-cell omics data. It can be applied directly on Seurat objects to reduce the dimensionality of the data and to detect robust gene programs across multiple samples. For fast NMF calculation, GeneNMF relies on RcppML (see DeBruine et al. 2024).

Installation

Install the release version from CRAN:

```r
install.packages("GeneNMF")
```

Or, for the latest version, install from GitHub:

```r
library(remotes)
remotes::install_github("carmonalab/GeneNMF")
```

Test your installation

```r
library(GeneNMF)
data(sampleObj)
sampleObj <- runNMF(sampleObj, k=5)
```

Meta-program discovery using default parameters

Perform NMF over a list of Seurat objects and for multiple values of k (the number of NMF factors) to extract gene programs:

```r
# Split the object by donor, then run NMF on each sample for k = 4 to 9
sampleObj.list <- Seurat::SplitObject(sampleObj, split.by = "donor")
geneNMF.programs <- multiNMF(sampleObj.list, k=4:9)
```

Cluster gene programs from multiple samples and k's into meta-programs (MPs), i.e. consensus programs that are robustly identified across NMF runs. Compute MP metrics and the most influential MP genes:

```r
geneNMF.metaprograms <- getMetaPrograms(geneNMF.programs, nMP=5)
```
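The idea of grouping programs into meta-programs can be sketched in base R. This sketch assumes that programs are compared by cosine similarity between their gene-weight vectors and that each MP's consensus is the mean weight over its member programs, as described in the version 0.6 notes in this README. The use of `hclust` with average linkage, the simulated weights, and all object names are assumptions made for illustration. This is not the code behind `getMetaPrograms()`.

```r
# Hypothetical input: genes x programs matrix of non-negative NMF weights,
# pooled across samples and values of k (simulated here).
set.seed(1)
genes <- paste0("gene", 1:20)
base1 <- c(rep(1, 10), rep(0, 10))  # program driven by genes 1-10
base2 <- c(rep(0, 10), rep(1, 10))  # program driven by genes 11-20
progs <- cbind(
  s1_p1 = base1 + runif(20, 0, 0.1),
  s2_p1 = base1 + runif(20, 0, 0.1),
  s1_p2 = base2 + runif(20, 0, 0.1),
  s2_p2 = base2 + runif(20, 0, 0.1)
)
rownames(progs) <- genes

# Cosine similarity between all pairs of program weight vectors
cosine_sim <- function(M) {
  M_unit <- sweep(M, 2, sqrt(colSums(M^2)), "/")  # scale columns to unit length
  crossprod(M_unit)
}
sim <- cosine_sim(progs)

# Cluster programs on cosine distance and cut the tree into 2 meta-programs
tree <- hclust(as.dist(1 - sim), method = "average")
mp <- cutree(tree, k = 2)

# Consensus gene weights: average over all programs assigned to each MP
mp_ids <- sort(unique(mp))
consensus <- sapply(mp_ids, function(i) rowMeans(progs[, mp == i, drop = FALSE]))
colnames(consensus) <- paste0("MP", mp_ids)
```

Here `mp` assigns each program to a meta-program, and `consensus` is a genes x MPs matrix of averaged weights.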

GeneNMF demos

Find demos of the functionalities of GeneNMF and more explanations in the following tutorials:

For the source code, see the GeneNMF.demo repository.

News: version 0.6 is here

We made some improvements to the algorithm to allow more robust identification of meta-programs and easier tuning of parameters. Here are the main changes:

* We updated how meta-programs (MPs) are calculated from individual programs. Instead of extracting gene sets for each program and then calculating a consensus, we keep the full vector of gene weights and calculate cosine similarities between the vectors. Consensus gene weights are then calculated as the average over all programs in an MP.
* To impose sparsity in the decomposition, we include a specificity.weight parameter, which is used to re-normalize NMF loadings based on how specific a gene is for a given program.
* To determine the number of genes to be included in an MP, we calculate the cumulative distribution of the gene weights in a given MP. Only genes that cumulatively explain up to a fraction of the total weight (weight.explained parameter) are included in the MP gene set.
* The definition and default of min.confidence have changed. The confidence of a gene in a given MP is calculated as the fraction of programs in which the gene has been determined to be part of the individual program (using weight.explained=0.8).
* The parameter nprograms in the function getMetaPrograms() has been renamed to nMP, to avoid confusion.
* New defaults: expression matrices are now by default not scaled or centered (the behavior can be altered using the scale and center parameters).
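The gene-selection and confidence rules described above can be illustrated with a small base-R sketch. The helper functions `select_genes` and `gene_confidence`, the toy weights, and the confidence cutoff of 0.5 are hypothetical and chosen only for illustration. The package's internal implementation and edge-case handling may differ.

```r
# Keep the top-weighted genes whose cumulative share of the total weight
# stays within `weight.explained`.
select_genes <- function(w, weight.explained = 0.8) {
  w <- sort(w, decreasing = TRUE)
  cum_frac <- cumsum(w) / sum(w)
  names(w)[cum_frac <= weight.explained]
}

# Confidence of each gene: fraction of programs (columns of W) in which
# the gene is part of that program's selected gene set.
gene_confidence <- function(W, weight.explained = 0.8) {
  in_prog <- sapply(colnames(W), function(p) {
    rownames(W) %in% select_genes(W[, p], weight.explained)
  })
  rownames(in_prog) <- rownames(W)
  rowMeans(in_prog)
}

# Toy weights for three programs assumed to belong to the same MP
W <- cbind(
  p1 = c(50, 25, 10, 9, 6),
  p2 = c(50, 10, 25, 9, 6),
  p3 = c(25, 50, 10, 9, 6)
)
rownames(W) <- paste0("g", 1:5)

conf <- gene_confidence(W)            # per-gene confidence in this MP
consensus <- rowMeans(W)              # consensus weights (average over programs)
candidates <- select_genes(consensus) # genes within the weight.explained budget
min_conf <- 0.5                       # illustrative cutoff, not a documented default
mp_genes <- candidates[conf[candidates] >= min_conf]
```

In this toy example `g1` is selected in all three programs, `g2` in two and `g3` in one, so only `g1` and `g2` end up in the MP gene set.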

Citation

If you used GeneNMF in your work, please cite:

Wounding triggers invasive progression in human basal cell carcinoma. Laura Yerly, Massimo Andreatta, Josep Garnica, Jeremy Di Domizio, Michel Gilliet, Santiago J Carmona, Francois Kuonen. bioRxiv 2024. doi: 10.1101/2024.05.31.596823

Owner

  • Name: Cancer Systems Immunology Lab
  • Login: carmonalab
  • Kind: organization
  • Location: Lausanne, Switzerland

At Ludwig Cancer Research Lausanne and Department of Oncology, University of Lausanne & Swiss Institute of Bioinformatics

GitHub Events

Total
  • Create event: 1
  • Release event: 1
  • Issues event: 35
  • Watch event: 58
  • Issue comment event: 45
  • Push event: 24
  • Pull request event: 7
  • Fork event: 5
Last Year
  • Create event: 1
  • Release event: 1
  • Issues event: 35
  • Watch event: 58
  • Issue comment event: 45
  • Push event: 24
  • Pull request event: 7
  • Fork event: 5

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 32
  • Total pull requests: 4
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 9 days
  • Total issue authors: 28
  • Total pull request authors: 1
  • Average comments per issue: 1.13
  • Average comments per pull request: 1.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 21
  • Pull requests: 4
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 9 days
  • Issue authors: 19
  • Pull request authors: 1
  • Average comments per issue: 0.9
  • Average comments per pull request: 1.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sandcell10 (2)
  • Phoenix12580 (2)
  • metaspine (2)
  • RorPy (2)
  • Li-ZhiD (1)
  • boluofen (1)
  • gabriel-pozo (1)
  • voluptatis (1)
  • ysbioinfo (1)
  • pankomah (1)
  • nandobonf (1)
  • JulieBvs (1)
  • UEATTOOMUCH260 (1)
  • huang-sh (1)
  • SenLiBio (1)
Pull Request Authors
  • maozhy3 (4)
Top Labels
Issue Labels
  • enhancement (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 223 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
cran.r-project.org: GeneNMF

Non-Negative Matrix Factorization for Single-Cell Omics

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 223 Last month
Rankings
Dependent packages count: 28.1%
Forks count: 28.3%
Stargazers count: 31.3%
Dependent repos count: 36.1%
Average: 41.8%
Downloads: 85.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 4.3.0 depends
  • Matrix * imports
  • NMF * imports
  • RcppML * imports
  • Seurat >= 4.3.0 imports
  • stats * imports
  • dendextend * suggests
  • dplyr * suggests
  • fgsea * suggests
  • knitr * suggests
  • msigdbr * suggests
  • rmarkdown * suggests