TREG

Tools for finding Total RNA Expression Genes in single nucleus RNA-seq data

https://github.com/lieberinstitute/treg

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    1 of 7 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.9%) to scientific vocabulary

Keywords

bioconductor deconvolution rnascope rstats scrna-seq smfish snrna-seq treg

Keywords from Contributors

bioconductor-package gene grna-sequence ontology proteomics sequencing genomics single-cell deseq2 human-cell-atlas
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 4
  • Watchers: 4
  • Forks: 2
  • Open Issues: 2
  • Releases: 2
Created over 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog Contributing Code of conduct Support

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    fig.path = "man/figures/README-",
    out.width = "100%"
)
```

# TREG


[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![Bioc release status](http://www.bioconductor.org/shields/build/release/bioc/TREG.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/TREG)
[![Bioc devel status](http://www.bioconductor.org/shields/build/devel/bioc/TREG.svg)](https://bioconductor.org/checkResults/devel/bioc-LATEST/TREG)
[![Bioc downloads rank](https://bioconductor.org/shields/downloads/release/TREG.svg)](http://bioconductor.org/packages/stats/bioc/TREG/)
[![Bioc support](https://bioconductor.org/shields/posts/TREG.svg)](https://support.bioconductor.org/tag/TREG)
[![Bioc history](https://bioconductor.org/shields/years-in-bioc/TREG.svg)](https://bioconductor.org/packages/release/bioc/html/TREG.html#since)
[![Bioc last commit](https://bioconductor.org/shields/lastcommit/devel/bioc/TREG.svg)](http://bioconductor.org/checkResults/devel/bioc-LATEST/TREG/)
[![Bioc dependencies](https://bioconductor.org/shields/dependencies/release/TREG.svg)](https://bioconductor.org/packages/release/bioc/html/TREG.html#since)
[![Codecov test coverage](https://codecov.io/gh/LieberInstitute/TREG/branch/devel/graph/badge.svg)](https://codecov.io/gh/LieberInstitute/TREG?branch=devel)
[![R build status](https://github.com/LieberInstitute/TREG/actions/workflows/check-bioc.yml/badge.svg)](https://github.com/LieberInstitute/TREG/actions/workflows/check-bioc.yml)
[![GitHub issues](https://img.shields.io/github/issues/LieberInstitute/TREG)](https://github.com/LieberInstitute/TREG/issues)
[![GitHub pulls](https://img.shields.io/github/issues-pr/LieberInstitute/TREG)](https://github.com/LieberInstitute/TREG/pulls)
[![DOI](https://zenodo.org/badge/391101988.svg)](https://zenodo.org/badge/latestdoi/391101988)


The goal of `TREG` is to help find candidate **Total RNA Expression Genes (TREGs)**
in single nucleus (or single cell) RNA-seq data.

_**Note**: TREG is pronounced as a single word and fully capitalized, unlike [Regulatory T cells](https://en.wikipedia.org/wiki/Regulatory_T_cell), which are known as "Tregs" (pronounced "T-regs"). The work described here is unrelated to regulatory T cells._

### Why are TREGs useful?
The expression of a TREG is proportional to the overall RNA expression in a
cell. This relationship can be used to estimate total RNA content in cells in 
assays where only a few genes can be measured, such as single-molecule 
fluorescent in situ hybridization (smFISH). 

In a smFISH experiment the number of TREG puncta can be used to infer the total
RNA expression of the cell.

The motivation for this work is to collect data via smFISH to help build better
deconvolution algorithms, but there may be many other applications for TREGs in
experimental design!

![The Expression of a TREG can inform total RNA content of a cell](man/figures/TREG_cartoon.png){width=50%}

### What makes a gene a good TREG?

1. The gene must have **non-zero expression in most cells** across different tissue and cell types.
2. A TREG should also be expressed at a constant level with respect to other genes across different cell types, in other words have **high rank invariance**.
3. The gene must be **measurable as a continuous metric** in the experimental assay, for example by having a dynamic range of puncta when observed in RNAscope. This will need to be considered for the candidate TREGs, and may need to be validated experimentally.
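To make the first criterion concrete, here is a minimal base R sketch of a Proportion Zero check on a toy count matrix. It is not the package implementation: the matrix, the group labels, the 0.75 cutoff, and the strict `<` comparison are all illustrative assumptions.

```r
## Toy counts: 3 genes (rows) x 4 cells (columns); values are made up.
## matrix() fills column by column, so each line below is one cell.
counts <- matrix(
    c(
        3, 0, 0, # cell c1: gA, gB, gC
        1, 0, 2, # cell c2
        2, 4, 1, # cell c3
        5, 1, 1  # cell c4
    ),
    nrow = 3,
    dimnames = list(c("gA", "gB", "gC"), paste0("c", 1:4))
)

## Hypothetical grouping of the cells (for example, by cell type)
cell_group <- c("X", "X", "Y", "Y")

## Proportion of zero counts for each gene within each group
prop_zero <- sapply(
    split(seq_along(cell_group), cell_group),
    function(idx) rowMeans(counts[, idx, drop = FALSE] == 0)
)

## Keep genes whose worst (maximum) group proportion is under the cutoff.
## The cutoff value and the strict inequality are assumptions of this sketch.
max_prop_zero <- apply(prop_zero, 1, max)
keep <- names(max_prop_zero)[max_prop_zero < 0.75]
keep
```

Here `gB` is dropped because it is never detected in group `X`, even though it is expressed in every cell of group `Y`. Filtering on the maximum across groups is what enforces expression in most cells of every group rather than only on average.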

![Distribution of ranks of a gene of High and Low Invariance](man/figures/fig1_rank_violin_demo.png){width=30%}

### How to find candidate TREGs with `TREG`

![Overview of the Rank Invariance Process](man/figures/RI_flow.png){width=100%}

1. **Filter for low Proportion Zero genes in the snRNA-seq dataset:** This is
facilitated by the functions `get_prop_zero()` and `filter_prop_zero()`.
snRNA-seq data is notoriously sparse, so these functions enrich for genes with
more universal expression.

2. **Evaluate genes for Rank Invariance:** The nuclei are grouped only by cell
type. Within each cell type, the mean expression for each gene is ranked using
the function `rank_group()`; the result is a vector (its length is the number of
genes). Then the expression of each gene is ranked for each nucleus using the
function `rank_cells()`; the result is a matrix (number of nuclei x number of
genes). Next, the absolute difference between the rank in each nucleus and the
rank of the mean expression is found. The mean of these differences is
calculated for each gene and then ranked. These steps are repeated for each
group, and the result is a matrix of ranks (number of cell types x number of
genes). Finally, the sums of the ranks for each gene are reverse ranked, so there
is one final value for each gene: the "Rank Invariance". The genes with the
highest Rank Invariance are considered good candidates as TREGs. This is
calculated with `rank_invariance_express()`.

**This full process is implemented by: `rank_invariance_express()`.**

## Installation instructions

Get the latest stable `R` release from [CRAN](http://cran.r-project.org/). Then
install `TREG` from [Bioconductor](http://bioconductor.org/) using the following
code:

```{r 'install', eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("TREG")
```

And the development version from [GitHub](https://github.com/LieberInstitute/TREG) with:

```{r 'install_dev', eval = FALSE}
BiocManager::install("LieberInstitute/TREG")
```

## Example

```{r 'libraries', message = FALSE, warning = FALSE}
## Load packages
library("TREG")
```

### Proportion Zero Filtering

A TREG gene should be expressed in almost every cell.
The set of genes should be filtered by maximum Proportion Zero within groups of
cells.

```{r calc_prop_zero, eval = requireNamespace('TREG')}
## Calculate Proportion Zero in groups defined by a column in colData
(prop_zero <- get_prop_zero(sce = sce_zero_test, group_col = "cellType"))

## Get list of genes that pass the max Proportion Zero filter
(filtered_genes <- filter_prop_zero(prop_zero, cutoff = 0.9))

## Filter sce object to this list of genes
sce_filter <- sce_zero_test[filtered_genes, ]
```

### Evaluate RI for Filtered SCE Data

The genes with the highest Rank Invariance are considered good candidates as
TREGs. In this example the gene *g0* would be the strongest candidate TREG.

```{r run_RI, eval = requireNamespace('TREG')}
## Get the Rank Invariance value for each gene
## The highest values are the best TREG candidates
ri <- rank_invariance_express(sce_filter)
sort(ri, decreasing = TRUE)
```

## Citation

Below is the citation output from using `citation('TREG')` in R. Please
run this yourself to check for any updates on how to cite __TREG__.

```{r 'citation', eval = requireNamespace('TREG')}
print(citation("TREG"), bibtex = TRUE)
```

Please note that `TREG` was only made possible thanks to many other R and
bioinformatics software authors, which are cited either in the vignettes and/or
the paper(s) describing this package.

## Code of Conduct

Please note that the `TREG` project is released with a
[Contributor Code of Conduct](http://bioconductor.org/about/code-of-conduct/).
By contributing to this project, you agree to abide by its terms.

## Development tools

* Continuous code testing is possible thanks to [GitHub actions](https://www.tidyverse.org/blog/2020/04/usethis-1-6-0/) through `r BiocStyle::CRANpkg('usethis')`, `r BiocStyle::CRANpkg('remotes')`, and `r BiocStyle::CRANpkg('rcmdcheck')` customized to use [Bioconductor's docker containers](https://www.bioconductor.org/help/docker/) and `r BiocStyle::Biocpkg('BiocCheck')`.
* Code coverage assessment is possible thanks to [codecov](https://codecov.io/gh) and `r BiocStyle::CRANpkg('covr')`.
* The [documentation website](http://LieberInstitute.github.io/TREG) is automatically updated thanks to `r BiocStyle::CRANpkg('pkgdown')`.
* The code is styled automatically thanks to `r BiocStyle::CRANpkg('styler')`.
* The documentation is formatted thanks to `r BiocStyle::CRANpkg('devtools')` and `r BiocStyle::CRANpkg('roxygen2')`.

For more details, check the `dev` directory.

This package was developed using `r BiocStyle::Biocpkg('biocthis')`.
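
## Appendix: Rank Invariance by hand

Returning to the Rank Invariance steps described under "How to find candidate TREGs with `TREG`", the following base R sketch walks through the same arithmetic on a toy matrix. It is an illustration under stated assumptions, not the package's code: the counts, the group labels, the helper name `rank_one_group()`, the genes-by-cells orientation, and the default tie handling of `rank()` are all choices made for this example.

```r
## Toy counts: 4 genes (rows) x 6 cells (columns); values are made up.
## matrix() fills column by column, so each line below is one cell.
counts <- matrix(
    c(
        20, 1, 9, 2, # c1: g0, g1, g2, g3
        20, 9, 1, 2, # c2
        20, 1, 9, 3, # c3
        20, 1, 9, 7, # c4
        20, 2, 8, 1, # c5
        20, 1, 9, 2  # c6
    ),
    nrow = 4,
    dimnames = list(paste0("g", 0:3), paste0("c", 1:6))
)
cell_type <- c("A", "A", "A", "B", "B", "B")

## Hypothetical helper: rank genes by rank stability within one group
rank_one_group <- function(mat) {
    ## Rank of each gene's mean expression in the group (one value per gene)
    group_rank <- rank(rowMeans(mat))
    ## Rank of each gene within each cell (genes x cells in this sketch)
    cell_rank <- apply(mat, 2, rank)
    ## Mean absolute difference between per-cell ranks and the group rank;
    ## the vector is recycled down each column, so every cell is compared
    ## against the same group ranks
    mean_diff <- rowMeans(abs(cell_rank - group_rank))
    ## Small mean difference = stable gene = low rank here
    rank(mean_diff)
}

## One column of ranks per cell type (genes x cell types in this sketch)
group_ranks <- sapply(
    split(seq_along(cell_type), cell_type),
    function(idx) rank_one_group(counts[, idx, drop = FALSE])
)

## Sum across cell types, then reverse rank so that the most stable gene
## receives the highest value
rank_sum <- rowSums(group_ranks)
ri_toy <- rank(-rank_sum)
sort(ri_toy, decreasing = TRUE)
```

In this toy data `g0` holds the same rank in every cell of both groups, so it ends up with the highest value. For real data, use `rank_invariance_express()`, which the text above names as the implementation of the full process.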

Owner

  • Name: Lieber Institute for Brain Development
  • Login: LieberInstitute
  • Kind: organization
  • Email: info@libd.org
  • Location: Baltimore, MD

GitHub Events

Total
  • Push event: 4
Last Year
  • Push event: 4

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 155
  • Total Committers: 7
  • Avg Commits per committer: 22.143
  • Development Distribution Score (DDS): 0.323
Past Year
  • Commits: 19
  • Committers: 4
  • Avg Commits per committer: 4.75
  • Development Distribution Score (DDS): 0.421
Top Committers
Name Email Commits
Louise Huuki l****i@g****m 105
Leonardo Collado Torres l****r@g****m 42
J Wokaty j****y@s****u 2
Nitesh Turaga n****a@g****m 2
J Wokaty j****y 2
Louise Huuki 3****i 1
Laurent Gatto l****o@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 12
  • Total pull requests: 1
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 23 minutes
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 0.75
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • lcolladotor (10)
  • lahuuki (2)
Pull Request Authors
  • lgatto (1)

Packages

  • Total packages: 1
  • Total downloads:
    • bioconductor 6,079 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 1
bioconductor.org: TREG

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 6,079 Total
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Forks count: 14.4%
Stargazers count: 21.8%
Average: 25.9%
Downloads: 93.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/check-bioc.yml actions
  • JamesIves/github-pages-deploy-action releases/v4 composite
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/upload-artifact master composite
  • docker/build-push-action v4 composite
  • docker/login-action v2 composite
  • docker/setup-buildx-action v2 composite
  • docker/setup-qemu-action v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
DESCRIPTION cran
  • R >= 4.2 depends
  • SummarizedExperiment * depends
  • Matrix * imports
  • purrr * imports
  • rafalib * imports
  • BiocFileCache * suggests
  • BiocStyle * suggests
  • RefManageR * suggests
  • SingleCellExperiment * suggests
  • dplyr * suggests
  • ggplot2 * suggests
  • knitr * suggests
  • pheatmap * suggests
  • rmarkdown * suggests
  • sessioninfo * suggests
  • testthat >= 3.0.0 suggests
  • tibble * suggests
  • tidyr * suggests