bayNorm

Normalization for single cell RNA-seq data

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
✓ DOI references
  Found 1 DOI reference(s) in README
○ Academic publication links
✓ Committers with academic emails
  3 of 6 committers (50.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity
  Low similarity (14.3%) to scientific vocabulary

Keywords

normalization rpackage scrnaseq

Keywords from Contributors

bioconductor-package genomics proteomics bioconductor differential-expression chip-seq epigenomics feature-detection peak-detection microbiome

Last synced: 6 months ago

Repository

Normalization for single cell RNA-seq data

Basic Info

Host: GitHub
Owner: WT215
Language: R
Default Branch: master
Homepage:
Size: 2.6 MB

Statistics

Stars: 9
Watchers: 6
Forks: 3
Open Issues: 0
Releases: 0

Topics

normalization rpackage scrnaseq

Created about 8 years ago · Last pushed over 3 years ago

Metadata Files

Readme

bayNorm

bayNorm is an R package which is used to normalize single-cell RNA-seq data.

A Julia implementation of bayNorm is now available: https://github.com/WT215/bayNormJL.jl.

Code for bayNorm paper

The code for producing the figures in the bayNorm paper [1] can be found here

Installation

Make sure to use the latest version of bayNorm by installing it from GitHub.

    library(devtools)
    devtools::install_github("WT215/bayNorm")

bayNorm is also available on Bioconductor and can be installed via:

    library(BiocManager)
    BiocManager::install("bayNorm")

Quick start: for either single or groups of cells

The main function is bayNorm, a wrapper function for gene-specific prior parameter estimation and normalization. The input is a matrix of scRNA-seq data in which rows are genes and columns are cells. The output is either point estimates from the posterior (2D array) or samples from the posterior (3D array).

Essential input and parameters for running bayNorm are:

* Data: a SummarizedExperiment object or matrix (rows: genes, columns: cells).
* BETA_vec: a vector of probabilities whose length equals the number of cells.
* Conditions: if Conditions is provided, prior parameters are estimated within each group of cells (we call this the "LL" procedure, where "LL" stands for estimating both $\mu$ and $\phi$ locally). Otherwise, bayNorm applies the "GG" procedure for estimating prior parameters (estimating both $\mu$ and $\phi$ globally).
* Prior_type: even if you have specified Conditions, you can still choose to estimate prior parameters across all cells by setting Prior_type="GG".
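The input requirements above can be sketched in base R, without calling bayNorm itself. This is only an illustration under stated assumptions: the toy counts, the mean capture efficiency of 0.06, and the rule "capture efficiency proportional to each cell's total count" are hypothetical choices, and the plain per-gene means merely stand in for the difference between global ("GG") and group-wise ("LL") estimation. They are not bayNorm's actual prior estimator.

```R
# Toy count matrix: 5 genes (rows) x 4 cells (columns); values are simulated.
set.seed(1)
counts <- matrix(rpois(20, lambda = 5), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))

# Assumed heuristic: capture efficiency proportional to each cell's total
# count, rescaled so that the mean equals a chosen value (0.06 is hypothetical).
mean_beta <- 0.06
totals <- colSums(counts)
BETA_vec <- totals / mean(totals) * mean_beta

# One group label per cell, in the same order as the columns of counts.
Conditions <- c("A", "A", "B", "B")

# Counts divided by capture efficiency, used here only to illustrate grouping.
scaled <- sweep(counts, 2, BETA_vec, "/")

# "Global": one summary per gene across all cells.
mu_global <- rowMeans(scaled)

# "Local": one summary per gene within each group of cells (genes x groups).
mu_local <- sapply(split(seq_len(ncol(counts)), Conditions),
                   function(idx) rowMeans(scaled[, idx, drop = FALSE]))
```

Both BETA_vec and Conditions must have one entry per column of the data; a length mismatch is the first thing worth checking before running the normalization.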

```R
library(bayNorm)
library(SummarizedExperiment)

data('EXAMPLE_DATA_list')

# Wrap the first 30 cells of the example counts in a SummarizedExperiment object
rse <- SummarizedExperiment::SummarizedExperiment(
    assays = S4Vectors::SimpleList(counts = EXAMPLE_DATA_list$inputdata[, seq(1, 30)]))

# A SingleCellExperiment object can also be input in bayNorm:
# rse <- SingleCellExperiment::SingleCellExperiment(assays=list(counts=EXAMPLE_DATA_list$inputdata))

# Return 3D array normalized data, draw 20 samples from posterior distribution:
bayNorm3D <- bayNorm(
    Data = rse,
    BETA_vec = NULL,
    mode_version = FALSE,
    mean_version = FALSE,
    S = 20,
    verbose = FALSE,
    parallel = TRUE)

# Return 2D matrix normalized data (MAP of posterior):
# Simply set mode_version=TRUE, but keep mean_version=FALSE

# Return 2D matrix normalized data (mean of posterior):
# Simply set mean_version=TRUE, but keep mode_version=FALSE
```
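To make the difference between the 3D and 2D outputs concrete, the following base R sketch builds a stand-in array of simulated values and collapses it into a matrix. It assumes that samples are indexed along the third dimension (genes x cells x S). The simulated Poisson draws are placeholders, not real bayNorm output, and the names `samples3D` and `mean2D` are illustrative.

```R
set.seed(123)
n_genes <- 3
n_cells <- 4
S <- 20

# Stand-in for posterior samples: a genes x cells x S array of simulated counts.
samples3D <- array(rpois(n_genes * n_cells * S, lambda = 10),
                   dim = c(n_genes, n_cells, S))

# Average over the sample dimension to get one value per gene and cell.
mean2D <- apply(samples3D, c(1, 2), mean)

dim(samples3D)  # 3 4 20
dim(mean2D)     # 3 4
```

Averaging the samples gives a Monte Carlo approximation of a posterior mean, which is why a 2D summary can be derived from a 3D array but not the other way round.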

Non-UMI scRNAseq dataset

bayNorm's mathematical model is designed for UMI datasets, but it can also be applied to non-UMI datasets. To do so, you need to specify the following parameter:

* UMI_sffl: a scaling factor provided by the user. The raw data are divided by this factor, and bayNorm is applied to the rounded, scaled data. The scaling factor can be interpreted as the average number of times each original mRNA molecule was sequenced after PCR amplification. It should be chosen so that the dropout vs. mean expression plot is close to the asymptotic expression $e^{-mean}$.
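The scaling step and the dropout check described above can be sketched in base R. The simulation below is an assumption made purely for illustration: molecule counts are drawn from a Poisson distribution and multiplied by a hypothetical amplification factor of 10 to mimic non-UMI data. The variable names are illustrative and do not come from bayNorm.

```R
set.seed(42)
n_genes <- 200
n_cells <- 100

# Simulated molecule counts: each gene (row) has its own Poisson mean.
gene_means <- runif(n_genes, 0.1, 5)
molecules <- matrix(rpois(n_genes * n_cells, lambda = gene_means),
                    nrow = n_genes)

# Mimic non-UMI data: every molecule is read about 10 times (hypothetical).
raw <- molecules * 10

# Gap between the observed dropout rate and exp(-mean) for a given scaling.
dropout_gap <- function(x, scaling_factor) {
  scaled <- round(x / scaling_factor)      # divide, then round
  dropout <- rowMeans(scaled == 0)         # fraction of zeros per gene
  expected <- exp(-rowMeans(scaled))       # asymptotic dropout curve
  mean(abs(dropout - expected))
}

gap_unscaled <- dropout_gap(raw, 1)   # no scaling
gap_scaled <- dropout_gap(raw, 10)    # scaling matches the amplification
```

In this toy setting the gap is much smaller when the scaling factor matches the amplification, which is the criterion the text gives for choosing UMI_sffl. On real data one would compare several candidate values in the same way.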

Output 3D array or 2D array with existing estimated prior parameters.

If you have run bayNorm on a dataset to estimate prior parameters, but want to output new posterior estimates (3D or 2D array), you can use the function bayNorm_sup. It is important to input the existing estimated parameters by specifying the following parameters in bayNorm_sup:

* BETA_vec: if Conditions has been specified previously, then input unlist(bayNorm_output$BETA)
* PRIORS: input bayNorm_output$PRIORS
* input_params: input bayNorm_output$input_params

```R
library(bayNorm)

data('EXAMPLE_DATA_list')

# Return 3D array normalized data:
bayNorm3D <- bayNorm(
    Data = EXAMPLE_DATA_list$inputdata,
    BETA_vec = EXAMPLE_DATA_list$inputbeta,
    mode_version = FALSE,
    mean_version = FALSE)

# Now if you want to generate a 2D matrix (MAP) using the same prior
# estimates as generated before:
bayNorm2D <- bayNorm_sup(
    Data = EXAMPLE_DATA_list$inputdata,
    PRIORS = bayNorm3D$PRIORS,
    input_params = bayNorm3D$input_params,
    mode_version = TRUE,
    mean_version = FALSE)

# Or you may want to generate a 2D matrix
# (mean of posterior) using the same prior
# estimates as generated before:
bayNorm2D <- bayNorm_sup(
    Data = EXAMPLE_DATA_list$inputdata,
    PRIORS = bayNorm3D$PRIORS,
    input_params = bayNorm3D$input_params,
    mode_version = FALSE,
    mean_version = TRUE)
```

Working with Seurat objects: cluster detection

```R
library(bayNorm)
library(Seurat)

data('EXAMPLE_DATA_list')

# Normalize with bayNorm (mean of posterior) and wrap the result in a Seurat object
bayout <- bayNorm(EXAMPLE_DATA_list$inputdata, mean_version = TRUE)
x.seurat <- CreateSeuratObject(counts = bayout$Bay_out, assay = 'bayNorm')

# Seurat's own normalization is left commented out:
# x.seurat <- NormalizeData(x.seurat)

x.seurat <- ScaleData(x.seurat)
x.seurat <- FindVariableFeatures(x.seurat)

# Specifying: assay='bayNorm'
# (npcs replaces the older pcs.compute argument in Seurat v3 and later)
x.seurat <- RunPCA(x.seurat, features = VariableFeatures(x.seurat),
                   npcs = 20, assay = 'bayNorm')
x.seurat <- RunUMAP(x.seurat, dims = 1:20, assay = 'bayNorm')

# x.seurat <- JackStraw(x.seurat, prop.freq = 0.06)

x.seurat <- FindNeighbors(x.seurat, dims = 1:20)
x.seurat <- FindClusters(x.seurat, resolution = 0.5)
head(Idents(x.seurat), 5)

# It is a toy example. Here only one cluster was found
plot(x.seurat@reductions$umap@cell.embeddings, pch = 16,
     col = as.factor(Idents(x.seurat)))

# Double check that the assay we used comes from bayNorm
x.seurat@reductions$pca@assay.used
```

References

[1] Tang et al. (2019). Bioinformatics.

Owner

Name: Wenhao
Login: WT215
Kind: user

Repositories: 1
Profile: https://github.com/WT215

GitHub Events

Total

Watch event: 1

Last Year

Watch event: 1

Committers

Last synced: about 2 years ago

All Time

Total Commits: 126
Total Committers: 6
Avg Commits per committer: 21.0
Development Distribution Score (DDS): 0.119

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
WENHAO TANG	w**5@i**k	111
Wenhao	W**5@i**k	6
Nitesh Turaga	n**a@g**m	4
vobencha	v**a@g**m	2
Vahid Shahrezaei	v**i@i**k	2
Kayla-Morrell	k**l@r**g	1

Committer Domains (Top 20 + Academic)

ic.ac.uk: 2
roswellpark.org: 1
imperial.ac.uk: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 2
Total pull requests: 0
Average time to close issues: 16 days
Average time to close pull requests: N/A
Total issue authors: 2
Total pull request authors: 0
Average comments per issue: 29.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

elissonlopes (1)
kvshams (1)

Pull Request Authors

Top Labels

Issue Labels

enhancement (1)
good first issue (1)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- bioconductor 15,863 total

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 5
Total maintainers: 1

bioconductor.org: bayNorm

Single-cell RNA sequencing data normalization

Homepage: https://github.com/WT215/bayNorm
Documentation: https://bioconductor.org/packages/release/bioc/vignettes/bayNorm/inst/doc/bayNorm.pdf
License: GPL (>= 2)
Latest release: 1.26.0
published 10 months ago

Versions: 5
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 15,863 Total

Rankings

Dependent repos count: 0.0%

Dependent packages count: 0.0%

Average: 21.6%

Downloads: 64.7%

Maintainers (1)

wt215@ic.ac.uk

Last synced: 6 months ago
