https://github.com/bioconductor-source/zfpkm

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.4%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: bioconductor-source
  • License: gpl-3.0
  • Language: R
  • Default Branch: devel
  • Size: 89.8 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License

README.md

zFPKM Transformation

Summary

Perform the zFPKM transform on RNA-seq FPKM data. The algorithm is based on the publication by Hart et al., 2013 (PubMed ID 24215113), which recommends zFPKM > -3 as the cutoff for selecting expressed genes. The method was validated with ENCODE open/closed promoter chromatin structure epigenetic data on six of the ENCODE cell lines. It works well for gene-level data using FPKM or TPM, but does not appear to calibrate well for transcript-level data.
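To make the idea behind the transform concrete, here is a minimal base-R sketch of the approach described by Hart et al.: fit a Gaussian to the right half of the main peak of the log2(FPKM) density and express each value as a z-score against that fit. The helper name `zfpkm_sketch` is hypothetical and is not part of the zFPKM package. Taking the global density maximum as the peak is a simplifying assumption, so the package's own implementation may differ in how it locates the peak.

```r
# Hypothetical helper for illustration only; not the zFPKM package API.
zfpkm_sketch <- function(fpkm) {
  stopifnot(is.numeric(fpkm))
  x <- log2(fpkm)
  ok <- is.finite(x)                 # exclude zero or missing FPKM (log2(0) is -Inf)
  d <- density(x[ok])                # Gaussian kernel density estimate
  mu <- d$x[which.max(d$y)]          # assumption: the global maximum is the "active" peak
  # Estimate the standard deviation from values to the right of the peak.
  # For a half-normal distribution, E[x - mu | x > mu] = sd * sqrt(2 / pi).
  right_mean <- mean(x[ok & x > mu])
  sdev <- (right_mean - mu) * sqrt(pi / 2)
  (x - mu) / sdev                    # z-score on the log2 scale
}

# Toy data (simulated, not real expression values): log-normal "FPKM".
set.seed(1)
toy_fpkm <- 2^rnorm(5000, mean = 3, sd = 2)
z <- zfpkm_sketch(toy_fpkm)
summary(z)
```

On simulated log-normal data like this, the resulting z-scores should be roughly centred on zero with unit spread. Real FPKM data usually has an additional low-expression shoulder to the left of the peak, which is why only the right half is used to estimate the spread.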

Example

We calculate zFPKM for existing normalized FPKM from GSE94802.

```r
library(dplyr)
gse94802 <- "ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE94nnn/GSE94802/suppl/GSE94802_Minkina_etal_normalizedFPKM.csv.gz"
temp <- tempfile()
download.file(gse94802, temp)
fpkm <- read.csv(gzfile(temp), row.names=1)
fpkm <- select(fpkm, -MGI_Symbol)

library(zFPKM)
zfpkm <- zFPKMTransformDF(fpkm)
```

The zFPKMTransformDF function can also optionally plot the Gaussian fit to the FPKM data on which the z-scores are based.

To determine which genes are active across all samples, we use rowMeans() and a zFPKM cutoff of -3, as suggested by the authors.

```r
activeGenes <- which(rowMeans(zfpkm) > -3)
```
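Note that a cutoff on `rowMeans()` keeps genes whose average zFPKM exceeds -3, which is not the same as requiring every individual sample to exceed it. The sketch below contrasts the two rules on a small made-up matrix. The values, gene names, and the stricter per-sample rule are illustrative assumptions, not part of the package or the reference.

```r
# Toy zFPKM values (made up for illustration): 4 genes x 3 samples.
zfpkm_toy <- data.frame(
  s1 = c(0.5, -3.5, -2.0, -4.0),
  s2 = c(0.2, -3.2, -3.5, -4.5),
  s3 = c(0.8, -3.1, -2.9, -1.0),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Mean-based rule: average zFPKM across samples must exceed -3.
active_mean <- rownames(zfpkm_toy)[rowMeans(zfpkm_toy) > -3]

# Stricter alternative (an assumption, not from the reference):
# every sample must individually exceed -3.
active_all <- rownames(zfpkm_toy)[apply(zfpkm_toy > -3, 1, all)]

print(active_mean)
print(active_all)
```

Here geneC passes the mean-based rule even though one of its samples falls below -3, so the choice between the two rules depends on how strictly "active across all samples" should be read.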


References

Hart T, Komori HK, LaMere S, Podshivalova K, Salomon DR. Finding the active genes in deep RNA-seq gene expression studies. BMC Genomics. 2013 Nov 11;14:778. doi: 10.1186/1471-2164-14-778.

Owner

  • Name: (WIP DEV) Bioconductor Packages
  • Login: bioconductor-source
  • Kind: organization
  • Email: maintainer@bioconductor.org

Source code for packages accepted into Bioconductor

Dependencies

DESCRIPTION cran
  • R >= 3.4.0 depends
  • SummarizedExperiment * imports
  • checkmate * imports
  • dplyr * imports
  • ggplot2 * imports
  • tidyr * imports
  • GEOquery * suggests
  • edgeR * suggests
  • knitr * suggests
  • limma * suggests
  • printr * suggests
  • rmarkdown * suggests
  • stringr * suggests