Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: lconde-ucl
  • License: mit
  • Language: HTML
  • Default Branch: master
  • Size: 27.5 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 1
  • Releases: 1
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.md

DGE2


Introduction

DGE2 is a Nextflow pipeline built using code and infrastructure developed and maintained by the nf-core initiative. It was developed to perform differential gene expression analysis after the data have been preprocessed with the nf-core/rnaseq pipeline (v3+) using the default star_salmon alignment.

  1. Takes salmon quantification files and a metadata file as input
  2. Performs differential gene expression analysis over a specific design or, if none is specified, over all possible designs from the metadata file
  3. Generates summary plots (PCA, volcano, heatmap) and txt files, as well as a summary HTML report
  4. Runs gene set enrichment analysis on the preRanked list of genes from the DGE results
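The README does not spell out what "all possible designs" means. As a rough illustration only, the sketch below assumes one plausible reading: every pairwise comparison between the levels of each condition column. The function name `pairwise_contrasts` and the table layout (`{sample_id: {condition: value}}`) are hypothetical and are not taken from the pipeline's code.

```python
from itertools import combinations


def pairwise_contrasts(table):
    """List (condition, level_a, level_b) for every pair of levels per column.

    ASSUMPTION: "all possible designs" is read here as all pairwise level
    comparisons within each condition column; the pipeline may differ.
    """
    # Collect the distinct levels observed for each condition column.
    levels = {}
    for conditions in table.values():
        for name, value in conditions.items():
            levels.setdefault(name, set()).add(value)

    # Enumerate every unordered pair of levels, column by column.
    contrasts = []
    for name in sorted(levels):
        for a, b in combinations(sorted(levels[name]), 2):
            contrasts.append((name, a, b))
    return contrasts
```

Under this reading, a column with three levels contributes three contrasts and a column with two levels contributes one.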

Usage

Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

If you have run the nf-core/rnaseq pipeline with the default aligner (star_salmon), you should have a results/star_salmon/ folder containing several folders and files, including a quant.sf file for each sample and a tx2gene.tsv file that maps transcript identifiers to gene identifiers:

    results/star_salmon/SAMPLE_1/quant.sf
    results/star_salmon/SAMPLE_2/quant.sf
    results/star_salmon/SAMPLE_3/quant.sf
    results/star_salmon/SAMPLE_4/quant.sf
    results/star_salmon/SAMPLE_5/quant.sf
    results/star_salmon/SAMPLE_6/quant.sf
    results/star_salmon/tx2gene.tsv
    [... other files and folders...]

In the above example, you would pass the results/ folder to the DGE2 pipeline using the --inputdir argument.
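Before launching the pipeline, it can help to confirm that the folder you plan to pass as --inputdir has the layout described above. The following is a minimal sketch, not part of DGE2: the helper name `find_quant_files` is hypothetical, and it only checks for the star_salmon/ sub-folder, the tx2gene.tsv file, and one quant.sf per sample folder.

```python
from pathlib import Path


def find_quant_files(inputdir):
    """Map each sample name to its quant.sf under <inputdir>/star_salmon/."""
    salmon_dir = Path(inputdir) / "star_salmon"

    # The transcript-to-gene mapping sits next to the sample folders.
    tx2gene = salmon_dir / "tx2gene.tsv"
    if not tx2gene.is_file():
        raise FileNotFoundError(f"missing {tx2gene}")

    # One sub-folder per sample, each holding a quant.sf file.
    quant = {p.parent.name: p for p in sorted(salmon_dir.glob("*/quant.sf"))}
    if not quant:
        raise FileNotFoundError(f"no */quant.sf files under {salmon_dir}")
    return quant
```

Calling `find_quant_files("results")` on the example layout would return one entry per SAMPLE_N folder; the keys are the sample names that the metadata file has to match.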

Additionally, you will need to prepare a metadata.txt file that looks as follows:

    SampleID    Levels    Status
    SAMPLE_1    high      ctr
    SAMPLE_2    high      ctr
    SAMPLE_3    med       ctr
    SAMPLE_4    low       case
    SAMPLE_5    low       case
    SAMPLE_6    low       case

This should be a txt file in which the first column contains the sample IDs and the remaining columns (one or more) give the conditions for each sample. The sample IDs must match the sample folders in the results/star_salmon directory of the inputdir.
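A quick consistency check between the metadata file and the quantified samples can catch mismatched IDs before a run. The sketch below is illustrative only and is not how DGE2 reads its input: the function names are hypothetical, and it assumes the columns are separated by whitespace (tabs or spaces), which the README does not state explicitly.

```python
def read_metadata(text):
    """Parse metadata text into {sample_id: {condition: value}}.

    ASSUMPTION: fields are whitespace-separated and the first row is a header.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows or len(rows[0]) < 2:
        raise ValueError("expected a sample ID column plus at least one condition column")
    header, body = rows[0], rows[1:]

    table = {}
    for row in body:
        if len(row) != len(header):
            raise ValueError(f"row {row!r} does not have {len(header)} fields")
        # First field is the sample ID; the rest are condition values.
        table[row[0]] = dict(zip(header[1:], row[1:]))
    return table


def unmatched_samples(table, quantified):
    """Return metadata sample IDs that have no matching quantification folder."""
    return sorted(set(table) - set(quantified))
```

Here `quantified` would be the list of sample folder names found under results/star_salmon/; an empty result from `unmatched_samples` means every metadata row has a corresponding sample.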

Now, you can run the pipeline using:

    nextflow run lconde-ucl/DGE2 \
        -profile <docker/singularity/.../institute> \
        --inputdir <PATH/TO/INPUTDIR/> \
        --metadata <PATH/TO/METADATA> \
        --outdir <OUTDIR>

For more details and further functionality, please refer to the usage documentation.

Pipeline output

The pipeline produces text files and plots with the DGE and GSEA results, as well as an HTML report that contains a summary of the DGE results. For more details about the output files and reports, please refer to the output documentation.

Credits

DGE2 was developed by Lucia Conde in 2024. It is a DSL2 version of an older (DSL1) DGE pipeline developed in 2019.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

This pipeline uses code and infrastructure developed and maintained by the nf-core initiative, and reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Additional references for the tools and data used in this pipeline are listed in CITATIONS.

Owner

  • Name: Lucia Conde
  • Login: lconde-ucl
  • Kind: user
  • Company: University College London

Citation (CITATIONS.md)

# DGE2: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [GSEA](https://www.gsea-msigdb.org/gsea/index.jsp)

  > Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545-15550.

- [DESeq2](https://pubmed.ncbi.nlm.nih.gov/25516281/)

  > Love MI, Huber W, Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15(12):550. PubMed PMID: 25516281; PubMed Central PMCID: PMC4302049.

- [R](https://www.R-project.org/)

  > R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

- [ReportingTools](https://bioconductor.org/packages/release/bioc/html/ReportingTools.html)

  > Huntley MA, et al. “ReportingTools: an automated result processing and presentation toolkit for high throughput genomic analyses.” Bioinformatics, Volume 29, Issue 24, December 2013, Pages 3220–3221. doi:10.1093/bioinformatics/btt551.

  
## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

  > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

GitHub Events

Total
  • Create event: 2
  • Issues event: 2
  • Release event: 2
  • Watch event: 2
  • Delete event: 1
  • Issue comment event: 9
Last Year
  • Create event: 2
  • Issues event: 2
  • Release event: 2
  • Watch event: 2
  • Delete event: 1
  • Issue comment event: 9

Dependencies

.github/workflows/branch.yml actions
  • mshick/add-pr-comment v2 composite
.github/workflows/ci.yml actions
  • actions/checkout v4 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/clean-up.yml actions
  • actions/stale v9 composite
.github/workflows/download_pipeline.yml actions
  • actions/setup-python v5 composite
  • eWaterCycle/setup-singularity v7 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/fix-linting.yml actions
  • actions/checkout b4ffde65f46336ab88eb53be808477a3936bae11 composite
  • actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
  • peter-evans/create-or-update-comment 71345be0265236311c031f5c7866368bd1eff043 composite
.github/workflows/linting.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • actions/upload-artifact v4 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/linting_comment.yml actions
  • dawidd6/action-download-artifact v3 composite
  • marocchino/sticky-pull-request-comment v2 composite
.github/workflows/release-announcements.yml actions
  • actions/setup-python v5 composite
  • rzr/fediverse-action master composite
  • zentered/bluesky-post-action v0.1.0 composite
modules/nf-core/custom/dumpsoftwareversions/meta.yml cpan
pyproject.toml pypi
modules/nf-core/custom/dumpsoftwareversions/environment.yml conda
  • multiqc 1.19.*