phip-seq-tp-vac

Pipeline for analysis of the Treponema pallidum PhiP-Seq assay data @ Greninger Lab

https://github.com/dariiavyshenska/phip-seq-tp-vac

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.5%) to scientific vocabulary

Repository


Basic Info
  • Host: GitHub
  • Owner: DariiaVyshenska
  • License: mit
  • Language: Nextflow
  • Default Branch: master
  • Homepage:
  • Size: 434 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed 11 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.md

Phip-seq TP vac Nextflow pipeline


Introduction

nf-core/phipseqtpvac is a bioinformatics pipeline designed to analyze Treponema pallidum PhiP-Seq assay data. It processes the FASTQ files listed in a user-provided samplesheet, performing quality control (QC), trimming, and (pseudo-)alignment. The pipeline outputs a count matrix in CSV format in which rows are features from the PhiP-Seq library and columns are the input samples. The main steps are:

  1. Adapter and quality trimming (Cutadapt)
  2. Pseudoalignment and quantification (Kallisto)
  3. Aggregation of Kallisto quantifications into a unified count matrix (custom Python script)
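The aggregation step can be illustrated with a minimal sketch. This is not the pipeline's own script: the function names and toy data below are hypothetical, and the sketch only assumes the standard Kallisto `abundance.tsv` columns (`target_id`, `length`, `eff_length`, `est_counts`, `tpm`) and that the `est_counts` column is what gets merged.

```python
import csv
import io
import sys

# Standard column order of a Kallisto abundance.tsv file.
KALLISTO_HEADER = "\t".join(["target_id", "length", "eff_length", "est_counts", "tpm"])


def read_est_counts(tsv_text):
    """Return {target_id: est_counts} parsed from abundance.tsv text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["target_id"]: float(row["est_counts"]) for row in reader}


def merge_counts(per_sample):
    """Turn {sample: {target_id: count}} into (header, rows) for a CSV matrix."""
    samples = sorted(per_sample)
    features = sorted({f for counts in per_sample.values() for f in counts})
    header = ["feature"] + samples
    # Features absent from a sample are filled with 0.0 (a defensive assumption).
    rows = [[f] + [per_sample[s].get(f, 0.0) for s in samples] for f in features]
    return header, rows


# Toy inputs standing in for two samples' abundance.tsv files (made-up values).
abundance_s1 = "\n".join([KALLISTO_HEADER, "pep_1\t90\t50\t12\t1000000", "pep_2\t90\t50\t0\t0"])
abundance_s2 = "\n".join([KALLISTO_HEADER, "pep_1\t90\t50\t3\t300000", "pep_2\t90\t50\t7\t700000"])

header, rows = merge_counts({
    "S1": read_est_counts(abundance_s1),
    "S2": read_est_counts(abundance_s2),
})

writer = csv.writer(sys.stdout)
writer.writerow(header)
writer.writerows(rows)
```

Sorting samples and features keeps the matrix layout deterministic across runs, which makes outputs easier to compare.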

Usage

[!NOTE] If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

    sample,fastq_1,fastq_2
    CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz

Each row corresponds to a single sample (one pull-down) with its associated pair of FASTQ files from paired-end sequencing. An example samplesheet can be found in the test_input directory along with the test data.
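Before launching a run, it can help to sanity-check the samplesheet. The sketch below is a hypothetical pre-flight check, not part of the pipeline. It assumes the three columns shown above; the requirements that sample names be unique and that FASTQ paths end in `.fastq.gz` or `.fq.gz` are assumptions made for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "fastq_1", "fastq_2"]


def check_samplesheet(text):
    """Return a list of problems found in samplesheet CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample = (row["sample"] or "").strip()
        if not sample:
            problems.append(f"line {line_no}: empty sample name")
        elif sample in seen:
            # Assumption: one row per sample, so repeated names are flagged.
            problems.append(f"line {line_no}: duplicate sample '{sample}'")
        seen.add(sample)
        for col in ("fastq_1", "fastq_2"):
            value = (row[col] or "").strip()
            # Assumption: inputs are gzipped FASTQ files.
            if not value.endswith((".fastq.gz", ".fq.gz")):
                problems.append(f"line {line_no}: {col} is not a gzipped FASTQ path")
    return problems


good = (
    "sample,fastq_1,fastq_2\n"
    "CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz\n"
)
bad = "sample,fastq_1,fastq_2\nCONTROL_REP1,reads_R1.fastq.gz,\n"

print(check_samplesheet(good))
print(check_samplesheet(bad))
```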

Now, you can run the pipeline using:

    nextflow run nf-core/phipseqtpvac \
       -profile <docker/singularity/...> \
       --input example_samplesheet.csv \
       --outdir <OUTDIR>

To run a specific version of the pipeline, use the appropriate tag: nextflow run nf-core/phipseqtpvac -r <tag>

[!WARNING] Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.

The pipeline parameters and their default values:

| Parameter | Description | Default value |
| ------------- | ------------- | ------------- |
| input | Path to the samplesheet with input data | Required |
| outdir | Directory to save results | ./results |
| cutadapt_minimum_len | Minimum read length after Cutadapt trimming | 20 |
| trimR1 | Bases removed from the start of Forward reads (Cutadapt) | 44 |
| trimR2 | Bases removed from the start of Reverse reads (Cutadapt) | 66 |
| kal_index_ref | Path to the Kallisto index file | See latest release assets archive |
| target_keys | Path to the PhiP-Seq library keys | See latest release assets archive |

kal_index_ref and target_keys files are provided in the assets archive of the latest release. The kal_index_ref file is a Kallisto index file for the PhiP-Seq library, and the target_keys file contains the PhiP-Seq library keys. The pipeline uses these files to quantify the PhiP-Seq library features.
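As an alternative to long command lines, the parameters can be collected in a file for Nextflow's -params-file option, which accepts JSON or YAML. The sketch below builds such a JSON document from the defaults in the table. The two asset paths are placeholders to be replaced with the files from the release assets archive, and the file name `params.json` in the usage note is hypothetical.

```python
import json

# Defaults taken from the parameter table; the two asset paths are
# placeholders, not real locations.
params = {
    "input": "example_samplesheet.csv",
    "outdir": "./results",
    "cutadapt_minimum_len": 20,
    "trimR1": 44,
    "trimR2": 66,
    "kal_index_ref": "<path to Kallisto index from the release assets>",
    "target_keys": "<path to PhiP-Seq library keys from the release assets>",
}

# Serialize to JSON text; writing it to disk is left to the caller.
params_json = json.dumps(params, indent=2)
print(params_json)
```

Saved as `params.json`, this could be passed with `nextflow run nf-core/phipseqtpvac -profile <docker/singularity/...> -params-file params.json`.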

Pipeline output

The pipeline generates separate nested output directories for results from Cutadapt, Kallisto, and the custom Kallisto output parsing program (cutadapt_out, kallisto_out, and parsed_raw_counts, respectively). Each directory contains the relevant quality control, trimming, or quantification files. A unified count matrix (parsed_raw_counts/kallisto_raw_counts_merged.csv) is produced, summarizing feature counts across all input samples.
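A quick first look at the merged matrix is the total count per sample. The sketch below assumes the layout described in the introduction (one row per feature, one column per sample) and further assumes that the first column holds the feature identifier; the exact header names in kallisto_raw_counts_merged.csv may differ, and the toy data are made up.

```python
import csv
import io


def library_sizes(csv_text):
    """Return {sample: total count} for a feature-by-sample count matrix."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = header[1:]  # assumption: column 0 is the feature identifier
    totals = dict.fromkeys(samples, 0.0)
    for row in reader:
        for sample, value in zip(samples, row[1:]):
            totals[sample] += float(value)
    return totals


# Toy matrix standing in for parsed_raw_counts/kallisto_raw_counts_merged.csv.
toy_matrix = "feature,S1,S2\npep_1,12,3\npep_2,0,7\n"
sizes = library_sizes(toy_matrix)
print(sizes)
```

Samples with unusually low totals are worth checking against the Cutadapt and Kallisto logs before downstream analysis.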

Credits

nf-core/phipseqtpvac was originally written by @DariiaVyshenska.

We thank the following people for their extensive assistance in the development of this pipeline: Thaddeus Armstrong, Ben Wieland, Alex Greninger.

Contributions and Support

If you’re interested in contributing to this pipeline, please reach out to the repository owner.

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Owner

  • Name: Dariia Vyshenska
  • Login: DariiaVyshenska
  • Kind: user
  • Location: Seattle, WA, USA

Citation (CITATIONS.md)

# nf-core/phipseqtpvac: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [Cutadapt](https://cutadapt.readthedocs.io/en/stable/)

  > Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17(1), pp. 10-12. doi: 10.14806/ej.17.1.200

- [Kallisto](https://pachterlab.github.io/kallisto/)

  > Nicolas L Bray, Harold Pimentel, Páll Melsted and Lior Pachter, Near-optimal probabilistic RNA-seq quantification, Nature Biotechnology 34, 525–527 (2016), doi:10.1038/nbt.3519

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

  > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

GitHub Events

Total
  • Release event: 2
  • Push event: 11
  • Create event: 2
Last Year
  • Release event: 2
  • Push event: 11
  • Create event: 2

Dependencies

.github/workflows/awsfulltest.yml actions
  • actions/upload-artifact v4 composite
  • seqeralabs/action-tower-launch v2 composite
.github/workflows/awstest.yml actions
  • actions/upload-artifact v4 composite
  • seqeralabs/action-tower-launch v2 composite
.github/workflows/branch.yml actions
  • mshick/add-pr-comment b8f338c590a895d50bcbfa6c5859251edc8952fc composite
.github/workflows/ci.yml actions
  • actions/checkout b4ffde65f46336ab88eb53be808477a3936bae11 composite
  • jlumbroso/free-disk-space 54081f138730dfa15788a46383842cd2f914a1be composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/clean-up.yml actions
  • actions/stale 28ca1036281a5e5922ead5184a1bbf96e5fc984e composite
.github/workflows/download_pipeline.yml actions
  • actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
  • eWaterCycle/setup-singularity 931d4e31109e875b13309ae1d07c70ca8fbc8537 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/fix-linting.yml actions
  • actions/checkout b4ffde65f46336ab88eb53be808477a3936bae11 composite
  • actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
  • peter-evans/create-or-update-comment 71345be0265236311c031f5c7866368bd1eff043 composite
.github/workflows/linting.yml actions
  • actions/checkout b4ffde65f46336ab88eb53be808477a3936bae11 composite
  • actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
  • actions/upload-artifact 5d5d22a31266ced268874388b861e4b58bb5c2f3 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/linting_comment.yml actions
  • dawidd6/action-download-artifact f6b0bace624032e30a85a8fd9c1a7f8f611f5737 composite
  • marocchino/sticky-pull-request-comment 331f8f5b4215f0445d3c07b4967662a32a2d3e31 composite
.github/workflows/release-announcements.yml actions
  • actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
  • rzr/fediverse-action master composite
  • zentered/bluesky-post-action 80dbe0a7697de18c15ad22f4619919ceb5ccf597 composite
modules/nf-core/fastqc/meta.yml cpan
modules/nf-core/multiqc/meta.yml cpan
subworkflows/nf-core/utils_nextflow_pipeline/meta.yml cpan
subworkflows/nf-core/utils_nfcore_pipeline/meta.yml cpan
subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml cpan
pyproject.toml pypi
modules/nf-core/fastqc/environment.yml pypi
modules/nf-core/multiqc/environment.yml pypi