Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: uel3
  • License: mit
  • Language: Nextflow
  • Default Branch: master
  • Size: 24.7 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 3
  • Releases: 0
Created over 1 year ago · Last pushed 7 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

Introduction

uel3/t3pio is adapted from T3Pio, an amplicon generation pipeline built for designing direct-from-stool amplicon sets for HMAS schemes.

Description

uel3/t3pio takes as input annotated genomes from the bacterial species of interest. Core genes from the species are identified and primers are designed to generate amplicons compatible with the user’s chosen HMAS platform. Current settings allow up to 3 degenerate bases per 180-250 bp primer.

  1. Python3
  2. OrthoFinder (v2.1.2)
  3. MUSCLE (v3.8.1)
  4. TrimAl (v1.2)
  5. EMBOSS consambig (v6.4.0)
  6. Primer3 (v2.3.4)
  7. EMBOSS primersearch (v6.4.0)

Usage

Running t3pio requires Nextflow (>=21.10.3) and Singularity to be installed. There are detailed instructions below for Nextflow installation, including Nextflow's Bash and Java requirements. Currently, all required dependencies except Nextflow are provided through Docker and Singularity images.
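
The requirement above can be checked before the first run. The following is a minimal sketch, not part of the pipeline: the function names `version_ge` and `check_prereqs` are hypothetical, and it assumes that `sort -V` (GNU coreutils) is available and that the first dotted number triple printed by `nextflow -version` is the Nextflow version.

```bash
#!/usr/bin/env bash
# Sketch: verify the prerequisites named above before launching the pipeline.

# Return 0 when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether Nextflow (>= 21.10.3) and Singularity are on PATH.
check_prereqs() {
  local min_nf="21.10.3" nf_version
  if ! command -v nextflow >/dev/null 2>&1; then
    echo "nextflow not found on PATH" >&2
    return 1
  fi
  # Assumption: the first x.y.z triple in the output is the Nextflow version.
  nf_version="$(nextflow -version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  if ! version_ge "$nf_version" "$min_nf"; then
    echo "Nextflow ${nf_version:-unknown} is older than $min_nf" >&2
    return 1
  fi
  if ! command -v singularity >/dev/null 2>&1; then
    echo "singularity not found on PATH" >&2
    return 1
  fi
  echo "Prerequisites look OK (Nextflow $nf_version)"
}
```

Sourcing this file and calling `check_prereqs` prints a short diagnostic and returns a non-zero status if either tool is missing or Nextflow is too old.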

[!NOTE] If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow.

After Nextflow is installed, clone the pipeline and change into the repository directory:

    git clone https://github.com/uel3/uel3-t3pio
    cd uel3-t3pio

Now, you can run the pipeline using:

    nextflow run main.nf \
      -profile singularity \
      --input <path/to/gbk/files> \
      --contig_file <path/to/contig_file> \
      --outdir <OUTDIR> \
      --good_contig_list <path/to/good_contig_list_file> \
      --run_compare_primers <true|false> \
      --number_isolates <number>

The parameters are:

  • --input: input genomes, in gbk format
  • --contig_file: stool contigs file used for filtering, in fasta format
  • --good_contig_list: list of contigs used for filtering (these are true Salmonella contigs in this case)
  • --run_compare_primers: either true or false
  • --number_isolates: the number of isolates to be included in an orthogroup

To quickly test the pipeline with the bundled example dataset, run the following command:

    nextflow run main.nf \
      -profile singularity,test

[!NOTE]
If either --contig_file or --good_contig_list is omitted, the pipeline skips the filtering step but still generates the primer pool using the Primer3 process.

To test against existing MLST primers, set --run_compare_primers to true and provide the path to the existing MLST primers file, either on the CLI with --legacy_file_path or in nextflow.config as legacy_file_path:

    nextflow run uel3/t3pio \
      -profile <docker/singularity/.../institute> \
      --run_compare_primers true \
      --legacy_file_path <path/to/existing/MLST/primers/file> \
      --input <path/to/gbk/files> \
      --outdir <OUTDIR>

[!TIP]
Be mindful of the output file size generated by the pipeline. For reference, running the t3pio pipeline with a standard input of 19 GenBank genome files (totaling ~200 MB) produces approximately 1.4 GB of output. Using a smaller subset of 3 genomes (~29 MB in total) results in an output size of about 525 MB.
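
To act on the sizes quoted above, free space can be checked before a run and the output directory measured afterwards. This is a rough sketch using only `du`, `df`, and `awk`; the helper names are hypothetical, and a megabyte is taken as 1024 KB.

```bash
# Print the size of a directory in whole megabytes (1 MB taken as 1024 KB).
dir_size_mb() {
  du -sk "$1" | awk '{printf "%d\n", $1 / 1024}'
}

# Print the free space, in whole megabytes, on the filesystem holding a path.
# `df -P` keeps each filesystem on one line; column 4 is the available space.
free_space_mb() {
  df -Pk "$1" | awk 'NR == 2 {printf "%d\n", $4 / 1024}'
}

# Return 0 when at least $1 megabytes are free on the filesystem holding $2.
has_free_space() {
  [ "$(free_space_mb "$2")" -ge "$1" ]
}
```

For the 19-genome example, `has_free_space 1400 .` gives a quick yes/no answer for the current directory, and `dir_size_mb <OUTDIR>` reports what a finished run actually used.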

Flowchart of the t3pio pipeline:

[Figure: t3pio_flowchart_full]

Primer Filtering Summary

In the primers folder of the output directory, you will find 5 files representing the primer lists at different stages of the filtering process:

  • concatenated_primers_primer3.txt — raw primer pool generated by Primer3
  • concatenated_primers_specificity.txt — after specificity testing using JSB data
  • concatenated_primers_snpfiltered.txt — after SNP redundancy filtering
  • concatenated_primers_final.txt — after primer-score filtering (retaining all lowest score primers in each orthogroup)
  • concatenated_primers_final_firstrow.txt — after primer-score filtering (keeping only the first lowest score primer in each orthogroup)

To summarize these files, run the following one-liner script inside the primers folder. It will output the legacy primer match count, unique oligo group count, and total primer count for each file—all on a single line per file:

    for file in concatenated_primers_*; do
      # Count rows of the legacy oligo file whose 4th column matches a primer name in this file
      awk -v fname="$file" 'NR==FNR {seen[$1]; next} ($4 in seen) {count++} END {printf "%s: legacy_primer match = %d, ", fname, count}' \
        "$file" /scicomp/groups/OID/NCEZID/DFWED/EDLB/projects/CIMS/HMAS_pilot/step_mothur/HMAS-QC-Pipeline2/Sal_v1.0.oligo
      # Count unique oligo groups (primer name up to the first "p")
      cut -f1 "$file" | cut -f1 -d 'p' | sort | uniq | wc -l | awk '{printf "oligo group = %d, ", $1}'
      # Count all primers in the file
      wc -l < "$file" | awk '{printf "total primer count = %d\n", $1}'
    done
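
The one-liner above hard-codes a site-specific path to the legacy oligo file. As a sketch of a more portable variant, the hypothetical function below takes that path as an argument and reports the same three numbers. It assumes the same layout the one-liner relies on: primer names in the first tab-separated column of the primer list, and in the fourth whitespace-separated column of the oligo file.

```bash
# Summarize one primer list: legacy primer matches, oligo groups, total primers.
# $1 = primer list, $2 = legacy oligo file (path supplied by the caller).
summarize_primers() {
  local primers="$1" oligo="$2" matches groups total
  # Count oligo-file rows whose 4th column names a primer found in the list.
  matches="$(awk 'FILENAME == ARGV[1] {seen[$1]; next} ($4 in seen) {n++} END {print n + 0}' "$primers" "$oligo")"
  # Oligo group = primer name up to the first "p", as in the one-liner.
  groups=$(( $(cut -f1 "$primers" | cut -d 'p' -f1 | sort -u | wc -l) ))
  total=$(( $(wc -l < "$primers") ))
  printf '%s: legacy_primer match = %d, oligo group = %d, total primer count = %d\n' \
    "$primers" "$matches" "$groups" "$total"
}
```

It can be called in a loop from inside the primers folder, for example `for f in concatenated_primers_*; do summarize_primers "$f" <path/to/oligo/file>; done`.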

[!WARNING] Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.
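
As an illustration of the -params-file route, the sketch below writes a YAML params file containing the parameters from the run command shown earlier. The function name `write_params_file` is hypothetical, and every value is a placeholder to be replaced, including the `number_isolates` value, which is not a recommended setting.

```bash
# Write a params file with the parameters used in the run command.
# All values are placeholders (assumptions); replace them with real paths and settings.
write_params_file() {
  local out="${1:-params.yaml}"
  cat > "$out" <<'EOF'
input: "path/to/gbk/files"
outdir: "results"
contig_file: "path/to/contig_file.fasta"
good_contig_list: "path/to/good_contig_list.txt"
run_compare_primers: false
number_isolates: 3
EOF
}

# Usage:
#   write_params_file params.yaml
#   nextflow run main.nf -profile singularity -params-file params.yaml
```

Keeping parameters in a file like this makes a run easier to repeat than a long command line, while profiles and resource settings can still go in a custom config passed with -c.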

Credits

T3Pio was originally written by AJ Williams-Newkirk, S Lucking, and R Jin, and was adapted to Nextflow as uel3/t3pio by C Cole and R Jin.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Owner

  • Name: Candace Cole
  • Login: uel3
  • Kind: user

Microbiologist venturing into bioinformatics. My current interests include metagenomics, pathogen detection/discovery, and AMR.

Citation (CITATIONS.md)

# uel3/t3pio: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

  > Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online].

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

  > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

  > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

GitHub Events

Total
  • Create event: 6
  • Issues event: 5
  • Delete event: 5
  • Issue comment event: 6
  • Member event: 1
  • Public event: 2
  • Push event: 58
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Pull request event: 17
Last Year
  • Create event: 6
  • Issues event: 5
  • Delete event: 5
  • Issue comment event: 6
  • Member event: 1
  • Public event: 2
  • Push event: 58
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Pull request event: 17

Dependencies

.github/workflows/branch.yml actions
  • mshick/add-pr-comment v2 composite
.github/workflows/ci.yml actions
  • actions/checkout v4 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/clean-up.yml actions
  • actions/stale v9 composite
.github/workflows/download_pipeline.yml actions
  • actions/setup-python v5 composite
  • eWaterCycle/setup-singularity v7 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/fix-linting.yml actions
  • actions/checkout b4ffde65f46336ab88eb53be808477a3936bae11 composite
  • actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
  • peter-evans/create-or-update-comment 71345be0265236311c031f5c7866368bd1eff043 composite
.github/workflows/linting.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • actions/upload-artifact v4 composite
  • nf-core/setup-nextflow v1 composite
.github/workflows/linting_comment.yml actions
  • dawidd6/action-download-artifact v3 composite
  • marocchino/sticky-pull-request-comment v2 composite
.github/workflows/release-announcements.yml actions
  • actions/setup-python v5 composite
  • rzr/fediverse-action master composite
  • zentered/bluesky-post-action v0.1.0 composite
modules/nf-core/custom/dumpsoftwareversions/meta.yml cpan
modules/nf-core/multiqc/meta.yml cpan
modules/nf-core/muscle/meta.yml cpan
modules/nf-core/orthofinder/meta.yml cpan
pyproject.toml pypi
modules/nf-core/custom/dumpsoftwareversions/environment.yml pypi
modules/nf-core/multiqc/environment.yml pypi
modules/nf-core/muscle/environment.yml pypi
modules/nf-core/orthofinder/environment.yml pypi