uel3-t3pio
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 10 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: uel3
- License: mit
- Language: Nextflow
- Default Branch: master
- Size: 24.7 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
Introduction
uel3/t3pio is adapted from T3Pio, an amplicon generation pipeline built for designing direct-from-stool amplicon sets for HMAS schemes.
Description
uel3/t3pio takes as input annotated genomes from the bacterial species of interest. Core genes from the species are identified and primers are designed to generate amplicons compatible with the user’s chosen HMAS platform. Current settings allow up to 3 degenerate bases per 180-250 bp primer.
- Python3
- OrthoFinder (v2.1.2)
- MUSCLE (v3.8.1)
- TrimAl (v1.2)
- EMBOSS consambig (v6.4.0)
- Primer3 (v2.3.4)
- EMBOSS primersearch (v6.4.0)
Usage
Running T3pio requires Nextflow (>=21.10.3) and Singularity to be installed. Detailed Nextflow installation instructions, including Nextflow's Bash and Java requirements, are given below. Currently, all required dependencies except Nextflow are provided through Docker and Singularity images.
[!NOTE] If you are new to Nextflow and nf-core, please refer to this page on how to set-up Nextflow.
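Before launching, it can help to confirm that the required tools are on the PATH and that Nextflow meets the minimum version stated above. The following is a minimal sketch, not part of the pipeline: the function names `version_ge` and `preflight` are hypothetical, the comparison assumes GNU `sort -V` is available, and the check assumes the Singularity profile is the one being used.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not shipped with uel3/t3pio.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# preflight: report whether nextflow (>= 21.10.3) and singularity are usable.
preflight() {
  local min="21.10.3" found tool
  for tool in nextflow singularity; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      return 1
    fi
  done
  # Take the first x.y.z string printed by `nextflow -version`.
  found=$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)
  if ! version_ge "$found" "$min"; then
    echo "nextflow $found is older than $min" >&2
    return 1
  fi
  echo "ok: nextflow $found, singularity found"
}
```

Sourcing this file and calling `preflight` prints a one-line status, and returns non-zero when a tool is missing or Nextflow is too old.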
After Nextflow is installed, clone the pipeline:
bash
git clone https://github.com/uel3/uel3-t3pio
Now, you can run the pipeline using:
bash
nextflow run main.nf \
-profile singularity \
--input <path/to/gbk/files> \
--contig_file <path/to/contig_file> \
--outdir <OUTDIR> \
--good_contig_list <path/to/good_contig_list_file> \
--run_compare_primers <true|false> \
--number_isolates <N>
Parameters:
- --input: input genomes, in gbk format
- --contig_file: stool contigs file for filtering, in fasta format
- --good_contig_list: list of good contigs for filtering (true Salmonella contigs in this case)
- --run_compare_primers: either true or false
- --number_isolates: the number of isolates to be included in an orthogroup
To quickly test the pipeline with the bundled example dataset, run the following command:
bash
nextflow run main.nf \
-profile singularity,test
[!NOTE]
If either contig_file or good_contig_list is omitted, the pipeline skips the filtering step but still generates the primer pool using the Primer3 process.
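The rule in the note above can be mirrored in a small wrapper that passes the filtering options only when both files are present. This is a sketch under assumptions: `build_t3pio_args` is a hypothetical helper, not something the pipeline provides, and it only prints the arguments rather than running Nextflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nextflow arguments, one per line.
# Filtering options are added only when both filtering files exist.
build_t3pio_args() {
  local input="$1" outdir="$2" contigs="${3:-}" good_list="${4:-}"
  local args=(run main.nf -profile singularity --input "$input" --outdir "$outdir")
  if [ -f "$contigs" ] && [ -f "$good_list" ]; then
    args+=(--contig_file "$contigs" --good_contig_list "$good_list")
  else
    echo "note: filtering inputs incomplete; filtering step will be skipped" >&2
  fi
  printf '%s\n' "${args[@]}"
}
```

In bash 4 or later, the printed arguments can be collected with `mapfile -t args < <(build_t3pio_args genomes/ results/ contigs.fasta good_contigs.txt)` and then passed on as `nextflow "${args[@]}"`; the file names here are placeholders.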
To test against existing MLST primers, enable the primer comparison with --run_compare_primers true and provide the path to the existing MLST primers file as legacy_file_path, either on the CLI or in nextflow.config:
bash
nextflow run uel3/t3pio \
-profile <docker/singularity/.../institute> \
--run_compare_primers true \
--legacy_file_path <path/to/existing/MLST/primers/file> \
--input <path/to/gbk/files> \
--outdir <OUTDIR>
[!TIP]
Be mindful of the output file size generated by the pipeline. For reference, running the t3pio pipeline with a standard input of 19 GenBank genome files (totaling ~200 MB) produces approximately 1.4 GB of output. Using a smaller subset of 3 genomes (~29 MB in total) results in an output size of about 525 MB.
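Given the output sizes quoted above, a quick free-space check before a run may save a failed job. The sketch below is illustrative only: `has_free_kb` is a hypothetical helper, and the 2 GB threshold is an assumed margin over the 1.4 GB figure, not a documented requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the filesystem holding DIR has at least
# REQUIRED_KB kilobytes available (POSIX `df -Pk`, column 4 = available).
has_free_kb() {
  local dir="$1" required_kb="$2" avail_kb
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge "$required_kb" ]
}

# Example: require roughly 2 GB (2 * 1024 * 1024 KB) in the current directory.
# The threshold is an assumption; adjust it to the size of your input set.
if has_free_kb . $((2 * 1024 * 1024)); then
  echo "enough space for the run"
else
  echo "less than 2 GB free; consider another --outdir" >&2
fi
```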
Figure: flowchart of the t3pio pipeline.
Primer Filtering Summary
In the primers folder of the output directory, you will find 5 files representing the primer lists at different stages of the filtering process:
- concatenated_primers_primer3.txt: raw primer pool generated by Primer3
- concatenated_primers_specificity.txt: after specificity testing using JSB data
- concatenated_primers_snpfiltered.txt: after SNP redundancy filtering
- concatenated_primers_final.txt: after primer-score filtering (retaining all lowest score primers in each orthogroup)
- concatenated_primers_final_firstrow.txt: after primer-score filtering (keeping only the first lowest score primer in each orthogroup)
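To confirm that a run produced all five files listed above, a short check such as the following could be used. It is a sketch: `missing_primer_files` is a hypothetical helper, and the directory argument is whatever primers folder your run wrote.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list which of the five expected primer files are
# missing from a primers directory. Prints nothing when all are present.
missing_primer_files() {
  local dir="$1" stage
  for stage in primer3 specificity snpfiltered final final_firstrow; do
    [ -f "$dir/concatenated_primers_${stage}.txt" ] \
      || echo "concatenated_primers_${stage}.txt"
  done
}
```

For example, `missing_primer_files <OUTDIR>/primers` prints one file name per line for each stage that did not produce output. When either filtering input was omitted, some later-stage files may legitimately be absent.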
To summarize these files, run the following one-liner script inside the primers folder. It will output the legacy primer match count, unique oligo group count, and total primer count for each file—all on a single line per file:
bash
for file in concatenated_primers_*; do awk -v fname="$file" 'NR==FNR {seen[$1]; next} ($4 in seen) {count++} END {printf "%s: legacy_primer match = %d, ", fname, count}' "$file" /scicomp/groups/OID/NCEZID/DFWED/EDLB/projects/CIMS/HMAS_pilot/step_mothur/HMAS-QC-Pipeline2/Sal_v1.0.oligo; cut -f1 "$file" | cut -f1 -d 'p' | sort | uniq | wc -l | awk '{printf "oligo group = %d, ", $1}'; wc -l < "$file" | awk '{printf "total primer count = %d\n", $1}'; done
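The one-liner above hard-codes a site-specific path to the legacy oligo file. A sketch of the same three counts as a function, with the oligo file passed as an argument, might look like this. `summarize_primers` is a hypothetical name, and the logic assumes, as the one-liner does, that primer names are in column 1 of the primer file and column 4 of the oligo file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize one primer file against a legacy oligo file.
# Usage: summarize_primers PRIMER_FILE LEGACY_OLIGO_FILE
summarize_primers() {
  local file="$1" legacy="$2" matches groups total
  # Count oligo-file rows whose 4th column is a primer name (column 1) in FILE.
  matches=$(awk 'NR==FNR {seen[$1]; next} ($4 in seen) {n++} END {print n+0}' \
    "$file" "$legacy")
  # Orthogroup prefix = text before the first "p" in the tab-delimited name.
  groups=$(( $(cut -f1 "$file" | cut -d 'p' -f1 | sort -u | wc -l) ))
  total=$(( $(wc -l < "$file") ))
  printf '%s: legacy_primer match = %d, oligo group = %d, total primer count = %d\n' \
    "$file" "$matches" "$groups" "$total"
}
```

It can be looped over the outputs in the same way, for example `for f in concatenated_primers_*; do summarize_primers "$f" /path/to/legacy.oligo; done`, where the oligo path is a placeholder.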
[!WARNING] Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see docs.
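As an illustration of the params-file route, the snippet below writes a YAML file using the parameter names from the usage section. Every value is a placeholder chosen for the example (including the `number_isolates` value), and the file name `params.yaml` is an assumption.

```shell
#!/usr/bin/env bash
# Write a params file using the parameter names shown in the usage section.
# All values are placeholders; replace them with real paths and settings.
cat > params.yaml <<'EOF'
input: "path/to/gbk/files"
outdir: "results"
contig_file: "path/to/contigs.fasta"
good_contig_list: "path/to/good_contig_list.txt"
run_compare_primers: false
number_isolates: 3
EOF

# Then launch with (not executed here):
#   nextflow run main.nf -profile singularity -params-file params.yaml
echo "wrote params.yaml"
```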
Credits
uel3/t3pio was originally written by AJ Williams-Newkirk, S Lucking, and R Jin, and adapted to Nextflow by C Cole and R Jin.
Contributions and Support
If you would like to contribute to this pipeline, please see the contributing guidelines.
Citations
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
Owner
- Name: Candace Cole
- Login: uel3
- Kind: user
- Repositories: 1
- Profile: https://github.com/uel3
Microbiologist venturing into bioinformatics. My current interests include metagenomics, pathogen detection/discovery, and AMR.
Citation (CITATIONS.md)
# uel3/t3pio: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

  > Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online].

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

  > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

  > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
GitHub Events
Total
- Create event: 6
- Issues event: 5
- Delete event: 5
- Issue comment event: 6
- Member event: 1
- Public event: 2
- Push event: 58
- Pull request review comment event: 1
- Pull request review event: 1
- Pull request event: 17
Last Year
- Create event: 6
- Issues event: 5
- Delete event: 5
- Issue comment event: 6
- Member event: 1
- Public event: 2
- Push event: 58
- Pull request review comment event: 1
- Pull request review event: 1
- Pull request event: 17
Dependencies
- mshick/add-pr-comment v2 composite
- actions/checkout v4 composite
- nf-core/setup-nextflow v1 composite
- actions/stale v9 composite
- actions/setup-python v5 composite
- eWaterCycle/setup-singularity v7 composite
- nf-core/setup-nextflow v1 composite
- actions/checkout b4ffde65f46336ab88eb53be808477a3936bae11 composite
- actions/setup-python 0a5c61591373683505ea898e09a3ea4f39ef2b9c composite
- peter-evans/create-or-update-comment 71345be0265236311c031f5c7866368bd1eff043 composite
- actions/checkout v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v4 composite
- nf-core/setup-nextflow v1 composite
- dawidd6/action-download-artifact v3 composite
- marocchino/sticky-pull-request-comment v2 composite
- actions/setup-python v5 composite
- rzr/fediverse-action master composite
- zentered/bluesky-post-action v0.1.0 composite