nextflow-pvacseq
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 12 DOI reference(s) in README -
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (11.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: acg-team
- License: mit
- Language: Nextflow
- Default Branch: master
- Size: 18 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Introduction
nf-core/pvacseq is a bioinformatics pipeline that transforms MAF files to VCF, annotates them with VEP, and analyzes them with pVACseq to facilitate the investigation of neoantigens.
Introduction
The nf-core/pvacseq pipeline is designed for the prediction of neoantigens from genomic data using pVACseq. It supports inputs in both MAF and VCF formats.
Pipeline Summary
The pipeline performs the following steps:
Input Preprocessing:
- Accepts MAF or VCF files as input.
- Converts MAF to VCF (if required) using maf2vcf.
Variant Annotation:
- Annotates variants using VEP, configured to meet the requirements of pVACseq.
Loading HLA:
- Reads and processes HLA typing information from a user-provided CSV file.
pVACseq Setup:
- Configures and downloads MHC class I and II files required for pVACseq if not provided and configure pvacseq environment.
pVACseq Execution:
- Predicts neoantigens using pVACseq.
MultiQC:
- Aggregates pipeline results with MultiQC.
Usage
[!NOTE] If you are new to Nextflow and nf-core, please refer to this page on how to set-up Nextflow.
Required Inputs
To run the pipeline, you need the following inputs:
- Input Files:
- A directory containing
*.mafor*.vcffiles.
- A directory containing
- HLA Typing Information:
- A CSV file (
--hla_csv) with the following structure:Sample_ID,HLA_Types TCGA-G4-6310-01A-11D-1719-10,HLA-C05:01;HLA-C06:02;HLA-B45:01;HLA-A29:02;HLA-B44:02;HLA-A02:01
- A CSV file (
Reference Genome:
- A FASTA file (
--fasta).
- A FASTA file (
VEP Requirements (Optional):
- The pipeline uses VEP for variant annotation. You can provide the following files if available:
- VEP Cache (
--vep_cache): Directory containing pre-downloaded VEP cache files. - VEP Plugins (
--vep_plugins): Directory containing required VEP plugins.
- VEP Cache (
- If these files are not provided, the pipeline will download the required VEP cache and plugins automatically.
- The pipeline uses VEP for variant annotation. You can provide the following files if available:
pVACseq Requirements (Optional):
- The pipeline uses IEDB for neoantigen prediction. You can provide the following directory if available:
- IEDB Installation Directory (
--pvacseq_iedb): Directory containing MHC class I and/or II files for IEDB.
- IEDB Installation Directory (
- If this directory is not provided, the pipeline will download and configure IEDB automatically.
- The pipeline uses IEDB for neoantigen prediction. You can provide the following directory if available:
Running the Pipeline
You can run the pipeline using the following command:
bash
nextflow run main.nf \
-profile <docker/conda> \
--input <INPUT DIRECTORY> \
--hla_csv <HLA CSV FILE> \
--fasta <REFERENCE FASTA> \
--outdir <OUTPUT DIRECTORY>
Testing the Pipeline
If you want to test the pipeline, you need to download the reference genome from GDC Reference Files. Once downloaded, you can run the test profile as follows:
bash
nextflow run main.nf \
-profile test,<docker/conda> \
--outdir <OUTPUT DIRECTORY> \
--fasta <PATH TO FASTA>
Citations
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
Jasreet Hundal+, Susanna Kiwala+, Joshua McMichael, Christopher A Miller, Alexander T Wollam, Huiming Xia, Connor J Liu, Sidi Zhao, Yang-Yang Feng, Aaron P Graubert, Amber Z Wollam, Jonas Neichin, Megan Neveau, Jason Walker, William E Gillanders, Elaine R Mardis, Obi L Griffith, Malachi Griffith. pVACtools: a computational toolkit to select and visualize cancer neoantigens. Cancer Immunology Research. 2020 Mar;8(3):409-420. DOI: 10.1158/2326-6066.CIR-19-0401. PMID: 31907209. (+) equal contribution.
Jasreet Hundal, Susanna Kiwala, Yang-Yang Feng, Connor J. Liu, Ramaswamy Govindan, William C. Chapman, Ravindra Uppaluri, S. Joshua Swamidass, Obi L. Griffith, Elaine R. Mardis, and Malachi Griffith. Accounting for proximal variants improves neoantigen prediction. Nature Genetics. 2018, DOI: 10.1038/s41588-018-0283-9. PMID: 30510237.
Jasreet Hundal, Beatriz M. Carreno, Allegra A. Petti, Gerald P. Linette, Obi L. Griffith, Elaine R. Mardis, and Malachi Griffith. pVACseq: A genome-guided in silico approach to identifying tumor neoantigens. Genome Medicine. 2016, 8:11, DOI: 10.1186/s13073-016-0264-5. PMID: 26825632.
McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P Cunningham F. The Ensembl Variant Effect Predictor. Genome Biology Jun 6;17(1):122. (2016) doi:10.1186/s13059-016-0974-4
Cyriac Kandoth. mskcc/vcf2maf: vcf2maf v1.6. (2020). doi:10.5281/zenodo.593251
Owner
- Name: Applied Computational Genomics Team
- Login: acg-team
- Kind: organization
- Location: Wädenswil, Switzerland
- Website: https://www.zhaw.ch/de/lsfm/institute-zentren/ias/forschung/computational-genomics/
- Repositories: 29
- Profile: https://github.com/acg-team
Computational Genomics tools from Maria Anisimova and collaborators
Citation (CITATIONS.md)
# nf-core/pvacseq: Citations ## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/) > Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031. ## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/) > Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311. ## Pipeline tools - [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/) > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924. - [pVACseq](https://pvactools.readthedocs.io/en/latest/index.html) > Jasreet Hundal+, Susanna Kiwala+, Joshua McMichael, Christopher A Miller, Alexander T Wollam, Huiming Xia, Connor J Liu, Sidi Zhao, Yang-Yang Feng, Aaron P Graubert, Amber Z Wollam, Jonas Neichin, Megan Neveau, Jason Walker, William E Gillanders, Elaine R Mardis, Obi L Griffith, Malachi Griffith. pVACtools: a computational toolkit to select and visualize cancer neoantigens. Cancer Immunology Research. 2020 Mar;8(3):409-420. DOI: 10.1158/2326-6066.CIR-19-0401. PMID: 31907209. (+) equal contribution. > Jasreet Hundal, Susanna Kiwala, Yang-Yang Feng, Connor J. Liu, Ramaswamy Govindan, William C. Chapman, Ravindra Uppaluri, S. Joshua Swamidass, Obi L. Griffith, Elaine R. Mardis, and Malachi Griffith. Accounting for proximal variants improves neoantigen prediction. Nature Genetics. 2018, DOI: 10.1038/s41588-018-0283-9. PMID: 30510237. > Jasreet Hundal, Beatriz M. Carreno, Allegra A. Petti, Gerald P. Linette, Obi L. Griffith, Elaine R. Mardis, and Malachi Griffith. pVACseq: A genome-guided in silico approach to identifying tumor neoantigens. Genome Medicine. 2016, 8:11, DOI: 10.1186/s13073-016-0264-5. PMID: 26825632. - [VEP](https://www.ensembl.org/info/docs/tools/vep/index.html) > McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P Cunningham F. The Ensembl Variant Effect Predictor. Genome Biology Jun 6;17(1):122. (2016) doi:10.1186/s13059-016-0974-4 - [vcf2maf](https://github.com/mskcc/vcf2maf) > Cyriac Kandoth. mskcc/vcf2maf: vcf2maf v1.6. (2020). doi:10.5281/zenodo.593251 ## Software packaging/containerisation tools - [Anaconda](https://anaconda.com) > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web. - [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/) > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506. - [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/) > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671. - [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241) > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241. - [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/) > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
GitHub Events
Total
- Push event: 10
Last Year
- Push event: 10