https://github.com/broadinstitute/vectreeid
Amplicon taxonomic identification pipeline for Neafsey Lab
Statistics
- Stars: 1
- Watchers: 4
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
VecTreeID : Taxonomic Identification Pipeline
Created for Neafsey Lab @ Harvard School of Public Health
Maintained by Genomic Center for Infectious Diseases @ Broad Institute of MIT & Harvard
Contact: Jason Travis Mohabir (jmohabir@broadinstitute.org)
Public repository for the amplicon taxonomic identification pipeline used in the Neafsey Lab.
See the README in each corresponding pipeline for usage.
Installation
Install Anaconda3
Anaconda3 installation instructions are available here: https://docs.anaconda.com/anaconda/install/linux/
Follow the instructions specific to your operating system.
Create conda environment for running the tool
Use the TaxonomyAssignmentPipeline.yml file to create a conda virtual environment
conda env create --file TaxonomyAssignmentPipeline.yml -p /path/to/env/<name-of-environment>/
To activate the conda environment (an environment created with -p is activated by its full path)
conda activate /path/to/env/<name-of-environment>/
A detailed description of creating a conda environment is given here: https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file
Important Note
Source: https://github.com/etetoolkit/ete/pull/636
The version of ete3 used cannot parse the jplace output from EPA-ng. After creating the conda environment, users must manually update the `newick.py` file inside it.
Replace the `/lib/python3.10/site-packages/ete3/parser/newick.py` file in the conda environment with the `newick.py` file provided in this repository.
The development team is working on this issue.
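The manual replacement described above could be scripted. The following is a minimal sketch, not part of the pipeline: the function name `replace_newick` and its parameters are hypothetical, and it assumes the environment root is the path passed to `-p` and that the patched `newick.py` has already been copied locally. It keeps a backup of the original file so the change can be reverted.

```python
import shutil
from pathlib import Path


def replace_newick(env_prefix, patched_file, python_dir="python3.10"):
    """Swap ete3's newick.py inside a conda environment for a patched copy.

    env_prefix:   root of the conda environment (assumed: the path given to -p)
    patched_file: the patched newick.py provided with the repository
    python_dir:   name of the lib/ subdirectory; python3.10 per the note above
    """
    target = (Path(env_prefix) / "lib" / python_dir / "site-packages"
              / "ete3" / "parser" / "newick.py")
    if not target.is_file():
        raise FileNotFoundError(f"ete3 parser not found at {target}")

    # Keep the original once, so the change can be reverted later.
    backup = target.with_name(target.name + ".orig")
    if not backup.exists():
        shutil.copy2(target, backup)

    # Overwrite the installed parser with the patched version.
    shutil.copy2(patched_file, target)
    return target
```

A call such as `replace_newick("/path/to/env/<name-of-environment>", "newick.py")` would then perform the swap; restoring `newick.py.orig` undoes it.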
Arguments
```
(ASCII-art banner)

[Created on π Day 2023]
[Authors: Jason Travis Mohabir, Aina Zurita Martinez]
[Created for Neafsey Lab @ Harvard School of Public Health]
[Maintained by Genomic Center for Infectious Diseases @ Broad Institute of MIT & Harvard]

usage: VecTreeID: Taxonomy Assignment Pipeline for VectorSeq [-h] [--name NAME] --amplicon AMPLICON
       [--dada2directory DADA2DIRECTORY] [--workingdirectory WORKINGDIRECTORY]
       [--minasvreadcount MINASVREADCOUNT] [--minsamplereadcount MINSAMPLEREADCOUNT]
       [--maxtargetseq MAXTARGETSEQ] [--artefactcutoff ARTEFACTCUTOFF]
       [--mincoverage MINCOVERAGE] [--minidentity MINIDENTITY] [--lwrcutoff LWRCUTOFF]
       [--maxhaplotypespersample MAXHAPLOTYPESPERSAMPLE]
       [--minabundanceassignment MINABUNDANCEASSIGNMENT] [--tempdir TEMPDIR]
       [--referencetree REFERENCETREE] [--referencemsa REFERENCEMSA]
       [--referencedatabase REFERENCEDATABASE] [--blastonly] [--runblast] [--runmsa] [--runtree]

options:
  -h, --help            show this help message and exit
  --name NAME           name of batch
  --amplicon AMPLICON   amplicon name
  --dada2directory DADA2DIRECTORY
                        DADA2 directory with inputs
  --workingdirectory WORKINGDIRECTORY
                        working directory
  --minasvreadcount MINASVREADCOUNT
                        asv total read count threshold
  --minsamplereadcount MINSAMPLEREADCOUNT
                        sample total read count threshold
  --maxtargetseq MAXTARGETSEQ
                        blastn maxtargetseq
  --artefactcutoff ARTEFACTCUTOFF
                        artefact filter (coverage & identity)
  --mincoverage MINCOVERAGE
                        percent coverage filter
  --minidentity MINIDENTITY
                        percent identity filter
  --lwrcutoff LWRCUTOFF
                        Like Weight Ratio cutoff
  --maxhaplotypespersample MAXHAPLOTYPESPERSAMPLE
                        maximum number of ASVs for batch-level
  --minabundanceassignment MINABUNDANCEASSIGNMENT
                        minimum ASV read count abundance
  --tempdir TEMPDIR     temporary directory
  --referencetree REFERENCETREE
                        reference tree
  --referencemsa REFERENCEMSA
                        reference msa
  --referencedatabase REFERENCEDATABASE
                        reference BLAST database
  --blastonly           only run blastn
  --runblast            run blast
  --runmsa              run msa
  --runtree             run tree
```
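To illustrate how the help text above reads, here is a partial `argparse` sketch covering a subset of the listed options. It is a reconstruction for illustration only, not the pipeline's source: the value types (`float` for the percent and ratio cutoffs), the `store_true` behavior of the run flags, the shortened `prog` name, and the example values `ITS2` and `97` are all assumptions. What the usage line does confirm is that `--amplicon` is the only required option, since it is the only one shown outside square brackets.

```python
import argparse


def build_parser():
    """Partial reconstruction of the documented interface (types are assumptions)."""
    parser = argparse.ArgumentParser(
        prog="VecTreeID",  # shortened; the real prog string is longer
        description="Sketch of a subset of the documented options",
    )
    parser.add_argument("--name", help="name of batch")
    # Shown without brackets in the usage line, so it is required.
    parser.add_argument("--amplicon", required=True, help="amplicon name")
    parser.add_argument("--dada2directory", help="DADA2 directory with inputs")
    parser.add_argument("--workingdirectory", help="working directory")
    # Numeric types below are assumed; the help text does not state them.
    parser.add_argument("--mincoverage", type=float, help="percent coverage filter")
    parser.add_argument("--minidentity", type=float, help="percent identity filter")
    parser.add_argument("--lwrcutoff", type=float, help="Like Weight Ratio cutoff")
    # Options listed without a value are modeled as boolean switches.
    for flag, text in [
        ("--blastonly", "only run blastn"),
        ("--runblast", "run blast"),
        ("--runmsa", "run msa"),
        ("--runtree", "run tree"),
    ]:
        parser.add_argument(flag, action="store_true", help=text)
    return parser


# Hypothetical invocation: the amplicon name and threshold are made-up values.
args = build_parser().parse_args(
    ["--amplicon", "ITS2", "--runblast", "--minidentity", "97"]
)
```

Omitting `--amplicon` in this sketch makes `parse_args` exit with a usage error, which matches the required status implied by the usage line.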
(c) 2024 Broad Institute
Owner
- Name: Broad Institute
- Login: broadinstitute
- Kind: organization
- Location: Cambridge, MA
- Website: http://www.broadinstitute.org/
- Twitter: broadinstitute
- Repositories: 1,083
- Profile: https://github.com/broadinstitute
Broad Institute of MIT and Harvard
GitHub Events
Total
- Push event: 3
Last Year
- Push event: 3
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jason Travis Mohabir | j****r@b****g | 25 |
| amzurita | a****3@g****m | 17 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 1 minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- JasonMohabir (1)