https://github.com/broadinstitute/vectreeid

Amplicon taxonomic identification pipeline for Neafsey Lab

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.1%) to scientific vocabulary

Keywords

phylogenetics python taxonomy
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: broadinstitute
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 26.8 MB
Statistics
  • Stars: 1
  • Watchers: 4
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
phylogenetics python taxonomy
Created over 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme

README.md

VecTreeID : Taxonomic Identification Pipeline

Created for Neafsey Lab @ Harvard School of Public Health
Maintained by Genomic Center for Infectious Diseases @ Broad Institute of MIT & Harvard

Contact: Jason Travis Mohabir (jmohabir@broadinstitute.org)

Public repository for the amplicon taxonomic identification pipeline of the Neafsey Lab.

See the README in each corresponding pipeline for usage.

Installation

Install Anaconda3

The online documentation for installing Anaconda3 is available here: https://docs.anaconda.com/anaconda/install/linux/

Follow the instructions specific to your operating system.

Create conda environment for running the tool

Use the TaxonomyAssignmentPipeline.yml file to create a conda virtual environment

`conda env create --file TaxonomyAssignmentPipeline.yml -p /path/to/env/<name-of-environment>/`

To activate the conda environment:

`source activate <name-of-environment>`

A detailed description of creating a conda environment is given here: https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file
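After activating the environment, it can help to confirm that the external tools the pipeline calls are visible on `PATH`. The following is a minimal sketch, not part of the repository. The executable names `blastn` and `epa-ng` are assumptions based on the tools mentioned in this README; adjust the list to match what `TaxonomyAssignmentPipeline.yml` actually installs.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; not taken from the environment file.
    expected = ["blastn", "epa-ng"]
    missing = find_missing_tools(expected)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All expected tools were found.")
```

Run the script inside the activated environment; an empty "missing" list suggests the environment was created and activated as intended.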

Important Note

Source: https://github.com/etetoolkit/ete/pull/636

The version of ete3 used is unable to parse the jplace output from EPA-ng. Users will need to manually update the 'newick.py' file in the conda environment after it has been initially created.

Replace the `/lib/python3.10/site-packages/ete3/parser/newick.py` file (the path is relative to the conda environment directory) with the `newick.py` file provided.

The development team is working on this issue.
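The manual replacement described above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tool shipped with the repository: the function name `replace_newick_parser` is hypothetical, the environment is assumed to use the `lib/python3.10/site-packages` layout given above, and a backup copy of the original file is kept as a precaution.

```python
import shutil
from pathlib import Path


def replace_newick_parser(env_root, patched_file, python_version="3.10"):
    """Back up ete3's newick.py inside a conda environment, then overwrite it.

    env_root       -- path to the conda environment directory
    patched_file   -- path to the replacement newick.py
    python_version -- assumed Python version of the environment
    """
    target = (
        Path(env_root) / "lib" / f"python{python_version}"
        / "site-packages" / "ete3" / "parser" / "newick.py"
    )
    if not target.is_file():
        raise FileNotFoundError(f"ete3 parser not found at {target}")

    # Keep the original file once, so the change can be reverted later.
    backup = target.with_name(target.name + ".bak")
    if not backup.exists():
        shutil.copy2(target, backup)

    shutil.copy2(patched_file, target)
    return target
```

For example, `replace_newick_parser("/path/to/env/<name-of-environment>", "newick.py")` would overwrite the parser and leave the original next to it as `newick.py.bak`.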

Arguments

```
[Created on π Day 2023]
[Authors: Jason Travis Mohabir, Aina Zurita Martinez]
[Created for Neafsey Lab @ Harvard School of Public Health]
[Maintained by Genomic Center for Infectious Diseases @ Broad Institute of MIT & Harvard]

usage: VecTreeID: Taxonomy Assignment Pipeline for VectorSeq
       [-h] [--name NAME] --amplicon AMPLICON
       [--dada2directory DADA2DIRECTORY]
       [--workingdirectory WORKINGDIRECTORY]
       [--minasvreadcount MINASVREADCOUNT]
       [--minsamplereadcount MINSAMPLEREADCOUNT]
       [--maxtargetseq MAXTARGETSEQ]
       [--artefactcutoff ARTEFACTCUTOFF]
       [--mincoverage MINCOVERAGE] [--minidentity MINIDENTITY]
       [--lwrcutoff LWRCUTOFF]
       [--maxhaplotypespersample MAXHAPLOTYPESPERSAMPLE]
       [--minabundanceassignment MINABUNDANCEASSIGNMENT]
       [--tempdir TEMPDIR] [--referencetree REFERENCETREE]
       [--referencemsa REFERENCEMSA]
       [--referencedatabase REFERENCEDATABASE]
       [--blastonly] [--runblast] [--runmsa] [--runtree]

options:
  -h, --help            show this help message and exit
  --name NAME           name of batch
  --amplicon AMPLICON   amplicon name
  --dada2directory DADA2DIRECTORY
                        DADA2 directory with inputs
  --workingdirectory WORKINGDIRECTORY
                        working directory
  --minasvreadcount MINASVREADCOUNT
                        asv total read count threshold
  --minsamplereadcount MINSAMPLEREADCOUNT
                        sample total read count threshold
  --maxtargetseq MAXTARGETSEQ
                        blastn maxtargetseq
  --artefactcutoff ARTEFACTCUTOFF
                        artefact filter (coverage & identity)
  --mincoverage MINCOVERAGE
                        percent coverage filter
  --minidentity MINIDENTITY
                        percent identity filter
  --lwrcutoff LWRCUTOFF
                        Like Weight Ratio cutoff
  --maxhaplotypespersample MAXHAPLOTYPESPERSAMPLE
                        maximum number of ASVs for batch-level
  --minabundanceassignment MINABUNDANCEASSIGNMENT
                        minimum ASV read count abundance
  --tempdir TEMPDIR     temporary directory
  --referencetree REFERENCETREE
                        reference tree
  --referencemsa REFERENCEMSA
                        reference msa
  --referencedatabase REFERENCEDATABASE
                        reference BLAST database
  --blastonly           only run blastn
  --runblast            run blast
  --runmsa              run msa
  --runtree             run tree
```
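To illustrate how the listing above reads, here is a minimal `argparse` sketch that declares a subset of the documented flags. It is not the pipeline's actual source: the `float` types, the absence of defaults, and the example amplicon name `ITS2` are assumptions made for illustration. What it does mirror from the usage line is that `--amplicon` is the only required argument and that `--blastonly`, `--runblast`, `--runmsa`, and `--runtree` are switches that take no value.

```python
import argparse


def build_parser():
    """Declare a subset of the documented flags (types are assumed)."""
    parser = argparse.ArgumentParser(prog="VecTreeID")
    parser.add_argument("--name", help="name of batch")
    parser.add_argument("--amplicon", required=True, help="amplicon name")
    parser.add_argument("--dada2directory", help="DADA2 directory with inputs")
    parser.add_argument("--minidentity", type=float,
                        help="percent identity filter")
    parser.add_argument("--lwrcutoff", type=float,
                        help="Like Weight Ratio cutoff")
    # Stage switches: present means True, absent means False.
    for flag, text in [
        ("--blastonly", "only run blastn"),
        ("--runblast", "run blast"),
        ("--runmsa", "run msa"),
        ("--runtree", "run tree"),
    ]:
        parser.add_argument(flag, action="store_true", help=text)
    return parser


# Hypothetical invocation; "ITS2" is a placeholder amplicon name.
args = build_parser().parse_args(["--amplicon", "ITS2", "--runblast"])
print(args.amplicon, args.runblast, args.blastonly)
```

Omitting `--amplicon` makes the parser exit with a usage error, which matches its unbracketed position in the usage line.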

(c) 2024 Broad Institute

Owner

  • Name: Broad Institute
  • Login: broadinstitute
  • Kind: organization
  • Location: Cambridge, MA

Broad Institute of MIT and Harvard

GitHub Events

Total
  • Push event: 3
Last Year
  • Push event: 3

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 42
  • Total Committers: 2
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.405
Past Year
  • Commits: 11
  • Committers: 1
  • Avg Commits per committer: 11.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Jason Travis Mohabir j****r@b****g 25
amzurita a****3@g****m 17

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • JasonMohabir (1)