https://github.com/amkram/usher
Ultrafast Sample Placement on Existing Trees
Fork of yatisht/usher
Created about 5 years ago
· Last pushed almost 4 years ago
# Ultrafast Sample Placement on Existing Trees (UShER)

[license-badge]: https://img.shields.io/badge/License-MIT-yellow.svg
[license-link]: https://github.com/yatisht/usher/blob/master/LICENSE

[![License][license-badge]][license-link]
Owner
- Name: Alex Kramer
- Login: amkram
- Kind: user
- Location: Santa Cruz, CA
- Company: @corbett-lab
- Repositories: 45
- Profile: https://github.com/amkram
Graduate student at UC Santa Cruz - Biomolecular Engineering and Bioinformatics
**NEW: We will now be sharing and updating UShER's pre-processed mutation-annotated tree object for public SARS-CoV-2 sequences here: http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/. We recommend using https://cov2tree.org/ (developed by [Theo Sanderson](https://github.com/theosanderson/taxodium)) to visualize this tree.**
UShER is now a package consisting of a family of programs for rapid phylogenetic analyses, particularly suited to SARS-CoV-2 genomes.
* **UShER** is a program that rapidly places new samples onto an existing phylogeny using maximum parsimony. It is particularly helpful for understanding how newly sequenced SARS-CoV-2 genomes relate to each other and to previously sequenced genomes in a global phylogeny. This has emerged as an important challenge during the COVID-19 pandemic for enabling *genomic contact tracing*, since the viral phylogeny is already very large (>2M sequences) and is expected to grow manyfold in the coming months. UShER is much faster than existing tools with similar functionality and has also been integrated into the [UCSC SARS-CoV-2 Genome Browser](https://genome.ucsc.edu/cgi-bin/hgPhyloPlace), which lets you place SARS-CoV-2 samples without installing UShER or learning how to run it. If you have sensitive data that cannot be shared over the Internet, consider using [ShUShER](https://shusher.gi.ucsc.edu/), developed by Alex Kramer (https://github.com/amkram/shusher), as an alternative to the Genome Browser. UShER uses the mutation-annotated tree (MAT) data format, a phylogenetic tree whose branches are annotated with the mutations inferred to have occurred on them.
* **matUtils** is a toolkit for querying, interpreting and manipulating mutation-annotated trees (MATs). Using matUtils, common operations in SARS-CoV-2 genomic surveillance and contact tracing efforts, including annotating a MAT with new clades, extracting subtrees of the most closely related samples, or converting the MAT to standard Newick or VCF format, can be performed in seconds to minutes, even on a laptop.
* **matOptimize** is a program to rapidly and effectively optimize a mutation-annotated tree (MAT) for parsimony using subtree pruning and regrafting (SPR) moves within a user-defined radius.
* **RIPPLES** is a program that uses a phylogenomic technique to rapidly and sensitively detect recombinant nodes and their ancestors in a mutation-annotated tree (MAT).
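
To make the MAT idea and parsimony-based placement concrete, here is a minimal Python sketch. It is not UShER's implementation: the `Node` class, the toy reference `REF`, and the helper names are all hypothetical, and the placement rule is simplified to "attach the sample below the node whose genotype differs from it at the fewest positions" (UShER's actual placement also considers splitting branches).

```python
from dataclasses import dataclass, field

REF = "ACGTACGTAC"  # hypothetical 10-base reference, 0-based positions


@dataclass
class Node:
    name: str
    # Mutations on the branch leading to this node: position -> derived base.
    mutations: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def genotypes(root):
    """Accumulate branch mutations from the root down to every node."""
    result = {}
    stack = [(root, {})]
    while stack:
        node, inherited = stack.pop()
        current = {**inherited, **node.mutations}
        result[node.name] = current
        for child in node.children:
            stack.append((child, current))
    return result


def base_at(genotype, pos):
    """Base carried at pos: the mutated base if present, else the reference."""
    return genotype.get(pos, REF[pos])


def placement_cost(node_gt, sample_gt):
    """Number of extra mutations needed to hang the sample below this node."""
    positions = set(node_gt) | set(sample_gt)
    return sum(base_at(node_gt, p) != base_at(sample_gt, p) for p in positions)


def best_placement(root, sample_gt):
    """Return (node name, cost) for the cheapest attachment point."""
    gts = genotypes(root)
    best = min(sorted(gts), key=lambda n: placement_cost(gts[n], sample_gt))
    return best, placement_cost(gts[best], sample_gt)


# Toy tree: root -> n1 (G2T) -> {s1 (C5A), s2}; root -> s3 (T7C)
root = Node("root", children=[
    Node("n1", {2: "T"}, [Node("s1", {5: "A"}), Node("s2")]),
    Node("s3", {7: "C"}),
])

new_sample = {2: "T", 5: "A", 8: "G"}
print(best_placement(root, new_sample))  # ('s1', 1)
```

Because only the branch mutations are stored, each node's full genotype is recovered by a single root-to-node walk, which is what keeps the representation compact for very large trees.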
Please refer to our [wiki](https://usher-wiki.readthedocs.io/) for detailed instructions on installing and using the UShER package.
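
As a rough illustration of two ideas from the tool descriptions above, the sketch below converts a toy MAT to Newick (one of the conversions matUtils is described as performing) and computes the total parsimony score, the quantity matOptimize is described as optimizing. The `Node` class and function names are hypothetical, and using the per-branch mutation count as the Newick branch length is an assumption of this sketch, not a statement about the tools' output format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Mutations annotated on the branch leading to this node, e.g. "G3T".
    mutations: list = field(default_factory=list)
    children: list = field(default_factory=list)


def to_newick(node):
    """Serialize a subtree; branch length = number of mutations on the branch."""
    label = f"{node.name}:{len(node.mutations)}"
    if not node.children:
        return label
    inner = ",".join(to_newick(child) for child in node.children)
    return f"({inner}){label}"


def parsimony_score(node):
    """Total number of mutations annotated across all branches of the subtree."""
    return len(node.mutations) + sum(parsimony_score(c) for c in node.children)


tree = Node("root", children=[
    Node("n1", ["G3T"], [Node("s1", ["C6A"]), Node("s2")]),
    Node("s3", ["T8C"]),
])

print(to_newick(tree) + ";")   # ((s1:1,s2:0)n1:1,s3:1)root:0;
print(parsimony_score(tree))   # 3
```

A tree rearrangement such as an SPR move is an improvement under parsimony when it lowers this total score.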
## Acknowledgement
We thank Jim Kent and the UCSC Genome Browser team for allowing us to download the `faToVcf` utility (from http://hgdownload.soe.ucsc.edu/admin/exe/). Please read the license terms for `faToVcf` here: https://github.com/ucscGenomeBrowser/kent/blob/master/src/LICENSE.
## References
**UShER:**
* Yatish Turakhia, Bryan Thornlow, Angie S. Hinrichs, Nicola De Maio, Landen Gozashti, Robert Lanfear, David Haussler, and Russ Corbett-Detig, "Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic", **Nature Genetics** (2021), [paper](https://t.co/ulGUSRmuWv?amp=1).
**matUtils:**
* Jakob McBroome*, Bryan Thornlow*, Angie S. Hinrichs, Alexander Kramer, Nicola De Maio, Nick Goldman, David Haussler, Russell Corbett-Detig, Yatish Turakhia, "A daily-updated database and tools for comprehensive SARS-CoV-2 mutation-annotated trees", **Molecular Biology and Evolution** (2021), [paper](https://doi.org/10.1093/molbev/msab264).
**RIPPLES:**
* Yatish Turakhia*, Bryan Thornlow*, Angie S. Hinrichs, Jakob McBroome, Nicolas Ayala, Cheng Ye, Nicola De Maio, David Haussler, Russell Corbett-Detig, "Pandemic-Scale Phylogenomics Reveals Elevated Recombination Rates in the SARS-CoV-2 Spike Region", bioRxiv (2021), [preprint](https://www.biorxiv.org/content/10.1101/2021.08.04.455157v1).
**matOptimize:**
* Cheng Ye, Bryan Thornlow, Angie Hinrichs, Alexander Kramer, Cade Mirchandani, Devika Torvi, Robert Lanfear, Russell Corbett-Detig, Yatish Turakhia, "matOptimize: A parallel tree optimization method enables online phylogenetics for SARS-CoV-2", **Bioinformatics** (2022), [paper](https://doi.org/10.1093/bioinformatics/btac401).
**For masking recommendations, please also cite:**
* Yatish Turakhia*, Nicola De Maio*, Bryan Thornlow*, Landen Gozashti, Robert Lanfear, Conor R. Walker, Angie S. Hinrichs, Jason D. Fernandes, Rui Borges, Greg Slodkowicz, Lukas Weilguny, David Haussler, Nick Goldman and Russell Corbett-Detig, "Stability of SARS-CoV-2 Phylogenies", **PLOS Genetics** (2020), [paper](https://doi.org/10.1371/journal.pgen.1009175).
* Landen Gozashti, Conor R. Walker, Robert Lanfear, Nick Goldman, Nicola De Maio and Russell Corbett-Detig, "Issues with SARS-CoV-2 sequencing data: Updated analysis with data from 4 March 2021", Virological 2021 (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473/15).