https://github.com/broadinstitute/pyro-cov
Pyro models of SARS-CoV-2 variants
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to medrxiv.org, science.org, zenodo.org
- ✓ Committers with academic emails: 3 of 8 committers (37.5%) from academic institutions
- ✓ Institutional organization owner: Organization broadinstitute has institutional domain (www.broadinstitute.org)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.7%) to scientific vocabulary
Repository
Statistics
- Stars: 78
- Watchers: 14
- Forks: 30
- Open Issues: 12
- Releases: 2
Metadata Files
README.md
Pyro models for SARS-CoV-2 analysis

Supporting material for the paper "Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness" (medRxiv). Figures and supplementary data for that paper are in the paper/ directory.
This code is open source, but we do not intend to support its use by outside groups. To use the outputs of this model, we recommend ingesting the tables strains.tsv and mutations.tsv.
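As a minimal sketch of ingesting those tables, the snippet below reads a tab-separated file into a list of dictionaries using only the Python standard library. The helper name `load_tsv` is hypothetical, and the assumption that both files sit in the paper/ directory of a local clone should be checked against your checkout. No column names are assumed; the script simply prints whatever header it finds.

```python
import csv
from pathlib import Path


def load_tsv(path):
    """Read a tab-separated file into a list of dicts keyed by the header row."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


if __name__ == "__main__":
    # Assumption: the tables live in the paper/ directory of a local clone.
    for name in ("strains.tsv", "mutations.tsv"):
        path = Path("paper") / name
        if path.exists():
            rows = load_tsv(path)
            columns = list(rows[0]) if rows else []
            print(f"{name}: {len(rows)} rows, columns = {columns}")
        else:
            print(f"{name}: not found at {path}")
```

All values come back as strings, so numeric columns need an explicit conversion before analysis.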
Reproducing
Install software
Clone this repository:
```sh
git clone git@github.com:broadinstitute/pyro-cov
cd pyro-cov
```
Install this Python package:
```sh
pip install -e .
```
Get access to GISAID data
Work with GISAID to get a data agreement.
Define the following environment variables:
GISAID_USERNAME
GISAID_PASSWORD
GISAID_FEED
For example, my username is fritz and my GISAID feed is broad2.
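Before starting a download, it can help to confirm that all three variables are set. The check below is a sketch only: the helper name `missing_gisaid_vars` is hypothetical and not part of this repository, and it reports which of the variable names listed above are unset or empty.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("GISAID_USERNAME", "GISAID_PASSWORD", "GISAID_FEED")


def missing_gisaid_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_gisaid_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All GISAID environment variables are set.")
```

The optional `env` argument lets the check run against any mapping, which keeps it testable without touching the real environment.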
Download data
This downloads data from GISAID and clones repos for other data sources.
```sh
make update
```
Preprocess data
This takes under an hour.
Results are cached in the results/ directory, so re-running on newly pulled data should be able to reuse alignment and PANGO lineage classification work.
```sh
make preprocess
```
Analyze data
```sh
make analyze
```
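The download, preprocess, and analyze steps can also be chained from one script. The following is a sketch under stated assumptions: `run_pipeline` is a hypothetical wrapper, not something this repository provides, and it only shells out to the three make targets named above, in the same order, stopping at the first failure.

```python
import subprocess

# Make targets in the order the steps above run them.
PIPELINE_TARGETS = ("update", "preprocess", "analyze")


def run_pipeline(targets=PIPELINE_TARGETS, dry_run=False):
    """Run each make target in order, stopping at the first failure.

    Returns the list of commands that were (or would be) executed.
    """
    commands = [["make", target] for target in targets]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands without executing them.
    for command in run_pipeline(dry_run=True):
        print(" ".join(command))
```

Calling `run_pipeline()` from the repository root, with the GISAID variables defined, would execute the targets for real.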
Generate plots and tables
Plots and tables are generated by running various notebooks:
- mutrans.py
- mutrans_backtesting.py
- mutrans_gene.ipynb
Citing
If you use this software or the predictions in the paper directory, please consider citing:
@article {Obermeyer2021.09.07.21263228,
author = {Obermeyer, Fritz and
Schaffner, Stephen F. and
Jankowiak, Martin and
Barkas, Nikolaos and
Pyle, Jesse D. and
Park, Daniel J. and
MacInnis, Bronwyn L. and
Luban, Jeremy and
Sabeti, Pardis C. and
Lemieux, Jacob E.},
title = {Analysis of 2.1 million SARS-CoV-2 genomes identifies mutations associated with transmissibility},
elocation-id = {2021.09.07.21263228},
year = {2021},
doi = {10.1101/2021.09.07.21263228},
publisher = {Cold Spring Harbor Laboratory Press},
URL = {https://www.medrxiv.org/content/early/2021/09/13/2021.09.07.21263228},
eprint = {https://www.medrxiv.org/content/early/2021/09/13/2021.09.07.21263228.full.pdf},
journal = {medRxiv}
}
Owner
- Name: Broad Institute
- Login: broadinstitute
- Kind: organization
- Location: Cambridge, MA
- Website: http://www.broadinstitute.org/
- Twitter: broadinstitute
- Repositories: 1,083
- Profile: https://github.com/broadinstitute
Broad Institute of MIT and Harvard
GitHub Events
Total
- Watch event: 1
- Fork event: 2
Last Year
- Watch event: 1
- Fork event: 2
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Fritz Obermeyer | f****r@g****m | 773 |
| barkasn | n****s@o****m | 53 |
| nikolas.barkas@outlook.com | n****s@b****g | 37 |
| martin jankowiak | j****k@g****m | 29 |
| bkotzen | b****n@m****u | 15 |
| jy17 | l****x@b****g | 7 |
| Cornelius Roemer | c****r@g****m | 2 |
| martin jankowiak | m****i@b****o | 1 |
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 32
- Total pull requests: 77
- Average time to close issues: about 1 month
- Average time to close pull requests: 10 days
- Total issue authors: 19
- Total pull request authors: 6
- Average comments per issue: 3.63
- Average comments per pull request: 0.22
- Merged pull requests: 70
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 15
- Average time to close issues: N/A
- Average time to close pull requests: 5 minutes
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 15
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- smruti241 (3)
- fritzo (3)
- corneliusroemer (2)
- liamxg (2)
- Shicheng-Guo (2)
- omarcr (1)
- Sanyukta2001 (1)
- lingxuan85511 (1)
- lucadesabato (1)
- yzy990924 (1)
- cwhittaker1000 (1)
- sweety919 (1)
- ShaneSchroeder (1)
- liyupeng111 (1)
- joicy (1)
Pull Request Authors
- bkotzen (56)
- barkasn (5)
- fritzo (4)
- martinjankowiak (2)
- corneliusroemer (2)
- JacobLemieux (1)
Dependencies
- biopython >=1.54
- colorcet *
- geopy *
- gpytorch *
- mappy *
- protobuf *
- pyro-ppl >=1.7
- scikit-learn *
- tqdm *
- umap-learn *
- actions/checkout v2 composite
- actions/setup-python v2 composite