https://github.com/cthoyt/biogrid-embeddings
Learned embeddings for proteins using physical interactions in BioGRID
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.8%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
biogrid-embeddings
This repository generates node2vec embeddings for physical interactions between human proteins and complexes in BioGRID.
🚀 Usage
Installing the requirements and running the build script are both handled by tox. The current
version of BioGRID is looked up with bioversions so that the
provenance of the data can be traced. Run with:
```shell
$ pip install tox
$ tox
```
The embedding dataframe can be loaded from GitHub with:
```python
import pandas as pd

url = "https://github.com/cthoyt/biogrid-embeddings/raw/main/output/4.4.200/embeddings.tsv"
# skiprows=1 drops the word2vec header line; the first column becomes the index
df = pd.read_csv(url, sep="\t", skiprows=1, index_col=0, header=None)
```
The index uses UniProt protein identifiers. The first line is skipped because this TSV file uses the word2vec format, whose first line gives the number of vectors (rows) and their dimensionality (columns).
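To make the file layout concrete, here is a minimal sketch that parses a word2vec-style TSV using only the standard library and checks the header against the rows. The sample rows, their values, and the helper name `read_word2vec_tsv` are made up for illustration, and the sketch assumes the header fields are whitespace-separated; the real file may differ in detail.

```python
import csv
import io

# Made-up sample in the described layout: a header line with the number of
# vectors and their dimensionality, then one row per UniProt identifier.
SAMPLE = "2\t3\nP04637\t0.1\t-0.2\t0.3\nQ9Y6K9\t0.0\t0.5\t-0.4\n"


def read_word2vec_tsv(handle):
    """Parse a word2vec-style TSV into {identifier: [floats]} (hypothetical helper)."""
    # Assumption: the header holds two whitespace-separated integers.
    n_rows, n_dims = (int(part) for part in handle.readline().split())
    reader = csv.reader(handle, delimiter="\t")
    vectors = {row[0]: [float(x) for x in row[1:]] for row in reader if row}
    # Check that the header agrees with what was actually read.
    if len(vectors) != n_rows:
        raise ValueError(f"expected {n_rows} rows, found {len(vectors)}")
    if any(len(vec) != n_dims for vec in vectors.values()):
        raise ValueError(f"expected {n_dims} dimensions in every row")
    return vectors


vectors = read_word2vec_tsv(io.StringIO(SAMPLE))
```

Validating the header this way catches truncated downloads, which `skiprows=1` in pandas would silently accept.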
Here's a 2D PCA scatterplot of the embeddings:
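A projection of this kind can be approximated with a few lines of NumPy. The sketch below is not the repository's plotting code: it runs PCA via singular value decomposition on random stand-in vectors, and the function name `pca_2d` and the matrix size are assumptions chosen for illustration. With the real data, `df.to_numpy()` would take the place of the random matrix.

```python
import numpy as np


def pca_2d(matrix):
    """Project the rows of `matrix` onto their first two principal components."""
    # Center each column so the components capture variance, not the mean.
    centered = matrix - matrix.mean(axis=0)
    # Rows of `vt` are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Stand-in for the embedding matrix: 50 random 8-dimensional vectors.
rng = np.random.default_rng(0)
stand_in = rng.normal(size=(50, 8))
coords = pca_2d(stand_in)

# With matplotlib installed, the scatterplot itself would be roughly:
#   import matplotlib.pyplot as plt
#   plt.scatter(coords[:, 0], coords[:, 1], s=5)
#   plt.show()
```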

⚖️ License
Code in this repository is licensed under the MIT License.
🙏 Acknowledgements
BioGRID can be cited with:
```bibtex
@article{Oughtred2021,
  author = {Oughtred, Rose and Rust, Jennifer and Chang, Christie and Breitkreutz, Bobby-Joe and Stark, Chris and Willems, Andrew and Boucher, Lorrie and Leung, Genie and Kolas, Nadine and Zhang, Frederick and Dolma, Sonam and Coulombe-Huntington, Jasmin and Chatr-Aryamontri, Andrew and Dolinski, Kara and Tyers, Mike},
  doi = {10.1002/pro.3978},
  journal = {Protein science : a publication of the Protein Society},
  keywords = {COVID-19,CRISPR screen,biological network,chemical interaction,drug target,genetic interaction,phenotype,post-translational modification,protein interaction,ubiquitin-proteasome system},
  number = {1},
  pages = {187--200},
  pmid = {33070389},
  title = {{The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions.}},
  volume = {30},
  year = {2021}
}
```
Owner
- Name: Charles Tapley Hoyt
- Login: cthoyt
- Kind: user
- Location: Bonn, Germany
- Company: RWTH Aachen University
- Website: https://cthoyt.com
- Repositories: 489
- Profile: https://github.com/cthoyt
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Charles Tapley Hoyt | c****t@g****m | 14 |
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0