https://github.com/cthoyt/ggn-go
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.1%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: cthoyt
- License: mit
- Default Branch: main
- Size: 378 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of MiJia-ID/GGN-GO
Created almost 2 years ago · Last pushed almost 2 years ago
https://github.com/cthoyt/GGN-GO/blob/main/
# GGN-GO: A Geometric Graph Network Method for Protein Function Prediction Using Multi-scale Structural Features
[DOI: 10.5281/zenodo.13768952](https://zenodo.org/doi/10.5281/zenodo.13768952)
## Installation
Clone the repository, create the conda environment, and install the dependencies:
- `git clone https://github.com/MiJia-ID/GGN-GO.git`
- `conda env create -f environment.yml`
- `pip install .` or `pip install -r requirements.txt`
You also need to install the packages required to run the ESM2 and ProtTrans protein language models. \
The ESM model weights we use can be downloaded [here](https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S.pt) (note that the linked file is `esm1b_t33_650M_UR50S.pt`). \
The ProtTrans model weights we use can be downloaded [here](https://github.com/agemagician/ProtTrans).
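Before running anything, it can help to confirm that the language-model packages are importable. The sketch below is only an illustration: the module names `torch`, `esm`, and `transformers` are assumptions about what this project needs (the README does not list them), so adjust `ASSUMED_MODULES` to match `environment.yml` or `requirements.txt`.

```python
from importlib.util import find_spec

# Assumed module names, not taken from this README: adjust to your environment.
ASSUMED_MODULES = ("torch", "esm", "transformers")


def missing_modules(names=ASSUMED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed modules are importable.")
```

`find_spec` only checks that a module can be located; it does not import it or verify that model weights are present.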
## Model training
## Run GGN-GO for prediction
Simply run:
```
python predict.py --fasta_file ../Predict/6GEN-G.fasta
```
The prediction results are saved in
```
../Predict/
```
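To predict several proteins in one go, the command above can be wrapped in a small loop. This is a minimal sketch, not part of the repository: `build_command` and `predict_all` are hypothetical helpers, only the `--fasta_file` flag shown above is used, and the script is assumed to be started from the directory that contains `predict.py`.

```python
import subprocess
import sys
from pathlib import Path


def build_command(fasta_path: Path) -> list[str]:
    """Build the predict.py invocation for one FASTA file."""
    # Only the --fasta_file flag documented in the README is used here.
    return [sys.executable, "predict.py", "--fasta_file", str(fasta_path)]


def predict_all(fasta_dir: Path, dry_run: bool = True) -> list[list[str]]:
    """Run (or, with dry_run, just print) predict.py for every *.fasta file."""
    commands = [build_command(p) for p in sorted(fasta_dir.glob("*.fasta"))]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing prediction.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Assumed input directory, matching the example path in the README.
    predict_all(Path("../Predict"), dry_run=True)
```

With `dry_run=True` the commands are only printed, which makes it easy to check the paths before starting a long run.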
The default model parameters (`model_bp.pt`, `model_cc.pt`, and `model_mf.pt`) are trained on the combined PDBch and AFch training sets.\
Alternatively, you can use the parameters trained only on the PDBch training set (`model_bp_pdb.pt`, `model_cc_pdb.pt`, and `model_mf_pdb.pt`).
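The file names follow a regular pattern, which the sketch below spells out. `model_filename` is a hypothetical helper for illustration only; how `predict.py` actually selects its parameter file is not described here. The `bp`, `cc`, and `mf` suffixes presumably correspond to the three Gene Ontology namespaces (biological process, cellular component, molecular function).

```python
# Suffixes as they appear in the file names listed in the README.
ONTOLOGIES = ("bp", "cc", "mf")


def model_filename(ontology: str, pdb_only: bool = False) -> str:
    """Return the parameter file name for an ontology and training set."""
    if ontology not in ONTOLOGIES:
        raise ValueError(f"ontology must be one of {ONTOLOGIES}, got {ontology!r}")
    # The '_pdb' variants are trained on the PDBch training set only.
    suffix = "_pdb" if pdb_only else ""
    return f"model_{ontology}{suffix}.pt"


for onto in ONTOLOGIES:
    print(model_filename(onto), model_filename(onto, pdb_only=True))
```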
### Output
```txt
### Predictions made by GGN-GO.
Protein GO_term Score GO_term_name
6GEN-G GO:0003677 0.96044 DNA binding
6GEN-G GO:0046982 0.95577 protein heterodimerization activity
6GEN-G GO:0046983 0.93472 protein dimerization activity
```
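For downstream analysis, the output above can be read into records and filtered by score. The following is a minimal sketch, assuming the columns are whitespace-separated as displayed (the actual delimiter in the file may differ); `Prediction` and `parse_predictions` are illustrative names, not part of GGN-GO.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    protein: str
    go_term: str
    score: float
    go_term_name: str


def parse_predictions(text: str) -> list[Prediction]:
    """Parse GGN-GO style prediction text into a list of records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the leading '###' comment line.
        if not line or line.startswith("#"):
            continue
        # Split into at most 4 fields so multi-word GO term names stay intact.
        parts = line.split(None, 3)
        # Skip the column header and any incomplete line.
        if len(parts) < 4 or parts[:2] == ["Protein", "GO_term"]:
            continue
        protein, go_term, score, name = parts
        records.append(Prediction(protein, go_term, float(score), name))
    return records


sample = """### Predictions made by GGN-GO.
Protein GO_term Score GO_term_name
6GEN-G GO:0003677 0.96044 DNA binding
6GEN-G GO:0046982 0.95577 protein heterodimerization activity
6GEN-G GO:0046983 0.93472 protein dimerization activity
"""

# Keep only predictions at or above an arbitrary example threshold.
confident = [p for p in parse_predictions(sample) if p.score >= 0.95]
for p in confident:
    print(p.go_term, p.score, p.go_term_name)
```

The 0.95 threshold is only an example value; choose a cutoff appropriate for your use case.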
## Dataset and model
We provide the datasets for those interested in reproducing the results of our paper. The datasets used in this study are stored in `../Data/`.
The trained GGN-GO models can be found under `../Model/`.
Owner
- Name: Charles Tapley Hoyt
- Login: cthoyt
- Kind: user
- Location: Bonn, Germany
- Company: RWTH Aachen University
- Website: https://cthoyt.com
- Repositories: 489
- Profile: https://github.com/cthoyt