https://github.com/anton-bushuiev/proteinttt

Training on test proteins improves fitness, structure, and function prediction

Repository

Basic Info
  • Host: GitHub
  • Owner: anton-bushuiev
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 1.13 MB
Statistics
  • Stars: 11
  • Watchers: 5
  • Forks: 2
  • Open Issues: 1
  • Releases: 1
Created over 1 year ago · Last pushed 7 months ago
Metadata Files
Readme License

README.md

ProteinTTT

Example of TTT applied to protein folding

ProteinTTT is a package that allows you to use test-time training (TTT) to improve the performance of protein language models via on-the-fly per-protein customization.

🚨 The repository is under active development.

Installation

Please first install the model you are planning to use with TTT and then install ProteinTTT:

```bash
git clone https://github.com/anton-bushuiev/ProteinTTT && pip install -e ProteinTTT
```

Usage

In the following example, we use ESMFold + TTT to predict the structure of a protein. Here, customizing ESMFold with TTT roughly doubles the mean pLDDT of the predicted structure (from 38.4 to 78.7).

```python
import torch
import esm
import biotite.structure.io as bsio
from proteinttt.models.esmfold import ESMFoldTTT, DEFAULT_ESMFOLD_TTT_CFG

# Set your sequence
sequence = "GIHLGELGLLPSTVLAIGYFENLVNIICESLNMLPKLEVSGKEYKKFKFTIVIPKDLDANIKKRAKIYFKQKSLIEIEIPTSSRNYPIHIQFDENSTDDILHLYDMPTTIGGIDKAIEMFMRKGHIGKTDQQKLLEERELRNFKTTLENLIATDAFAKEMVEVIIEE"

# Load model
model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

def predict_structure(model, sequence):
    with torch.no_grad():
        output = model.infer_pdb(sequence)

    with open("result.pdb", "w") as f:
        f.write(output)

    struct = bsio.load_structure("result.pdb", extra_fields=["b_factor"])
    print('pLDDT:', struct.b_factor.mean())

predict_structure(model, sequence)
# pLDDT: 38.43025

# ================ TTT ================
ttt_cfg = DEFAULT_ESMFOLD_TTT_CFG
ttt_cfg.steps = 10  # This is how you can modify config
model = ESMFoldTTT.ttt_from_pretrained(model, ttt_cfg=ttt_cfg, esmfold_config=model.cfg)
model.ttt(sequence)
# =====================================

predict_structure(model, sequence)
# pLDDT: 78.69619

# Reset model to original state (after this model.ttt can be called again on another protein)
# ================ TTT ================
model.ttt_reset()
# =====================================
```
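Because `model.ttt_reset()` restores the original weights, the customize/predict/reset cycle can be repeated for several proteins. The following is a minimal sketch of such a loop, not part of ProteinTTT: `predict_with_ttt` is a hypothetical helper, and it assumes only that the model exposes the `ttt(sequence)` and `ttt_reset()` methods used in the example above.

```python
from typing import Callable, Dict, Iterable


def predict_with_ttt(model, sequences: Iterable[str], predict: Callable) -> Dict[str, object]:
    """Customize `model` on each sequence, predict, then restore the weights.

    Hypothetical helper: relies only on `model.ttt(sequence)` and
    `model.ttt_reset()`; `predict(model, sequence)` is any user-supplied
    callable that returns the prediction.
    """
    results = {}
    for sequence in sequences:
        try:
            model.ttt(sequence)  # per-protein customization
            results[sequence] = predict(model, sequence)
        finally:
            # Always restore the original state so that the next protein
            # starts from the pretrained weights, even if a step failed.
            model.ttt_reset()
    return results
```

Note that `predict_structure` as written above prints the pLDDT and returns nothing, so to collect results with this helper the `predict` callable would need to return a value (for example, the PDB string).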

See notebooks/demo.ipynb for more usage examples.
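The usage example reads pLDDT from the B-factor column of the predicted PDB file via biotite. As an illustration of what that number is, here is a standard-library sketch that averages the same column directly from PDB text. `mean_plddt` is a hypothetical helper, not part of ProteinTTT, and it assumes fixed-column PDB `ATOM` records with the B-factor in columns 61-66.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor field (PDB columns 61-66) over all ATOM records.

    Assumption: the PDB text stores per-atom pLDDT in the B-factor field,
    as in the ESMFold output used in the example above.
    """
    values = [
        float(line[60:66])
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and len(line) >= 66
    ]
    if not values:
        raise ValueError("no ATOM records with a B-factor field found")
    return sum(values) / len(values)
```

This is an atom-level mean, so for a PDB string that contains only `ATOM` records of a single model it should correspond to `struct.b_factor.mean()` in the example, e.g. `mean_plddt(model.infer_pdb(sequence))`.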

References

If you use ProteinTTT in your research, please cite the following paper:

```bibtex
@article{bushuiev2024training,
  title={Training on test proteins improves fitness, structure, and function prediction},
  author={Bushuiev, Anton and Bushuiev, Roman and Zadorozhny, Nikola and Samusevich, Raman and St{\"a}rk, Hannes and Sedlar, Jiri and Pluskal, Tom{\'a}{\v{s}} and Sivic, Josef},
  journal={arXiv preprint arXiv:2411.02109},
  url={https://arxiv.org/abs/2411.02109},
  doi={10.48550/arXiv.2411.02109},
  year={2024}
}
```

Owner

  • Name: Anton Bushuiev
  • Login: anton-bushuiev
  • Kind: user
  • Location: Prague
  • Company: Czech Technical University in Prague

PhD student. Machine learning / computational biology 🤖🌱

GitHub Events

Total
  • Release event: 1
  • Watch event: 11
  • Member event: 1
  • Push event: 28
  • Pull request event: 8
  • Fork event: 2
  • Create event: 3
Last Year
  • Release event: 1
  • Watch event: 11
  • Member event: 1
  • Push event: 28
  • Pull request event: 8
  • Fork event: 2
  • Create event: 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 4
  • Average time to close issues: N/A
  • Average time to close pull requests: about 10 hours
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 4
  • Average time to close issues: N/A
  • Average time to close pull requests: about 10 hours
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • roman-bushuiev (3)
Top Labels
Issue Labels
Pull Request Labels