https://github.com/brianhie/efficient-evolution

Efficient evolution from protein language models

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: nature.com, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: brianhie
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 80.1 KB
Statistics
  • Stars: 138
  • Watchers: 6
  • Forks: 38
  • Open Issues: 6
  • Releases: 0
Created about 4 years ago · Last pushed almost 3 years ago
Metadata Files
  • Readme
  • License

README.md

Efficient evolution from general protein language models

Scripts for running the analysis described in the paper "Efficient evolution of human antibodies from general protein language models".

Running the model

To evaluate the model on a new sequence, clone this repository and run

    python bin/recommend.py [sequence]

where [sequence] is the wildtype protein sequence you want to evolve. The script will output a list of substitutions and the number of recommending language models.
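As a rough illustration of what "the number of recommending language models" means, the sketch below tallies how many models recommend each substitution. It is not taken from the repository: the function name `count_recommendations`, the model names, and the toy substitutions are all hypothetical, and the real script may represent and order its results differently.

```python
from collections import Counter


def count_recommendations(per_model):
    """Count how many models recommend each substitution.

    per_model maps a model name to the substitutions that model recommends,
    each written as (wildtype residue, 1-based position, mutant residue).
    """
    votes = Counter()
    for substitutions in per_model.values():
        # A model counts at most once per substitution.
        votes.update(set(substitutions))
    # Most-recommended first; ties ordered by position, then mutant residue.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0][1], kv[0][2]))


# Hypothetical per-model recommendations (illustrative, not real model output).
toy = {
    "model_a": {("A", 40, "P"), ("S", 7, "T")},
    "model_b": {("A", 40, "P")},
    "model_c": {("A", 40, "P"), ("G", 55, "D")},
}

for (wt, pos, mt), n_models in count_recommendations(toy):
    print(f"{wt}{pos}{mt}\t{n_models}")
```

In this toy input, the substitution shared by all three models sorts first, which mirrors the idea of preferring substitutions that more models agree on.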

To recommend mutations to antibody variable domain sequences, we simply ran the above script separately on the heavy chain and light chain sequences.
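Running the script once per chain can be wrapped in a few lines of Python. This is a minimal sketch that assumes it is launched from the repository root; the helper names `recommend_command` and `run_all` are hypothetical, and the two sequences are truncated toy placeholders rather than real variable domains.

```python
import subprocess
import sys


def recommend_command(sequence, script="bin/recommend.py"):
    """Build the command line for one run of the recommendation script."""
    return [sys.executable, script, sequence]


# Toy placeholder sequences; substitute the full heavy and light chain
# variable domain sequences of your antibody.
chains = {
    "heavy": "EVQLVESGG",
    "light": "DIQMTQSPS",
}

commands = {name: recommend_command(seq) for name, seq in chains.items()}


def run_all(commands):
    """Run each chain separately, as described above (needs the repo and models)."""
    for name, cmd in commands.items():
        print(f"== {name} chain ==")
        subprocess.run(cmd, check=True)


# Call run_all(commands) from the repository root to execute both runs.
```

Each chain is treated as an independent input, so the two result lists are reported separately rather than merged.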

We have also made a Google Colab notebook available. However, this notebook downloads and installs the language models in full on every run, and it requires a Colab Pro instance because the models need more memory than the free version of Colab provides. When making many predictions, we recommend the local installation above, as this will allow you to cache and reuse the models.

Paper analysis scripts

To reproduce the analysis in the paper, first download and extract data with the commands:

    wget https://zenodo.org/record/6968342/files/data.tar.gz
    tar xvf data.tar.gz

To acquire mutations to a given antibody, run the command

    bash bin/eval_models.sh [antibody_name]

where [antibody_name] is one of medi8852, medi_uca, mab114, mab114_uca, s309, regn10987, or c143.
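To process every antibody in one pass, the command can be driven from Python. The sketch below assumes it is run from the repository root after the data has been extracted; the names `ANTIBODIES`, `eval_command`, and `run_all` are hypothetical and only the script path and antibody names come from the text above.

```python
import subprocess

# Antibody names accepted by bin/eval_models.sh, as listed above.
ANTIBODIES = (
    "medi8852", "medi_uca", "mab114", "mab114_uca",
    "s309", "regn10987", "c143",
)


def eval_command(antibody_name):
    """Build the evaluation command, rejecting names not in the list."""
    if antibody_name not in ANTIBODIES:
        raise ValueError(f"unknown antibody name: {antibody_name!r}")
    return ["bash", "bin/eval_models.sh", antibody_name]


def run_all():
    """Evaluate every listed antibody in turn, stopping at the first failure."""
    for name in ANTIBODIES:
        subprocess.run(eval_command(name), check=True)


# Call run_all() from the repository root to execute all seven evaluations.
```

Checking the name before launching the shell script gives an immediate error for a typo instead of a failure partway through a run.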

DMS experiments can be run with the command

    bash bin/dms.sh

Owner

  • Name: Brian Hie
  • Login: brianhie
  • Kind: user
  • Location: San Francisco

GitHub Events

Total
  • Watch event: 39
  • Issue comment event: 1
  • Fork event: 9
Last Year
  • Watch event: 39
  • Issue comment event: 1
  • Fork event: 9