Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: biorxiv.org, nature.com
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.3%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: YaoYinYing
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Size: 230 MB
Statistics
  • Stars: 7
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 4 years ago · Last pushed almost 4 years ago
Metadata Files
Readme License Codemeta

README.md

FoldDock

This repository contains the simultaneous folding and docking protocol FoldDock.

The protocol has been developed on 216 heterodimeric complexes from Dockground and tested on 1481 heterodimeric complexes extracted from the PDB.

The protocol uses the recently published state-of-the-art end-to-end protein structure predictor AlphaFold2 to predict the structure of heterodimeric complexes.

AlphaFold2 is available under the Apache License, Version 2.0, and so is FoldDock, which is a derivative thereof. The AlphaFold2 parameters are made available under the terms of the CC BY 4.0 license and have not been modified.

You may not use these files except in compliance with the licenses.

The success rate of the final protocol is 63% on the test set. By analyzing the predicted interfaces, we are able to distinguish accurate models with an AUC of 0.94 on the test set. For more information on this pipeline and its performance, see "Improved prediction of protein-protein interactions using AlphaFold2 and extended multiple-sequence alignments".

2FYU | DockQ=0.94

The entire 2FYU complex is depicted in gray and the two modeled chains C in green and D in blue.

Installation

This repository contains a patched version of AlphaFold2, and all required packages are supplied by running the commands inside a Singularity image. The only requirement for running FoldDock is therefore Singularity, which can be installed by following: https://sylabs.io/guides/3.0/user-guide/quick_start.html

To obtain FoldDock do:

git clone https://gitlab.com/ElofssonLab/FoldDock.git

Then, to set up any remaining requirements, run:

cd FoldDock
bash setup.sh

Full Pipeline

The full procedure described in Improved prediction of protein-protein interactions using AlphaFold2 and extended multiple-sequence alignments is provided through the run_pipeline.sh script.

To launch it, we recommend creating a folder containing the target protein input file(s) and running the script from inside that folder:

FOLDDOCK_PATH="absolute path to FoldDock directory on your system"
bash $FOLDDOCK_PATH/run_pipeline.sh file1 file2

Run this way, the script accepts both files in either fasta (sequence) or a3m (multiple sequence alignment) format. Structure files can also be used to derive the input sequences and to compare the final models with the real structures (this is NOT a way to provide templates for modelling). To do this, specify the input structures in .pdb or .cif format together with one chain id per structure, as follows:

FOLDDOCK_PATH="absolute path to FoldDock directory on your system"
bash $FOLDDOCK_PATH/run_pipeline.sh file1 file2 chain_id1 chain_id2
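The two invocation forms above can be assembled programmatically. The following is a minimal sketch, not part of FoldDock: the helper name `build_pipeline_command` is hypothetical, and the only facts it relies on are the accepted extensions, the same-format rule, and the argument order stated in this README.

```python
import os
from typing import List, Optional, Sequence

# Extensions listed in this README as valid input endings.
SEQUENCE_EXTENSIONS = (".fasta", ".a3m")
STRUCTURE_EXTENSIONS = (".pdb", ".cif")


def build_pipeline_command(
    folddock_path: str,
    file1: str,
    file2: str,
    chain_ids: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper: build the argv list for run_pipeline.sh."""
    ext1 = os.path.splitext(file1)[1]
    ext2 = os.path.splitext(file2)[1]
    allowed = SEQUENCE_EXTENSIONS + STRUCTURE_EXTENSIONS
    if ext1 not in allowed or ext2 not in allowed:
        raise ValueError("input names must end with .fasta, .a3m, .pdb or .cif")
    if ext1 != ext2:
        raise ValueError("both input files must use the same format")

    script = os.path.join(folddock_path, "run_pipeline.sh")
    command = ["bash", script, file1, file2]

    if ext1 in STRUCTURE_EXTENSIONS:
        # Structure input needs one chain id per file.
        if chain_ids is None or len(chain_ids) != 2:
            raise ValueError("structure input requires exactly two chain ids")
        command.extend(chain_ids)
    elif chain_ids:
        raise ValueError("chain ids only apply to .pdb/.cif input")
    return command


if __name__ == "__main__":
    # Prints the command only; pass it to subprocess.run to execute it.
    print(build_pipeline_command("/path/to/FoldDock", "1ay7.pdb", "1ay7.pdb", ["A", "B"]))
```

The returned list can be handed to `subprocess.run(command, check=True)` from inside the run folder, which avoids shell quoting problems with paths that contain spaces.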

To test that everything works properly you can run:

```
FOLDDOCK_PATH="absolute path to FoldDock directory on your system"
cd $FOLDDOCK_PATH/test/

bash $FOLDDOCK_PATH/run_pipeline.sh 1ay7_u1.fasta 1ay7_u2.fasta
```

Substitute the last command with the following one to test the pipeline with a pre-generated MSA:

bash $FOLDDOCK_PATH/run_pipeline.sh 1ay7_A.a3m 1ay7_B.a3m

Launch the next one to test the pipeline with an input structure instead:

bash $FOLDDOCK_PATH/run_pipeline.sh 1ay7.pdb 1ay7.pdb A B

The full pipeline execution takes approximately 30 minutes on a setup with an Intel i5 9600K 4.7 GHz CPU and an Nvidia RTX 2080 Super GPU.

The predicted interface lDDT (plDDT) and the number of contacts in the predicted interface can be used to score modeled structures using a simple sigmoidal function. We create a continuous scoring function using these metrics:

$$\mathrm{pDockQ} = \frac{L}{1 + \exp(-k(x - x_0))} + b$$

where $x = \text{average interface plDDT} \cdot \log(\text{number of interface contacts})$, $L = 0.724$, $x_0 = 152.611$, $k = 0.052$ and $b = 0.018$.

To calculate the pDockQ score for a predicted complex, run ./src/pdockq.py and provide the PDB file and the .pkl file from AlphaFold2. The script score.sh contains an example of how to run the scoring on a predicted structure. Note that the PDB file has to be rewritten to contain two chains (see score.sh).

bash score.sh

Using pDockQ results in an average error of 0.1, which enables separation of acceptable models (DockQ ≥ 0.23) with an AUC of 0.95.
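As an illustration of the sigmoid, the score can be evaluated directly once the average interface plDDT and the number of interface contacts are known. This is a minimal sketch, not the code in ./src/pdockq.py: the function name is hypothetical, the logarithm is assumed to be base 10 (the README only writes "log"), and returning 0 when no contacts are found is also an assumption. Extracting the two inputs from the PDB and .pkl files is left to the actual script.

```python
import math

# Sigmoid parameters as stated in this README.
L = 0.724
X0 = 152.611
K = 0.052
B = 0.018


def pdockq(avg_interface_plddt: float, n_interface_contacts: int) -> float:
    """Evaluate the pDockQ sigmoid for one predicted complex (sketch)."""
    if n_interface_contacts < 1:
        # Assumption: with no interface contacts the score is set to 0.
        return 0.0
    # Assumption: base-10 logarithm of the contact count.
    x = avg_interface_plddt * math.log10(n_interface_contacts)
    return L / (1.0 + math.exp(-K * (x - X0))) + B


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the call.
    print(round(pdockq(85.0, 120), 3))
```

At $x = x_0$ the sigmoid sits at its midpoint, so the score equals $L/2 + b = 0.380$; the score approaches $b$ for small $x$ and $L + b$ for large $x$.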


IMPORTANT:
- file1 and file2 names always need to end with .fasta, .a3m, .pdb or .cif.
- The same format must be used for both files.
- Each sequence contained in fasta and MSA files needs to be on a single line.
- Running the pipeline with structure input will yield a DockQ score as well, after comparing the final models with the two input chains joined in the same structure.
- The pipeline will yield several intermediate files in the working directory where it is launched, so we recommend creating a dedicated folder for each run.
- This pipeline is not optimised for homodimers. Homodimers are favorable to run with unpaired alignments, using e.g. only the fused alignments.
- Note that pDockQ is not calibrated for overlapping proteins, which can be observed when modelling some homomeric proteins.
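Many FASTA files wrap sequences over several lines, which conflicts with the single-line requirement above. The following is a minimal sketch of one way to unwrap them before a run; the function name `unwrap_sequences` is hypothetical and not part of FoldDock, and it assumes that header lines start with ">" and that every other non-empty line belongs to the preceding record.

```python
from typing import Iterable, Iterator, List


def unwrap_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA/a3m lines with each record's sequence joined on one line."""
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record's sequence.
            if chunks:
                yield "".join(chunks)
                chunks = []
            yield line
        else:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


if __name__ == "__main__":
    wrapped = [">chain_A\n", "MKVLAA\n", "GGTTLL\n", ">chain_B\n", "AC\n", "DE\n"]
    for out_line in unwrap_sequences(wrapped):
        print(out_line)
```

To use it on a file, iterate over the open input file and write each yielded line, followed by a newline, to a new file inside the dedicated run folder.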

Copyright 2021 Patrick Bryant, Gabriele Pozzati and Arne Elofsson

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Owner

  • Name: Yinying Yao
  • Login: YaoYinYing
  • Kind: user
  • Location: Kunming, Yunnan, China
  • Company: Huazhong Agricultural University

🧩PhD student in 🧬protein design🐱🐶Cat&dog lover w/o pets🌈INFP☕️Wild barista(not too bad)💻Glue coder📸Random photographer🤖ChatGPT prompt speaker

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "license": "https://spdx.org/licenses/Apache-2.0",
  "codeRepository": "git+https://gitlab.com/ElofssonLab/FoldDock.git",
  "dateCreated": "2022-02-02",
  "datePublished": "2022-02-02",
  "dateModified": "2022-02-02",
  "issueTracker": "https://gitlab.com/ElofssonLab/FoldDock/issues/",
  "name": "FoldDock",
  "description": "This repository contains the simultaneous folding and docking protocol FoldDock.\nThe protocol has been developed on 216 heterodimeric complexes from Dockground (http://dockground.compbio.ku.edu/downloads/unbound/benchmark4.tar.bz2)\nand tested on 1481 heterodimeric complexes extracted from the PDB (https://www.nature.com/articles/s41467-021-21636-z).\nThe protocol uses the recently published state-of-the-art end-to-end protein structure predictor AlphaFold2 (https://github.com/deepmind/alphafold) to predict the structure of heterodimeric complexes.\nThe success rate of the final protocol is 63% on the test set. By analyzing the predicted interfaces, we are able to distinguish accurate models with an AUC of 0.94 on the test set. For more information on this pipeline and its performance see\nImproved prediction of protein-protein interactions using AlphaFold2 and extended multiple-sequence alignments"
}

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 3
  • Total Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
yaoyy.hi@gmail.com 3****g 3

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0