quantumpdb

Workflow for generating a database of proteins with quantum properties

https://github.com/davidkastner/quantumpdb

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.8%) to scientific vocabulary

Keywords

enzymes machine-learning quantum-chemistry
Last synced: 6 months ago

Repository

Workflow for generating a database of proteins with quantum properties

Basic Info
Statistics
  • Stars: 10
  • Watchers: 1
  • Forks: 3
  • Open Issues: 1
  • Releases: 0
Topics
enzymes machine-learning quantum-chemistry
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md


QuantumPDB

Table of Contents

  1. Overview
  2. Installation
    • Download the package
    • Creating python environment
    • Command-line interface
  3. What is included?
    • File structure
  4. Documentation
    • Read the Docs
    • Examples
  5. Developer Guide
    • GitHub refresher
  6. Areas of Active Development
  7. QuantumPDB File Structure

1. Overview

quantumPDB (qp) is a toolkit for working with our database of proteins. It sets up and facilitates DFT calculations and cluster model creation.

Software Diagram

2. Installation

Install the package by running the following commands inside the repository. This performs a development install. It is good practice to do this inside a virtual environment. A YAML environment file is provided to simplify installing the dependencies.

Set up the development environment

To begin working with quantumPDB, first clone the repo and then move into the top-level directory of the package. Then perform a developer install. Remember to update your GitHub SSH keys.

```bash
git clone git@github.com:davidkastner/quantumPDB.git
```

Creating python environment

All the dependencies can be loaded together using the prebuilt environment.yml file. Compatibility is automatically tested for Python versions 3.8 and higher. If you only plan to use the package, run:

```bash
cd quantumPDB
conda env create -f environment.yml
source activate qp
python -m pip install -e .
```

Command-line interface

All of the functionality of quantumPDB has been organized into a command-line interface (CLI). After performing the developer install, the CLI can be called from anywhere using qp.
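As a minimal sketch, the CLI can also be driven from a Python script. The `run -c <config>` form is taken from the file-structure section of this README; the helper names `build_qp_command` and `run_qp` are hypothetical and not part of quantumPDB.

```python
import shutil
import subprocess
from pathlib import Path


def build_qp_command(config_path):
    """Return the argument list for `qp run -c <config>` (hypothetical helper)."""
    return ["qp", "run", "-c", str(Path(config_path))]


def run_qp(config_path):
    """Run the CLI if it is on PATH; otherwise raise a descriptive error."""
    if shutil.which("qp") is None:
        raise FileNotFoundError("`qp` not found on PATH; activate the qp environment first")
    # check=True raises CalledProcessError if the command exits with a non-zero status
    return subprocess.run(build_qp_command(config_path), check=True)
```

Calling `run_qp("config.yaml")` only works after the developer install described above, since it relies on the `qp` entry point being available.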

3. What is included?

Package architecture

```
.
├── docs                          # Readthedocs documentation site
└── qp                            # quantumPDB subpackages and modules
    ├── cli.py                    # Command-line interface entry point
    ├── checks                    # Perform quality and structural checks
    │   ├── fetch_pdb             # Get a PDB
    │   └── to_xyz                # Save structure to a new XYZ
    ├── structure                 # Correct the PDB structure
    │   ├── missing_loops         # Use modeller to add missing loops
    │   └── add_hydrogens         # Get the structure with hydrogens
    ├── clusters                  # Generalizable plotting and visualization
    │   └── coordination_spheres  # Select the first, second, etc. spheres
    └── manager
        ├── failure_checkup
        ├── find_incomplete
        └── job_manager
```

4. Documentation

Run the following commands to update the ReadTheDocs site:

```bash
make clean
make html
```

5. Developer guide

GitHub refresher

Push new changes

Use this Git sequence to make a quick push.

```
git status
git pull
git add -A .
git commit -m "Change a specific functionality"
git push -u origin main
```

Making a pull request

Use this Git sequence to create a branch and open a pull request. Recommended for significant changes.

```
git checkout main
git pull

# Before you begin making changes, create a new branch
git checkout -b new-feature-branch
git add -A
git commit -m "Detailed commit message describing the changes"
git push -u origin new-feature-branch

# Visit github.com to add description, submit, merge the pull request

# Once finished on github.com, return to local
git checkout main
git pull

# Delete the local branch
git branch -d new-feature-branch
```

Handle merge conflict

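```
# Stash all local changes, including untracked files
git stash push --include-untracked
# Drop the stash; this discards the local changes
git stash drop
git pull
```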

6. Areas of active development

We are currently working on handling all edge cases, including non-canonical amino acids. Support for mmCIF files will eventually need to be added to work with newer and larger PDB entries. All functions in the code are documented, but the documentation should be supplemented with external usage examples.
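To illustrate the kind of edge case involved, the sketch below scans PDB-format text for residue names outside the 20 standard amino acids, using the fixed column positions of ATOM/HETATM records. It is not quantumPDB's own check: the function name and the `ignore` parameter are hypothetical, and the scan also reports ligands and metal ions because they appear as HETATM records.

```python
# The 20 standard amino acid residue names used in PDB files.
CANONICAL = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def find_nonstandard_residues(pdb_text, ignore=("HOH",)):
    """Return sorted (chain, residue number, residue name) tuples for
    residues whose name is neither canonical nor in `ignore`."""
    found = set()
    for line in pdb_text.splitlines():
        # Only coordinate records carry residue information
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 26:
            continue
        res_name = line[17:20].strip()  # columns 18-20
        chain = line[21]                # column 22
        res_num = int(line[22:26])      # columns 23-26
        if res_name not in CANONICAL and res_name not in ignore:
            found.add((chain, res_num, res_name))
    return sorted(found)
```

Water (`HOH`) is ignored by default; pass a larger `ignore` tuple to skip known ligands or ions.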

7. QuantumPDB generated file structure

An example file structure from a qp run, given the parameters `input: qp_input.csv` and `output_dir: dataset/` in config.yaml. If we run `qp run -c ./config.yaml` and qp_input.csv specifies 1a9s, the following file structure is generated in dataset:

```
.
├── config.yaml                  # The input yaml containing all `qp` job parameters
├── qp_input.csv                 # List of PDB IDs to run `qp`
└── dataset                      # Specified with "output_dir: dataset/" in config.yaml
    └── 1a9s                     # One of these is generated for each entry in qp_input.csv
        ├── 1a9s_modeller.pdb    # Modeller optimized structure with added missing atoms and loops
        ├── 1a9s.ali             # Alignment file needed by Modeller
        ├── 1a9s.pdb             # Original 1a9s PDB downloaded directly from the Protein Data Bank
        ├── charge.csv           # Generated file used to calculate the charge of the system
        ├── count.csv            # Generated file that keeps track of the residue counts
        ├── Protoss              # Directory storing all Protoss files
        │   ├── 1a9s_ligands.sdf # Ligand structural files
        │   ├── 1a9s_log.txt     # Error log from Protoss server
        │   └── 1a9s_protoss.pdb # Protonated result returned from the Protoss server
        └── A290                 # Generated for each cluster, named by chain (A) and the res number of center (290)
            ├── 0.pdb            # Structure of the center residue
            ├── 1.pdb            # Structure of the first sphere around the center
            ├── 2.pdb            # Structure of the second sphere around the first
            ├── A290.pdb         # Structure of the entire cluster in PDB format
            ├── A290.xyz         # Structure of the entire cluster in XYZ format
            └── wpbeh            # QM job file specified with "method: wpbeh" in "config.yaml"
                ├── A290.xyz     # Structure of the entire cluster in XYZ format
                ├── jobscript.sh # SLURM/SGE submit script
                ├── ptchrges.xyz # MM embedded point charges specified with "charge_embedding: true"
                ├── qmscript.in  # QM job input using TeraChem
                ├── qmscript.out # QM job output details
                └── wpbeh        # Results from the TeraChem QM calculation
                    ├── A290.basis
                    ├── A290.geometry
                    ├── A290.molden
                    ├── bond_order.list
                    ├── c0
                    ├── charge_mull.xls
                    ├── grad.xyz
                    ├── mullpop
                    ├── results.dat
                    └── xyz.xyz
```
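A short sketch of how this layout could be traversed after a run. It assumes cluster directories are always named with a single chain letter followed by the residue number (as in `A290`); `parse_cluster_name` and `list_clusters` are hypothetical helpers, not functions provided by quantumPDB.

```python
import re
from pathlib import Path

# Assumption: cluster directories look like "A290" (one chain letter + residue number).
CLUSTER_NAME = re.compile(r"^([A-Za-z])(\d+)$")


def parse_cluster_name(name):
    """Split a name such as 'A290' into ('A', 290); return None if it does not match."""
    match = CLUSTER_NAME.match(name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def list_clusters(output_dir):
    """Map each PDB ID directory under output_dir to its (chain, residue) clusters."""
    clusters = {}
    for pdb_dir in sorted(Path(output_dir).iterdir()):
        if not pdb_dir.is_dir():
            continue
        found = []
        for child in sorted(pdb_dir.iterdir()):
            # Files and non-cluster directories such as "Protoss" are skipped
            parsed = parse_cluster_name(child.name) if child.is_dir() else None
            if parsed is not None:
                found.append(parsed)
        clusters[pdb_dir.name] = found
    return clusters
```

For the example above, `list_clusters("dataset")` would map `1a9s` to a list containing `("A", 290)`, which can then be used to locate the per-cluster QM job folders.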

Copyright

Copyright (c) 2024, Kulik Group MIT

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.1.

Owner

  • Name: David W. Kastner
  • Login: davidkastner
  • Kind: user
  • Location: MIT Cambridge Massachusetts
  • Company: Massachusetts Institute of Technology

MIT PhD Candidate • Bioengineering

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "QuantumPDB: Database and Automated Electronic Structure Predictions for Proteins"
version: 1.0.0
date-released: 2024-03-15
authors:
  - family-names: "Kastner"
    given-names: "David W."
    orcid: "https://orcid.org/0000-0002-7766-4249"
    affiliation: "Massachusetts Institute of Technology"
  - family-names: "Ho"
    given-names: "Wilson"
    affiliation: "Massachusetts Institute of Technology"
  - family-names: "Luo"
    given-names: "Weiliang"
    affiliation: "Massachusetts Institute of Technology"
  - family-names: "Reinhardt"
    given-names: "Clorice"
    affiliation: "Massachusetts Institute of Technology"
abstract: "A database and workflow for DFT electronic structure predictions for proteins."
keywords:
  - "computational chemistry"
  - "protein structure"
repository-code: "https://github.com/davidkastner/quantumPDB"
license: "MIT"

GitHub Events

Total
  • Watch event: 3
  • Delete event: 9
  • Push event: 28
  • Pull request event: 17
  • Fork event: 1
  • Create event: 12
Last Year
  • Watch event: 3
  • Delete event: 9
  • Push event: 28
  • Pull request event: 17
  • Fork event: 1
  • Create event: 12

Dependencies

.github/workflows/CI.yaml actions
  • actions/checkout v3 composite
  • codecov/codecov-action v1 composite
  • mamba-org/provision-with-micromamba main composite
pyproject.toml pypi
environment.yml pypi