https://github.com/a-r-j/cpdb

Cython implementation of PDB -> DataFrame parsing

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.4%) to scientific vocabulary

Keywords

bioinformatics deep-learning machine-learning pdb pdb-files pdb-parser protein protein-structure structural-biology

Keywords from Contributors

molecule interactome
Last synced: 6 months ago

Repository

Cython implementation of PDB -> DataFrame parsing

Basic Info
  • Host: GitHub
  • Owner: a-r-j
  • License: mit
  • Language: Cython
  • Default Branch: main
  • Homepage:
  • Size: 18.7 MB
Statistics
  • Stars: 23
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
bioinformatics deep-learning machine-learning pdb pdb-files pdb-parser protein protein-structure structural-biology
Created almost 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme License

README.md

CPDB

Cython implementation of PDB -> DataFrame parsing

Installation

`pip install cpdb-protein`

Usage

To Dictionary

```python
# To dictionary
from cpdb import parse

# From disk
data = parse("path_to_pdb.pdb", df=False)
data = parse("path_to_pdb.pdb.gz", df=False)

# From str
with open("tests/test_data/1htq.pdb") as f:
    pdb_file = f.readlines()
data = parse(pdb_str=pdb_file, df=False)

# From PDB
data = parse(pdb_code="3eiy", df=False)

# From AF2
data = parse(uniprot_id="Q8W3K0", df=False)
```

    {'record_name': array(['ATOM', 'ATOM', 'ATOM', ..., 'HETATM', 'HETATM', 'HETATM'], dtype=object),
     'atom_number': array([ 1, 2, 3, ..., 1773, 1774, 1775], dtype=int32),
     'atom_name': array(['N', 'CA', 'C', ..., 'O', 'O', 'O'], dtype=object),
     'alt_loc': array(['', '', '', ..., '', '', ''], dtype=object),
     'residue_name': array(['GLY', 'GLY', 'GLY', ..., 'HOH', 'HOH', 'HOH'], dtype=object),
     'chain_id': array(['A', 'A', 'A', ..., 'A', 'A', 'A'], dtype=object),
     'residue_number': array([ 30, 30, 30, ..., 2276, 2277, 2278], dtype=int32),
     'insertion': array(['', '', '', ..., '', '', ''], dtype=object),
     'x_coord': array([31.203, 32.02 , 33.358, ..., 44.665, 41.786, 38.498], dtype=float32),
     'y_coord': array([26.31 , 27.046, 26.387, ..., 13.172, 10.059, 12.491], dtype=float32),
     'z_coord': array([ 6.06 , 5.069, 4.79 , ..., 18.445, 22.316, 15.004], dtype=float32),
     'occupancy': array([0.5, 0.5, 0.5, ..., 1. , 1. , 1. ], dtype=float32),
     'b_factor': array([26.27, 29.29, 30.21, ..., 24.67, 34.64, 41.14], dtype=float32),
     'element_symbol': array(['N', 'C', 'C', ..., 'O', 'O', 'O'], dtype=object),
     'charge': array(['', '', '', ..., '', '', ''], dtype=object),
     'model_idx': array([1, 1, 1, ..., 1, 1, 1], dtype=int32)}
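The dictionary is column-oriented: each key holds one value per atom, and all arrays share the same length. As a hedged illustration of working with that layout, the sketch below picks out C-alpha atoms by walking the columns in parallel. The helper name `select_atoms` and the small `example` dictionary are hypothetical stand-ins built from the values shown above; they are not part of cpdb. Plain lists are used so the snippet runs without extra dependencies. On the real NumPy arrays a boolean mask would be the more idiomatic choice.

```python
def select_atoms(data, atom_name="CA", record_name="ATOM"):
    """Return (residue_number, (x, y, z)) tuples for atoms matching both names.

    `data` is any column-oriented mapping with the keys shown above; the
    columns may be lists or NumPy arrays, since they are only iterated.
    """
    rows = zip(
        data["record_name"],
        data["atom_name"],
        data["residue_number"],
        data["x_coord"],
        data["y_coord"],
        data["z_coord"],
    )
    return [
        (int(res), (float(x), float(y), float(z)))
        for rec, name, res, x, y, z in rows
        if rec == record_name and name == atom_name
    ]


# Hypothetical stand-in for the parser output, using a few atoms from above.
example = {
    "record_name": ["ATOM", "ATOM", "ATOM", "HETATM"],
    "atom_name": ["N", "CA", "C", "O"],
    "residue_number": [30, 30, 30, 2278],
    "x_coord": [31.203, 32.02, 33.358, 38.498],
    "y_coord": [26.31, 27.046, 26.387, 12.491],
    "z_coord": [6.06, 5.069, 4.79, 15.004],
}

ca_atoms = select_atoms(example)
print(ca_atoms)  # [(30, (32.02, 27.046, 5.069))]
```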

To Pandas DataFrame

```python
from cpdb import parse

# From disk
data = parse("path_to_pdb.pdb", df=True)
data = parse("path_to_pdb.pdb.gz", df=True)

# From str
with open("tests/test_data/1htq.pdb") as f:
    pdb_file = f.readlines()
data = parse(pdb_str=pdb_file, df=True)

# From PDB
data = parse(pdb_code="3eiy", df=True)

# From AF2
data = parse(uniprot_id="Q8W3K0", df=True)
```

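| | record_name | atom_number | atom_name | alt_loc | residue_name | chain_id | residue_number | insertion | x_coord | y_coord | z_coord | occupancy | b_factor | element_symbol | charge | model_idx |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ATOM | 1 | N | | GLY | A | 30 | | 31.202999 | 26.309999 | 6.060000 | 0.50 | 26.270000 | N | | 1 |
| 1 | ATOM | 2 | CA | | GLY | A | 30 | | 32.020000 | 27.046000 | 5.069000 | 0.50 | 29.290001 | C | | 1 |
| 2 | ATOM | 3 | C | | GLY | A | 30 | | 33.358002 | 26.386999 | 4.790000 | 0.50 | 30.209999 | C | | 1 |
| 3 | ATOM | 4 | O | | GLY | A | 30 | | 33.810001 | 25.535999 | 5.552000 | 0.50 | 29.299999 | O | | 1 |
| 4 | ATOM | 5 | N | | GLY | A | 31 | | 33.987000 | 26.789000 | 3.684000 | 0.50 | 31.889999 | N | | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1769 | HETATM | 1771 | O | | HOH | A | 2274 | | 42.688999 | 61.925999 | 29.589001 | 1.00 | 39.950001 | O | | 1 |
| 1770 | HETATM | 1772 | O | | HOH | A | 2275 | | 32.055000 | 62.648998 | 30.961000 | 0.66 | 15.680000 | O | | 1 |
| 1771 | HETATM | 1773 | O | | HOH | A | 2276 | | 44.665001 | 13.172000 | 18.445000 | 1.00 | 24.670000 | O | | 1 |
| 1772 | HETATM | 1774 | O | | HOH | A | 2277 | | 41.785999 | 10.059000 | 22.316000 | 1.00 | 34.639999 | O | | 1 |
| 1773 | HETATM | 1775 | O | | HOH | A | 2278 | | 38.498001 | 12.491000 | 15.004000 | 1.00 | 41.139999 | O | | 1 |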
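Most of these columns correspond to the fixed-width fields of `ATOM`/`HETATM` records in the PDB file format. The sketch below shows that mapping in pure Python to make the column meanings concrete. It is an illustration of the file format only and does not reflect how cpdb's Cython code is written. The names `FIELDS`, `parse_atom_line` and `SAMPLE_LINE` are hypothetical, and the sample record is rebuilt from the first row above. The `model_idx` column is omitted because it does not come from the atom line itself.

```python
# Column slices (0-based, end-exclusive) for ATOM/HETATM records in the
# fixed-width PDB format, keyed by the column names used in the output above.
FIELDS = {
    "record_name": (0, 6, str),
    "atom_number": (6, 11, int),
    "atom_name": (12, 16, str),
    "alt_loc": (16, 17, str),
    "residue_name": (17, 20, str),
    "chain_id": (21, 22, str),
    "residue_number": (22, 26, int),
    "insertion": (26, 27, str),
    "x_coord": (30, 38, float),
    "y_coord": (38, 46, float),
    "z_coord": (46, 54, float),
    "occupancy": (54, 60, float),
    "b_factor": (60, 66, float),
    "element_symbol": (76, 78, str),
    "charge": (78, 80, str),
}


def parse_atom_line(line):
    """Split one ATOM/HETATM line into a dict of typed, whitespace-stripped fields."""
    return {
        name: cast(line[start:end].strip())
        for name, (start, end, cast) in FIELDS.items()
    }


# Hypothetical 80-character sample record, assembled with explicit field
# widths so every value lands in the right columns.
SAMPLE_LINE = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{'':1}{'GLY':>3} {'A'}{30:>4}{'':1}{'':3}"
    f"{31.203:>8.3f}{26.310:>8.3f}{6.060:>8.3f}{0.50:>6.2f}{26.27:>6.2f}"
    f"{'':10}{'N':>2}{'':2}"
)

record = parse_atom_line(SAMPLE_LINE)
print(record["atom_name"], record["residue_number"], record["x_coord"])  # N 30 31.203
```

Empty fields such as `alt_loc`, `insertion` and `charge` come back as empty strings, which matches the blank cells in the table above.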

Owner

  • Name: Arian Jamasb
  • Login: a-r-j
  • Kind: user
  • Location: Basel
  • Company: University of Cambridge

Principal ML Scientist @PrescientDesign / Tensor Jockey / PhD @ University of Cambridge Prev: MILA, Google X, Relation Therapeutic

GitHub Events

Total
  • Watch event: 11
Last Year
  • Watch event: 11

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 16
  • Total Committers: 2
  • Avg Commits per committer: 8.0
  • Development Distribution Score (DDS): 0.438
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| Arian Jamasb | a****b@r****m | 9 |
| Arian Jamasb | a****b@g****m | 7 |
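The Development Distribution Score reported above is consistent with the commit counts in this table. The sketch below assumes the commonly used definition, one minus the share of commits made by the most active committer. That definition is an assumption here, since the page does not state its formula, and the function name is hypothetical.

```python
def development_distribution_score(commit_counts):
    """Return 1 - (largest committer's commits / total commits), or 0.0 with no commits."""
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return 1 - max(commit_counts) / total


# Commit counts of the two committer entries listed above.
dds = development_distribution_score([9, 7])
print(round(dds, 3))  # 0.438
```

With no commits in the past year, the same function returns 0.0, matching the "Past Year" figure.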

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 4
  • Average time to close issues: 4 days
  • Average time to close pull requests: 3 minutes
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 6.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • flyfyyfyy (1)
Pull Request Authors
  • a-r-j (4)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 1,440 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: cpdb-protein
  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,440 Last month
Rankings
Dependent packages count: 7.3%
Downloads: 15.9%
Stargazers count: 17.9%
Average: 22.6%
Forks count: 30.4%
Dependent repos count: 41.3%
Maintainers (1)
Last synced: 6 months ago