https://github.com/a-r-j/cpdb

Cython implementation of PDB -> DataFrame parsing

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (7.4%) to scientific vocabulary

Keywords

bioinformatics deep-learning machine-learning pdb pdb-files pdb-parser protein protein-structure structural-biology

Keywords from Contributors

molecule interactome

Last synced: 6 months ago · JSON representation

Repository

Cython implementation of PDB -> DataFrame parsing

Basic Info

Host: GitHub
Owner: a-r-j
License: mit
Language: Cython
Default Branch: main
Homepage:
Size: 18.7 MB

Statistics

Stars: 23
Watchers: 2
Forks: 0
Open Issues: 0
Releases: 0

Topics

bioinformatics deep-learning machine-learning pdb pdb-files pdb-parser protein protein-structure structural-biology

Created almost 3 years ago · Last pushed over 2 years ago

Metadata Files

Readme License

CPDB

Cython implementation of PDB -> DataFrame parsing

Installation

```bash
pip install cpdb-protein
```

Usage

To Dictionary

```python
from cpdb import parse

# From disk
data = parse("path_to_pdb.pdb", df=False)
data = parse("path_to_pdb.pdb.gz", df=False)

# From str
with open("tests/test_data/1htq.pdb") as f:
    pdb_file = f.readlines()
data = parse(pdb_str=pdb_file, df=False)

# From PDB
data = parse(pdb_code="3eiy", df=False)

# From AF2
data = parse(uniprot_id="Q8W3K0", df=False)
```

{'record_name': array(['ATOM', 'ATOM', 'ATOM', ..., 'HETATM', 'HETATM', 'HETATM'], dtype=object),
 'atom_number': array([ 1, 2, 3, ..., 1773, 1774, 1775], dtype=int32),
 'atom_name': array(['N', 'CA', 'C', ..., 'O', 'O', 'O'], dtype=object),
 'alt_loc': array(['', '', '', ..., '', '', ''], dtype=object),
 'residue_name': array(['GLY', 'GLY', 'GLY', ..., 'HOH', 'HOH', 'HOH'], dtype=object),
 'chain_id': array(['A', 'A', 'A', ..., 'A', 'A', 'A'], dtype=object),
 'residue_number': array([ 30, 30, 30, ..., 2276, 2277, 2278], dtype=int32),
 'insertion': array(['', '', '', ..., '', '', ''], dtype=object),
 'x_coord': array([31.203, 32.02 , 33.358, ..., 44.665, 41.786, 38.498], dtype=float32),
 'y_coord': array([26.31 , 27.046, 26.387, ..., 13.172, 10.059, 12.491], dtype=float32),
 'z_coord': array([ 6.06 , 5.069, 4.79 , ..., 18.445, 22.316, 15.004], dtype=float32),
 'occupancy': array([0.5, 0.5, 0.5, ..., 1. , 1. , 1. ], dtype=float32),
 'b_factor': array([26.27, 29.29, 30.21, ..., 24.67, 34.64, 41.14], dtype=float32),
 'element_symbol': array(['N', 'C', 'C', ..., 'O', 'O', 'O'], dtype=object),
 'charge': array(['', '', '', ..., '', '', ''], dtype=object),
 'model_idx': array([1, 1, 1, ..., 1, 1, 1], dtype=int32)}
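
The dictionary is column-oriented: each key holds one value per atom, and position `i` in every column describes the same atom. A minimal sketch of selecting atoms by name from such a structure follows. It assumes plain Python lists as stand-ins for the numpy arrays so that it runs without dependencies, uses values from the first five atoms of the sample outputs in this README, and `select_atoms` is a hypothetical helper, not part of cpdb.

```python
# Stand-in for the dictionary returned by parse(..., df=False).
# Assumption: plain lists replace the numpy arrays; only a few columns are shown.
data = {
    "atom_name": ["N", "CA", "C", "O", "N"],
    "residue_name": ["GLY", "GLY", "GLY", "GLY", "GLY"],
    "residue_number": [30, 30, 30, 30, 31],
    "x_coord": [31.203, 32.02, 33.358, 33.81, 33.987],
    "y_coord": [26.31, 27.046, 26.387, 25.536, 26.789],
    "z_coord": [6.06, 5.069, 4.79, 5.552, 3.684],
}


def select_atoms(columns, atom_name):
    """Return (residue_number, (x, y, z)) for every atom with the given name.

    Hypothetical helper for illustration only; it walks the parallel
    columns in lockstep and keeps the rows whose atom_name matches.
    """
    rows = zip(
        columns["atom_name"],
        columns["residue_number"],
        columns["x_coord"],
        columns["y_coord"],
        columns["z_coord"],
    )
    return [(res, (x, y, z)) for name, res, x, y, z in rows if name == atom_name]


alpha_carbons = select_atoms(data, "CA")
print(alpha_carbons)
```

With the real numpy arrays, a boolean mask such as `data["atom_name"] == "CA"` applied to each coordinate column would achieve the same selection without an explicit loop.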

To Pandas DataFrame

```python
from cpdb import parse

# From disk
data = parse("path_to_pdb.pdb", df=True)
data = parse("path_to_pdb.pdb.gz", df=True)

# From str
with open("tests/test_data/1htq.pdb") as f:
    pdb_file = f.readlines()
data = parse(pdb_str=pdb_file, df=True)

# From PDB
data = parse(pdb_code="3eiy", df=True)

# From AF2
data = parse(uniprot_id="Q8W3K0", df=True)
```

      record_name  atom_number  atom_name  alt_loc  residue_name  chain_id  residue_number  insertion    x_coord    y_coord    z_coord  occupancy   b_factor  element_symbol  charge  model_idx
   0         ATOM            1          N                    GLY         A              30             31.202999  26.309999   6.060000       0.50  26.270000               N                  1
   1         ATOM            2         CA                    GLY         A              30             32.020000  27.046000   5.069000       0.50  29.290001               C                  1
   2         ATOM            3          C                    GLY         A              30             33.358002  26.386999   4.790000       0.50  30.209999               C                  1
   3         ATOM            4          O                    GLY         A              30             33.810001  25.535999   5.552000       0.50  29.299999               O                  1
   4         ATOM            5          N                    GLY         A              31             33.987000  26.789000   3.684000       0.50  31.889999               N                  1
 ...          ...          ...        ...      ...           ...       ...             ...        ...        ...        ...        ...        ...        ...             ...     ...        ...
1769       HETATM         1771          O                    HOH         A            2274             42.688999  61.925999  29.589001       1.00  39.950001               O                  1
1770       HETATM         1772          O                    HOH         A            2275             32.055000  62.648998  30.961000       0.66  15.680000               O                  1
1771       HETATM         1773          O                    HOH         A            2276             44.665001  13.172000  18.445000       1.00  24.670000               O                  1
1772       HETATM         1774          O                    HOH         A            2277             41.785999  10.059000  22.316000       1.00  34.639999               O                  1
1773       HETATM         1775          O                    HOH         A            2278             38.498001  12.491000  15.004000       1.00  41.139999               O                  1
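
The column names in both outputs correspond to the fixed-width fields of a PDB `ATOM`/`HETATM` record. The sketch below shows how one such line maps onto those names in pure Python. It is an illustration only, not cpdb's Cython implementation: the column positions follow the wwPDB format description, the sample line is reconstructed from the first row of the output above, and `FIELDS` and `parse_atom_line` are hypothetical names. `model_idx` is left out because it is not a field of the `ATOM` line itself.

```python
# (name, start, end, type) with 0-based, end-exclusive slices.
# PDB columns are 1-based, so "columns 31-38" becomes the slice [30:38].
FIELDS = [
    ("record_name", 0, 6, str),
    ("atom_number", 6, 11, int),
    ("atom_name", 12, 16, str),
    ("alt_loc", 16, 17, str),
    ("residue_name", 17, 20, str),
    ("chain_id", 21, 22, str),
    ("residue_number", 22, 26, int),
    ("insertion", 26, 27, str),
    ("x_coord", 30, 38, float),
    ("y_coord", 38, 46, float),
    ("z_coord", 46, 54, float),
    ("occupancy", 54, 60, float),
    ("b_factor", 60, 66, float),
    ("element_symbol", 76, 78, str),
    ("charge", 78, 80, str),
]

# Sample 80-character record, split into chunks so the columns are easy to check.
LINE = (
    "ATOM      1  N   GLY A  30    "  # columns 1-30
    "  31.203  26.310   6.060"        # columns 31-54 (x, y, z)
    "  0.50 26.27"                    # columns 55-66 (occupancy, B-factor)
    "           N  "                  # columns 67-80 (element, charge)
)


def parse_atom_line(line):
    """Slice one ATOM/HETATM line into a dict keyed by the column names above."""
    record = {}
    for name, start, end, cast in FIELDS:
        text = line[start:end].strip()
        if cast is str:
            record[name] = text
        else:
            # Blank numeric fields become None instead of raising ValueError.
            record[name] = cast(text) if text else None
    return record


atom = parse_atom_line(LINE)
print(atom["atom_name"], atom["residue_number"], atom["x_coord"])
```

Applying such a function to every coordinate line and collecting one list per key yields the column-oriented layout shown in the dictionary output.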

Owner

Name: Arian Jamasb
Login: a-r-j
Kind: user
Location: Basel
Company: University of Cambridge

Website: jamasb.io
Twitter: arian_jamasb
Repositories: 32
Profile: https://github.com/a-r-j

Principal ML Scientist @PrescientDesign / Tensor Jockey / PhD @ University of Cambridge Prev: MILA, Google X, Relation Therapeutic

GitHub Events

Total

Watch event: 11

Last Year

Watch event: 11

Committers

Last synced: about 1 year ago

All Time

Total Commits: 16
Total Committers: 2
Avg Commits per committer: 8.0
Development Distribution Score (DDS): 0.438

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Arian Jamasb	a**b@r**m	9
Arian Jamasb	a**b@g**m	7

Committer Domains (Top 20 + Academic)

roche.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 1
Total pull requests: 4
Average time to close issues: 4 days
Average time to close pull requests: 3 minutes
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 6.0
Average comments per pull request: 0.0
Merged pull requests: 4
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

flyfyyfyy (1)

Pull Request Authors

a-r-j (4)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 1,440 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 3
Total maintainers: 1

pypi.org: cpdb-protein

Homepage: https://github.com/a-r-j/cpdb
Documentation: https://cpdb-protein.readthedocs.io/
License: MIT
Latest release: 0.2.0
published over 2 years ago

Versions: 3
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 1,440 Last month

Rankings

Dependent packages count: 7.3%

Downloads: 15.9%

Stargazers count: 17.9%

Average: 22.6%

Forks count: 30.4%

Dependent repos count: 41.3%

Maintainers (1)

arianj

Last synced: 6 months ago
