Science Score: 13.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.4%) to scientific vocabulary
Repository
Synthetic Bayesian Classification
Basic Info
- Host: GitHub
- Owner: cthoyt
- License: gpl-3.0
- Default Branch: master
- Size: 71.7 MB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of lich-uct/syba
Created about 5 years ago · Last pushed about 5 years ago
https://github.com/cthoyt/syba/blob/master/
# SYBA

SYnthetic BAyesian classifier (SYBA) is a Python package for the classification of organic compounds as easy-to-synthesize (ES) or hard-to-synthesize (HS). SYBA is a fragment-based method: the molecule is decomposed into ECFP4-like fragments, a score is assigned to each fragment, and all fragment scores are summed to give the resulting SYBA score. If the SYBA score is positive, the molecule is considered to be ES; otherwise it is considered to be HS. Fragment scores are part of the SYBA algorithm and were obtained by analyzing the frequency of fragments in databases of ES and HS compounds. ES compounds were obtained by random selection from the ZINC15 [http://zinc.docking.org/] database; HS compounds were generated by the Nonpher [https://github.com/lich-uct/nonpher] approach. More details can be found in the SYBA [as soon as accepted] and Nonpher [http://dx.doi.org/10.1186/s13321-017-0206-2] papers.

## Installation

### Prerequisites

#### Supported platforms

* All platforms

#### Dependencies

* RDKit [https://github.com/rdkit/rdkit] (recommended version 2018_03_1 or later)

### Installation with Anaconda

SYBA is distributed as a Conda package. Conda is an open source package management system and environment management system that makes setting up a development environment for any project very easy. To install a Conda package, you need either the full Anaconda [https://www.anaconda.com/] distribution or its lightweight variant, Miniconda [https://docs.conda.io/en/latest/miniconda.html].
SYBA is installed from Anaconda/Miniconda by running the following command from the Linux terminal:

```bash
conda install -c rdkit -c lich syba
```

### Installation with setup.py

Once you have RDKit [https://github.com/rdkit/rdkit] installed, you can install SYBA from its directory with the following command:

```bash
python setup.py install
```

## Quick start

SYBA input is a CSV (comma-separated values) file consisting of the following columns: CMPND_ID,SMILES,OTHER_COLUMNS. OTHER_COLUMNS can contain any additional data; these columns are skipped. Output is a CSV file in the format ID,SMILES,SYBA_SCORE. The SYBA score reflects how confident the classifier is in its prediction (i.e., it cannot be considered a measure of the ease of synthesis). Negative values indicate a hard-to-synthesize compound and positive values an easy-to-synthesize one. SYBA classification is performed by the following command:

```bash
python -m syba.syba [INPUT_FILE [OUTPUT_FILE]]
```

## Use in Python script

### Basic usage

```python
from rdkit import Chem
from syba.syba import SybaClassifier

# Create the classifier and load the default fragment scores
syba = SybaClassifier()
syba.fitDefaultScore()

# Score a compound given as a SMILES string
smi = "O=C(C)Oc1ccccc1C(=O)O"
syba.predict(smi)

# SYBA also works with RDKit Mol objects
mol = Chem.MolFromSmiles(smi)
syba.predict(mol=mol)

# syba.predict takes two keyword parameters, "smi" and "mol".
# If both are provided, "smi" has priority and the score is
# calculated for the compound defined in "smi".
syba.predict(smi=smi, mol=mol)
```

## SYBA workflow

SYBA training (i.e., SYBA fragment score calculation) is demonstrated in the Jupyter notebook `docs/notebooks/prepare_fragment_counts.ipynb`. An example of classifying a new compound with SYBA, as well as with SAScore, SCScore and a random forest, is available in the `docs/notebooks/prepare_results.ipynb` notebook. Jupyter can be installed from Conda with the command `conda install jupyter`.
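
As a rough illustration of the scoring rule described in the introduction (sum the fragment scores; a positive sum means ES, otherwise HS), here is a minimal sketch. It does not use the SYBA package or RDKit. The fragment identifiers, the score values, the names `TOY_FRAGMENT_SCORES`, `toy_score` and `toy_label`, and the choice to score unknown fragments as 0 are all assumptions made for illustration. They are not real SYBA fragment scores and not SYBA's implementation.

```python
from typing import Iterable, Mapping

# Hypothetical fragment scores, NOT real SYBA values. The keys are made-up
# identifiers standing in for ECFP4-like fragments.
TOY_FRAGMENT_SCORES: Mapping[str, float] = {
    "frag_a": 1.5,
    "frag_b": 0.5,
    "frag_c": -3.0,
}


def toy_score(
    fragments: Iterable[str],
    scores: Mapping[str, float] = TOY_FRAGMENT_SCORES,
) -> float:
    """Sum the scores of all fragments of one molecule.

    Fragments missing from the table contribute 0.0 (an assumption
    of this sketch).
    """
    return sum(scores.get(fragment, 0.0) for fragment in fragments)


def toy_label(score: float) -> str:
    """Return "ES" for a positive score and "HS" otherwise."""
    return "ES" if score > 0 else "HS"


# Two hypothetical molecules, each given as a list of fragment identifiers
easy = toy_score(["frag_a", "frag_b"])
hard = toy_score(["frag_a", "frag_c"])
print(easy, toy_label(easy))
print(hard, toy_label(hard))
```

In the real package, the fragment scores come from the training step shown in the notebooks above, and `SybaClassifier.predict` returns the summed score for a SMILES string or RDKit molecule.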
Owner
- Name: Charles Tapley Hoyt
- Login: cthoyt
- Kind: user
- Location: Bonn, Germany
- Company: RWTH Aachen University
- Website: https://cthoyt.com
- Repositories: 489
- Profile: https://github.com/cthoyt