https://github.com/aspuru-guzik-group/stereogeneration

Testing generation of molecules with stereoisomers.

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.2%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: aspuru-guzik-group
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 150 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 4 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

Stereogeneration

Studying the effects of including stereoisomeric information in generative models for molecules when optimizing stereochemistry-sensitive properties. We perform optimization on (1) rediscovery of R-albuterol and mestranol, (2) protein-ligand docking, and (3) a stereochemistry-specific CD peak spectra score.

The preprint is available on ChemRxiv: Stereochemistry-aware string-based molecular generation. Data files are available on Zenodo.

Getting started

Initialize a Python environment (here we use conda) and install the required packages.

```bash
git clone git@github.com:aspuru-guzik-group/stereogeneration.git
cd stereogeneration

conda create -n stereogeneration python=3.8
conda activate stereogeneration
pip install -r requirements.txt
```

Use of XTB

XTB is installed through the requirements.txt file. Alternatively, you can build xtb from source (from the Grimme Lab) or install it using conda. Set the following environment variables:

```bash
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1,1
export OMP_STACKSIZE=4G
ulimit -s unlimited
```
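If xtb is launched from Python rather than from a shell, the same settings can be passed to the child process. The following is a minimal sketch, not part of the repository: `XTB_ENV` and `xtb_environment` are hypothetical names, and the values are copied from the export lines above. It does not cover `ulimit -s unlimited`, which is a shell builtin and still has to be set in the launching shell.

```python
import os

# Thread and stack settings copied from the README's export lines.
XTB_ENV = {
    "MKL_NUM_THREADS": "1",
    "OMP_NUM_THREADS": "1,1",
    "OMP_STACKSIZE": "4G",
}


def xtb_environment(base=None):
    """Return a copy of `base` (default: os.environ) with the xtb settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(XTB_ENV)
    return env
```

The returned dictionary can be handed to `subprocess.run(..., env=xtb_environment())` so that the settings apply only to that call and the parent process environment is left untouched.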

CD spectra setup

The CD spectra task requires stda and xtb4stda from the Grimme Lab. The binary files are found in the stereogeneration/stda directory. The files must be made executable and added to the $PATH variable:

```bash
cd stereogeneration/stda
chmod +x gspec stdav1.6.3 xtb4stda

# set file paths which will be used by stda
export PATH=$PATH:$PWD
export XTB4STDAHOME=$PWD
```
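Before starting a CD spectra run, it can help to confirm that the setup above took effect. This is a rough sketch under assumptions: the helper names are hypothetical, and the binary names are copied from the chmod line as it appears here, so adjust them if your files are named differently.

```python
import os
import shutil

# Binary names as written in the chmod line above (adjust if yours differ).
STDA_BINARIES = ("gspec", "stdav1.6.3", "xtb4stda")


def missing_stda_tools(names=STDA_BINARIES, path=None):
    """Return the names that cannot be found as executables on PATH."""
    return [name for name in names if shutil.which(name, path=path) is None]


def stda_setup_problems(env=None, path=None):
    """Return human-readable problems with the stda setup; empty if none found."""
    env = os.environ if env is None else env
    problems = [f"{name} not found on PATH" for name in missing_stda_tools(path=path)]
    home = env.get("XTB4STDAHOME")
    if not home:
        problems.append("XTB4STDAHOME is not set")
    elif not os.path.isdir(home):
        problems.append(f"XTB4STDAHOME does not point to a directory: {home}")
    return problems
```

Calling `stda_setup_problems()` in the same shell session where the exports were made should return an empty list if everything is in place.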

Docking setup

Docking requires an executable smina binary:

```bash
chmod +x stereogeneration/docking/smina.static
```
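The same permission change can be made from Python, for example inside a setup script. This is a small sketch with a hypothetical helper name. It adds the execute bit for user, group, and others, which approximates `chmod +x` (the shell command additionally respects the umask).

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group, and others to `path`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

For the docking setup it would be called as `make_executable("stereogeneration/docking/smina.static")` from the repository root.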

Running the models

Scripts (main.py) for running each model are found in the respective folders: reinvent, janus, group-janus. The scripts take command-line arguments that control the fitness function task and some of the model parameters.

```bash
# --target specifies the task: one of 1SYH, 1OYT, 6Y2F, cd, fp-albuterol, fp-mestranol
# --stereo turns on stereo-awareness
python main.py \
    --target=1SYH \
    --stereo
```
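To see the full grid of runs implied by the options above, the command lines can be generated programmatically. This is an illustrative sketch only: `build_commands` is a hypothetical helper, and it assumes that main.py is run from inside each model folder and that omitting `--stereo` gives the non-stereo run. It only builds argument lists and does not launch anything.

```python
import itertools

# Folder and target names taken from the README text.
MODELS = ("reinvent", "janus", "group-janus")
TARGETS = ("1SYH", "1OYT", "6Y2F", "cd", "fp-albuterol", "fp-mestranol")


def build_commands(models=MODELS, targets=TARGETS):
    """Yield (model_folder, argv) pairs, once with and once without --stereo."""
    for model, target, stereo in itertools.product(models, targets, (True, False)):
        argv = ["python", "main.py", f"--target={target}"]
        if stereo:
            argv.append("--stereo")
        yield model, argv


if __name__ == "__main__":
    for folder, argv in build_commands():
        print(f"(cd {folder} && {' '.join(argv)})")
```

Each pair could then be passed to `subprocess.run(argv, cwd=folder)` or written to a job script, depending on how the runs are scheduled.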

Analysis of results

The experiments were repeated 10 times for each model and each task. The result files are available on Zenodo. The individual runs for each task are saved in folders {i}_stereo and {i}_nonstereo for $i \in \{0,\dots,9\}$. The figures and statistics were generated using analysis_all.py, which also requires the zinc.csv file (available on Zenodo) to be located in the repo directory:

```bash
# --target: one of 1SYH, 1OYT, 6Y2F, cd, fp-albuterol, fp-mestranol
# --root_dir: where the dataset and `stereogeneration` import are found
# --label: name for target property label (defaults to 1SYH)
# --horizontal: toggles horizontal subplots, exclude for vertical subplots
python analysis_all.py \
    --target=1SYH \
    --root_dir='.' \
    --label='1SYH' \
    --horizontal
```
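Before running the analysis, it may be worth checking that all 20 run folders for a task are present. The sketch below follows the {i}_stereo / {i}_nonstereo naming described above. The helper names and the idea of a single per-task directory holding these folders are assumptions, not something the repository documents.

```python
from pathlib import Path

N_REPEATS = 10  # the README states each experiment was repeated 10 times


def run_folders(task_dir, n_repeats=N_REPEATS):
    """List the expected {i}_stereo and {i}_nonstereo folders under `task_dir`."""
    base = Path(task_dir)
    return [
        base / f"{i}_{mode}"
        for i in range(n_repeats)
        for mode in ("stereo", "nonstereo")
    ]


def missing_runs(task_dir, n_repeats=N_REPEATS):
    """Return the expected run folders that do not exist on disk."""
    return [p for p in run_folders(task_dir, n_repeats) if not p.is_dir()]
```

An empty list from `missing_runs(...)` suggests that the downloaded results for that task are complete.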

Owner

  • Name: Aspuru-Guzik group repo
  • Login: aspuru-guzik-group
  • Kind: organization

GitHub Events

Total
  • Watch event: 1
  • Push event: 4
  • Public event: 1
Last Year
  • Watch event: 1
  • Push event: 4
  • Public event: 1

Dependencies

requirements.txt pypi
  • Cython ==0.29.27
  • Jinja2 ==3.0.3
  • Markdown ==3.4.1
  • MarkupSafe ==2.1.1
  • Pebble ==5.0.6
  • Pillow ==9.0.1
  • Pint ==0.0.0
  • PyNaCl ==1.5.0
  • PyYAML ==6.0
  • Pygments ==2.11.2
  • Send2Trash ==1.8.0
  • Werkzeug ==2.2.1
  • absl_py ==1.2.0
  • aiohttp ==3.8.1
  • aiosignal ==1.2.0
  • arff ==0.9
  • argon2_cffi ==21.3.0
  • argon2_cffi_bindings ==21.2.0
  • ase ==3.22.1
  • async_generator ==1.10
  • async_timeout ==4.0.2
  • attrs ==21.4.0
  • backcall ==0.2.0
  • backports.shutil_get_terminal_size ==1.0.0
  • backports_abc ==0.5
  • bcrypt ==3.2.0
  • bitstring ==3.1.9
  • bleach ==4.1.0
  • cachetools ==5.2.0
  • certifi ==2021.10.8
  • cffi ==1.15.0
  • chardet ==4.0.0
  • charset_normalizer ==2.0.11
  • cryptography ==36.0.1
  • cycler ==0.11.0
  • deap ==1.0.1
  • debugpy ==1.5.1
  • decorator ==5.1.1
  • defusedxml ==0.7.1
  • dill ==0.3.6
  • dnspython ==2.2.0
  • ecdsa ==0.17.0
  • entrypoints ==0.4
  • fire ==0.4.0
  • fonttools ==4.29.1
  • frozenlist ==1.3.0
  • fsspec ==2022.7.1
  • funcsigs ==1.0.2
  • global-chem ==1.8
  • google_auth ==2.9.1
  • google_auth_oauthlib ==0.4.6
  • grpcio ==1.47.0
  • idna ==3.3
  • importlib_metadata ==4.10.1
  • importlib_resources ==5.4.0
  • ipykernel ==6.0.3
  • ipython ==7.31.1
  • ipython_genutils ==0.2.0
  • ipywidgets ==7.6.5
  • jedi ==0.18.1
  • joblib ==1.1.0
  • jsonschema ==4.4.0
  • jupyter-client ==6.1.12
  • jupyter_core ==4.9.1
  • jupyterlab_pygments ==0.1.2
  • jupyterlab_widgets ==1.0.2
  • kiwisolver ==1.3.2
  • lockfile ==0.12.2
  • mapchiral ==0.0.5
  • matplotlib ==3.5.1
  • matplotlib_inline ==0.1.3
  • mistune ==0.8.4
  • mock ==4.0.3
  • morfeus-ml ==0.7.1
  • mpmath ==1.2.1
  • multidict ==6.0.2
  • nbclient ==0.5.10
  • nbconvert ==6.4.2
  • nbformat ==5.1.3
  • nest_asyncio ==1.5.4
  • netaddr ==0.8.0
  • netifaces ==0.11.0
  • networkx ==3.0
  • nose ==1.3.7
  • notebook ==6.4.8
  • numpy ==1.22.2
  • oauthlib ==3.2.0
  • openbabel ==3.1.1.1
  • packaging ==21.3
  • pandas ==1.4.0
  • pandocfilters ==1.5.0
  • paramiko ==2.9.2
  • parso ==0.8.3
  • path ==16.3.0
  • path.py ==12.5.0
  • pathlib2 ==2.3.6
  • paycheck ==1.0.2
  • pbr ==5.8.1
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • prometheus_client ==0.13.1
  • prompt_toolkit ==3.0.26
  • protobuf ==3.19.4
  • ptyprocess ==0.7.0
  • pyDeprecate ==0.3.2
  • pyasn1 ==0.4.8
  • pyasn1-modules ==0.2.8
  • pycparser ==2.21
  • pydantic ==1.9.1
  • pyparsing ==3.0.7
  • pyrsistent ==0.18.1
  • python-dateutil ==2.8.2
  • pytorch-lightning ==1.7.0
  • pytz ==2021.3
  • pyzmq ==22.3.0
  • qcelemental ==0.24.0
  • rdkit ==2022.3.5
  • requests ==2.27.1
  • requests_oauthlib ==1.3.1
  • rsa ==4.9
  • scikit_learn ==1.1.1
  • scipy ==1.8.0
  • seaborn ==0.11.2
  • selfies ==2.1.1
  • simplegeneric ==0.8.1
  • singledispatch ==3.7.0
  • six ==1.16.0
  • sympy ==1.9
  • tensorboard ==2.9.1
  • tensorboard-data-server ==0.6.1
  • tensorboard_plugin_wit ==1.8.1
  • termcolor ==1.1.0
  • terminado ==0.13.1
  • testpath ==0.5.0
  • threadpoolctl ==3.1.0
  • torch ==1.10.0
  • torchmetrics ==0.9.3
  • tornado ==6.1
  • tqdm ==4.64.0
  • traitlets ==5.0.5
  • typing_extensions ==4.0.1
  • urllib3 ==1.26.8
  • wcwidth ==0.2.5
  • webencodings ==0.5.1
  • widgetsnbextension ==3.5.2
  • xtb-python ==20.1
  • yarl ==1.7.2
  • zipp ==3.7.0