gt4sd

GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.

https://github.com/gt4sd/gt4sd-core

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: nature.com, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (19.7%) to scientific vocabulary

Keywords

deep-learning generative-models machine-learning python
Last synced: 6 months ago

Repository

GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.

Basic Info
Statistics
  • Stars: 362
  • Watchers: 15
  • Forks: 80
  • Open Issues: 0
  • Releases: 60
Topics
deep-learning generative-models machine-learning python
Created about 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

GT4SD (Generative Toolkit for Scientific Discovery)

PyPI version · Actions tests · License: MIT · Code style: black · Contributions · Docs · Total downloads · Monthly downloads · Binder · DOI · 2022 IEEE Open Software Services Award · Paper DOI: 10.1038/s41524-023-01028-1


The GT4SD (Generative Toolkit for Scientific Discovery) is an open-source platform to accelerate hypothesis generation in the scientific discovery process. It provides a library for making state-of-the-art generative AI models easier to use.

For full details on the library API and examples see the docs. Almost all pretrained models are also available via gradio-powered web apps on Hugging Face Spaces.

Installation

Requirements

Currently gt4sd relies on:

  • python>=3.7,<=3.10
  • pip==24.0

If you need other versions, help us by contributing to the project.

Conda

The recommended way to install gt4sd is to create a dedicated conda environment; this ensures all requirements are satisfied. For CPU:

```sh
git clone https://github.com/GT4SD/gt4sd-core.git
cd gt4sd-core/
conda env create -f conda_cpu_mac.yml  # for linux use conda_cpu_linux.yml
conda activate gt4sd
pip install gt4sd
```

NOTE 1: By default gt4sd is installed with CPU requirements. For GPU usage replace the environment creation step with:

```sh
conda env create -f conda_gpu.yml
```

NOTE 2: If you want to reuse an existing compatible environment (see requirements), you can use pip. However, as of now (:eyes: on the issue for changes) some dependencies require installation from GitHub, so for a complete setup install them with:

```sh
pip install -r vcs_requirements.txt
```

A few VCS dependencies require Git LFS (make sure it's available on your system).

Development setup & installation

If you would like to contribute to the package, we recommend installing gt4sd in editable mode inside your conda environment:

```sh
pip install --no-deps -e .
```

Learn more in CONTRIBUTING.md

Getting started

After installation you can use gt4sd right away in your discovery workflows.


Running inference pipelines in your python code

Running an algorithm is as easy as typing:

```python
from gt4sd.algorithms.conditional_generation.paccmann_rl.core import (
    PaccMannRLProteinBasedGenerator,
    PaccMannRL,
)

target = 'MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTT'

# algorithm configuration with default parameters
configuration = PaccMannRLProteinBasedGenerator()

# instantiate the algorithm for sampling
algorithm = PaccMannRL(configuration=configuration, target=target)
items = list(algorithm.sample(10))
print(items)
```

Or you can use the ApplicationsRegistry to run an algorithm instance using a serialized representation of the algorithm:

```python
from gt4sd.algorithms.registry import ApplicationsRegistry

target = 'MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTT'
algorithm = ApplicationsRegistry.get_application_instance(
    target=target,
    algorithm_type='conditional_generation',
    domain='materials',
    algorithm_name='PaccMannRL',
    algorithm_application='PaccMannRLProteinBasedGenerator',
    generated_length=32,
    # include additional configuration parameters as **kwargs
)
items = list(algorithm.sample(10))
print(items)
```

Running inference pipelines via the CLI command

GT4SD can run inference pipelines based on the gt4sd-inference CLI command, which lets you run all inference algorithms directly from the command line. To see which algorithms are available and how to use the CLI for your favorite model, check out examples/cli/README.md.

You can run inference pipelines by simply typing:

```console
gt4sd-inference --algorithm_name PaccMannRL --algorithm_application PaccMannRLProteinBasedGenerator --target MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTT --number_of_samples 10
```

The command supports multiple parameters to select an algorithm and configure it for inference:

```console
$ gt4sd-inference --help
usage: gt4sd-inference [-h] [--algorithm_type ALGORITHM_TYPE] [--domain DOMAIN]
                       [--algorithm_name ALGORITHM_NAME]
                       [--algorithm_application ALGORITHM_APPLICATION]
                       [--algorithm_version ALGORITHM_VERSION] [--target TARGET]
                       [--number_of_samples NUMBER_OF_SAMPLES]
                       [--configuration_file CONFIGURATION_FILE]
                       [--print_info [PRINT_INFO]]

optional arguments:
  -h, --help            show this help message and exit
  --algorithm_type ALGORITHM_TYPE
                        Inference algorithm type, supported types:
                        conditional_generation, controlled_sampling, generation,
                        prediction. (default: None)
  --domain DOMAIN       Domain of the inference algorithm, supported types:
                        materials, nlp. (default: None)
  --algorithm_name ALGORITHM_NAME
                        Inference algorithm name. (default: None)
  --algorithm_application ALGORITHM_APPLICATION
                        Inference algorithm application. (default: None)
  --algorithm_version ALGORITHM_VERSION
                        Inference algorithm version. (default: None)
  --target TARGET       Optional target for generation represented as a string.
                        Defaults to None, it can also be provided in the
                        configuration_file as an object, but the command line
                        takes precedence. (default: None)
  --number_of_samples NUMBER_OF_SAMPLES
                        Number of generated samples, defaults to 5. (default: 5)
  --configuration_file CONFIGURATION_FILE
                        Configuration file for the inference pipeline in JSON
                        format. (default: None)
  --print_info [PRINT_INFO]
                        Print info for the selected algorithm, preventing
                        inference run. Defaults to False. (default: False)
```
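Instead of passing every flag on the command line, the parameters can also be collected in a JSON file for `--configuration_file`. A minimal sketch using only the standard library; the assumption that the JSON keys mirror the flag names is mine, so check examples/cli/README.md for the exact schema:

```python
import json

# Hypothetical configuration mirroring the CLI flags above; the exact accepted
# keys are an assumption -- consult examples/cli/README.md for the real schema.
config = {
    "algorithm_type": "conditional_generation",
    "domain": "materials",
    "algorithm_name": "PaccMannRL",
    "algorithm_application": "PaccMannRLProteinBasedGenerator",
    "target": "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTT",
    "number_of_samples": 10,
}

# write the configuration file consumed via --configuration_file
with open("inference_config.json", "w") as fp:
    json.dump(config, fp, indent=2)
```

It could then be used as `gt4sd-inference --configuration_file inference_config.json`.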

You can use gt4sd-inference to directly get information on the configuration parameters for the selected algorithm:

```console
gt4sd-inference --algorithm_name PaccMannRL --algorithm_application PaccMannRLProteinBasedGenerator --print_info
INFO:gt4sd.cli.inference:Selected algorithm: {'algorithm_type': 'conditional_generation', 'domain': 'materials', 'algorithm_name': 'PaccMannRL', 'algorithm_application': 'PaccMannRLProteinBasedGenerator', 'algorithm_version': 'v0'}
INFO:gt4sd.cli.inference:Selected algorithm supports the following configuration parameters:
{
  "batch_size": {
    "description": "Batch size used for the generative model sampling.",
    "title": "Batch Size",
    "default": 32,
    "type": "integer",
    "optional": true
  },
  "temperature": {
    "description": "Temperature parameter for the softmax sampling in decoding.",
    "title": "Temperature",
    "default": 1.4,
    "type": "number",
    "optional": true
  },
  "generated_length": {
    "description": "Maximum length in tokens of the generated molecules (relates to the SMILES length).",
    "title": "Generated Length",
    "default": 100,
    "type": "integer",
    "optional": true
  }
}
Target information:
{
  "target": {
    "title": "Target protein sequence",
    "description": "AA sequence of the protein target to generate non-toxic ligands against.",
    "type": "string"
  }
}
```
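The parameter block printed by `--print_info` is plain JSON, so it can also be parsed programmatically, for example to collect the default values (the schema below is condensed from the output above):

```python
import json

# Parameter schema as printed by `gt4sd-inference --print_info` above
# (descriptions trimmed for brevity).
schema = """
{
  "batch_size": {"title": "Batch Size", "default": 32, "type": "integer", "optional": true},
  "temperature": {"title": "Temperature", "default": 1.4, "type": "number", "optional": true},
  "generated_length": {"title": "Generated Length", "default": 100, "type": "integer", "optional": true}
}
"""

# collect the default value of every configuration parameter
defaults = {name: spec["default"] for name, spec in json.loads(schema).items()}
print(defaults)  # {'batch_size': 32, 'temperature': 1.4, 'generated_length': 100}
```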

Running training pipelines via the CLI command

GT4SD provides a trainer client based on the gt4sd-trainer CLI command.

The trainer currently supports the following training pipelines:

  • language-modeling-trainer: language modelling via HuggingFace transformers and PyTorch Lightning.
  • paccmann-vae-trainer: PaccMann VAE models.
  • granular-trainer: multimodal compositional autoencoders supporting MLP, RNN and Transformer layers.
  • guacamol-lstm-trainer: GuacaMol LSTM models.
  • moses-organ-trainer: Moses Organ implementation.
  • moses-vae-trainer: Moses VAE models.
  • torchdrug-gcpn-trainer: TorchDrug Graph Convolutional Policy Network model.
  • torchdrug-graphaf-trainer: TorchDrug autoregressive GraphAF model.
  • diffusion-trainer: Diffusers model.
  • gflownet-trainer: GFlowNet model.

```console
$ gt4sd-trainer --help
usage: gt4sd-trainer [-h] --training_pipeline_name TRAINING_PIPELINE_NAME
                     [--configuration_file CONFIGURATION_FILE]

optional arguments:
  -h, --help            show this help message and exit
  --training_pipeline_name TRAINING_PIPELINE_NAME
                        Training type of the converted model, supported types:
                        granular-trainer, language-modeling-trainer, paccmann-
                        vae-trainer. (default: None)
  --configuration_file CONFIGURATION_FILE
                        Configuration file for the training. It can be used to
                        completely bypass pipeline specific arguments. (default:
                        None)
```

To launch a training you have two options.

You can either specify the training pipeline and the path of a configuration file that contains the needed training parameters:

```sh
gt4sd-trainer --training_pipeline_name ${TRAINING_PIPELINE_NAME} --configuration_file ${CONFIGURATION_FILE}
```

Or you can provide the needed parameters directly as arguments:

```sh
gt4sd-trainer --training_pipeline_name language-modeling-trainer --type mlm --model_name_or_path mlm --training_file /path/to/train_file.jsonl --validation_file /path/to/valid_file.jsonl
```
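The training and validation files above are JSONL. A minimal sketch of producing one with the standard library; the "text" field name and the SMILES content are assumptions for illustration, so verify the expected record layout in the language-modeling-trainer documentation:

```python
import json

# Hypothetical JSONL writer: one JSON object per line. The "text" field name
# is an assumption -- check the language-modeling-trainer docs for the schema.
samples = [
    "CCO",
    "c1ccccc1",
    "CC(=O)O",
]
with open("train_file.jsonl", "w") as fp:
    for text in samples:
        fp.write(json.dumps({"text": text}) + "\n")
```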

To get more info on a specific training pipeline's arguments, simply type:

```sh
gt4sd-trainer --training_pipeline_name ${TRAINING_PIPELINE_NAME} --help
```

Saving a trained algorithm for inference via the CLI command

Once a training pipeline has been run via gt4sd-trainer, it's possible to save the trained algorithm via gt4sd-saving for use in compatible inference pipelines.

Here is a small example for the PaccMannGP algorithm (paper).

You can train a model with gt4sd-trainer (a quick training run on very little data, not recommended for a realistic model :warning:):

```sh
gt4sd-trainer --training_pipeline_name paccmann-vae-trainer --epochs 250 --batch_size 4 \
  --n_layers 1 --rnn_cell_size 16 --latent_dim 16 \
  --train_smiles_filepath src/gt4sd/training_pipelines/tests/molecules.smi \
  --test_smiles_filepath src/gt4sd/training_pipelines/tests/molecules.smi \
  --model_path /tmp/gt4sd-paccmann-gp/ --training_name fast-example \
  --eval_interval 15 --save_interval 15 --selfies
```

Save the model with the compatible inference pipeline using gt4sd-saving:

```sh
gt4sd-saving --training_pipeline_name paccmann-vae-trainer --model_path /tmp/gt4sd-paccmann-gp/ --training_name fast-example --target_version fast-example-v0 --algorithm_application PaccMannGPGenerator
```

Run the algorithm via gt4sd-inference (again, the model produced in this example is trained on dummy data and will give dummy outputs, so do not use it as is :no_good:):

```sh
gt4sd-inference --algorithm_name PaccMannGP --algorithm_application PaccMannGPGenerator --algorithm_version fast-example-v0 --number_of_samples 5 --target '{"molwt": {"target": 60.0}}'
```
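The `--target` argument here is a JSON string mapping property names to target values. Rather than hand-writing it, it can be composed with `json.dumps` (the `molwt` key is taken from the example above; which property keys are accepted depends on the algorithm):

```python
import json

# Build the --target JSON string from the example above. The "molwt" key comes
# from the PaccMannGP example; supported keys are algorithm-specific.
target = json.dumps({"molwt": {"target": 60.0}})
print(target)  # {"molwt": {"target": 60.0}}
```

The resulting string is what gets passed, quoted, on the command line.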

Uploading a trained algorithm on a public hub via the CLI command

You can easily upload trained and fine-tuned models to the public hub using gt4sd-upload. The syntax follows the saving pipeline:

```sh
gt4sd-upload --training_pipeline_name paccmann-vae-trainer --model_path /tmp/gt4sd-paccmann-gp --training_name fast-example --target_version fast-example-v0 --algorithm_application PaccMannGPGenerator
```

NOTE: GT4SD can be configured to upload models to a custom or self-hosted COS. An example of locally self-hosting a COS (MinIO) to upload your models to can be found here.

Computing properties

You can compute properties of your generated samples using the gt4sd.properties submodule:

```python
from gt4sd.properties import PropertyPredictorRegistry

similarity_predictor = PropertyPredictorRegistry.get_property_predictor(
    "similarity_seed", {"smiles": "C1=CC(=CC(=C1)Br)CN"}
)
similarity_predictor("CCO")
# 0.0333

# let's inspect what other parameters we can set for similarity measuring
similarity_predictor = PropertyPredictorRegistry.get_property_predictor(
    "similarity_seed", {"smiles": "C1=CC(=CC(=C1)Br)CN", "fp_key": "ECFP6"}
)
similarity_predictor("CCO")

# inspect parameters
PropertyPredictorRegistry.get_property_predictor_parameters_schema("similarity_seed")
# '{"title": "SimilaritySeedParameters", "description": "Abstract class for property computation.", "type": "object", "properties": {"smiles": {"title": "Smiles", "example": "c1ccccc1", "type": "string"}, "fp_key": {"title": "Fp Key", "default": "ECFP4", "type": "string"}}, "required": ["smiles"]}'

# predict other properties
qed = PropertyPredictorRegistry.get_property_predictor("qed")
qed("CCO")
# 0.4068

# list properties
PropertyPredictorRegistry.list_available()
# ['activity_against_target', 'aliphaticity', ..., 'scscore', 'similarity_seed', 'tpsa', 'weight']
```

Additional examples

Find more examples in notebooks

You can play with them right away using the provided Dockerfile: simply build the image and run it to explore the examples using Jupyter:

```sh
docker build -f Dockerfile -t gt4sd-demo .
docker run -p 8888:8888 gt4sd-demo
```

Supported packages

Beyond implementing various generative modeling inference and training pipelines, GT4SD is designed to provide a high-level API that implements a harmonized interface for several existing packages:

  • GuacaMol: inference pipelines for the baseline models and training pipelines for LSTM models.
  • Moses: inference pipelines for the baseline models and training pipelines for VAEs and Organ.
  • TorchDrug: inference and training pipelines for GCPN and GraphAF models. Training pipelines support custom datasets as well as datasets native in TorchDrug.
  • MoLeR: inference pipelines for MoLeR (MOlecule-LEvel Representation) generative models for de-novo and scaffold-based generation.
  • TAPE: encoder modules compatible with the protein language models.
  • PaccMann: inference pipelines for all algorithms of the PaccMann family as well as training pipelines for the generative VAEs.
  • transformers: training and inference pipelines for generative models from HuggingFace Models.
  • diffusers: training and inference pipelines for generative models from Diffusers Models.
  • GFlowNets: training and inference pipelines for Generative Flow Networks.
  • MolGX: training and inference pipelines to generate small molecules satisfying target properties. The full implementation of MolGX, including additional functionalities, is available here.
  • Regression Transformers: training and inference pipelines to generate small molecules, polymers or peptides based on numerical property constraints. For details read the paper.

References

If you use gt4sd in your projects, please consider citing the following:

```bib
@software{GT4SD,
  author = {GT4SD Team},
  month = {2},
  title = {{GT4SD (Generative Toolkit for Scientific Discovery)}},
  url = {https://github.com/GT4SD/gt4sd-core},
  version = {main},
  year = {2022}
}

@article{manica2022gt4sd,
  title={Accelerating material design with the generative toolkit for scientific discovery},
  author={Manica, Matteo and Born, Jannis and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Clarke, Dean and Teukam, Yves Gaetan Nana and Giannone, Giorgio and Hoffman, Samuel C and Buchan, Matthew and others},
  journal={npj Computational Materials},
  volume={9},
  number={1},
  pages={69},
  year={2023},
  publisher={Nature Publishing Group UK London}
}
```

License

The gt4sd codebase is under MIT license. For individual model usage, please refer to the model licenses found in the original packages.

Owner

  • Name: Generative Toolkit 4 Scientific Discovery
  • Login: GT4SD
  • Kind: organization
  • Email: gt4sd-core@zurich.ibm.com

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use GT4SD, please consider citing as below."
authors:
  - family-names: Team
    given-names: GT4SD
title: "GT4SD (Generative Toolkit for Scientific Discovery)"
version: 1.3.1
url: "https://github.com/GT4SD/gt4sd-core"
doi: "10.5281/zenodo.6673577"
date-released: 2022-02-11
license:
  - name: MIT License
    url: https://opensource.org/licenses/MIT
references:
  - title: Accelerating material design with the generative toolkit for scientific discovery
    authors:
      - family-names: Manica
        given-names: Matteo
      - family-names: Born
        given-names: Jannis
      - family-names: Cadow
        given-names: Joris
      - family-names: Christofidellis
        given-names: Dimitrios
      - family-names: Dave
        given-names: Ashish
      - family-names: Clarke
        given-names: Dean
      - family-names: Teukam
        given-names: Yves Gaetan Nana
      - family-names: Giannone
        given-names: Giorgio
      - family-names: Hoffman
        given-names: Samuel C
      - family-names: Buchan
        given-names: Matthew
      - family-names: Chenthamarakshan
        given-names: Vijil
      - family-names: Donovan
        given-names: Timothy
      - family-names: Hsu
        given-names: Hsiang Han
      - family-names: Zipoli
        given-names: Federico
      - family-names: Schilter
        given-names: Oliver
      - family-names: Kishimoto
        given-names: Akihiro
      - family-names: Hamada
        given-names: Lisa
      - family-names: Padhi
        given-names: Inkit
      - family-names: Wehden
        given-names: Karl
      - family-names: McHugh
        given-names: Lauren
      - family-names: Khrabrov
        given-names: Alexy
      - family-names: Das
        given-names: Payel
      - family-names: Takeda
        given-names: Seiji
      - family-names: Smith
        given-names: John R.
    journal: npj Computational Materials
    volume: 9
    number: 1
    pages: 69
    year: 2023
    publisher: Nature Publishing Group

GitHub Events

Total
  • Create event: 3
  • Release event: 3
  • Issues event: 10
  • Watch event: 23
  • Issue comment event: 20
  • Push event: 14
  • Pull request review event: 4
  • Pull request review comment event: 3
  • Pull request event: 3
  • Fork event: 7
Last Year
  • Create event: 3
  • Release event: 3
  • Issues event: 10
  • Watch event: 23
  • Issue comment event: 20
  • Push event: 14
  • Pull request review event: 4
  • Pull request review comment event: 3
  • Pull request event: 3
  • Fork event: 7

Committers

Last synced: 11 months ago

All Time
  • Total Commits: 297
  • Total Committers: 20
  • Avg Commits per committer: 14.85
  • Development Distribution Score (DDS): 0.451
Past Year
  • Commits: 15
  • Committers: 4
  • Avg Commits per committer: 3.75
  • Development Distribution Score (DDS): 0.6
Top Committers
Name Email Commits
Matteo Manica d****g@g****m 163
Jannis Born j****b@z****m 63
Dimitrios Christofidellis 7****d 19
Giorgio Giannone g****e@i****m 12
GitHub Action g****d@z****m 10
Joris j****w@g****m 6
GitHub Action g****s@g****m 5
Matteo Manica t****e@z****m 3
Karl Wehden K****l@W****m 3
Yves Gaetan Nana Teukam 5****a 3
Dean Elzinga 1****h 1
Ashish Dave A****8@g****m 1
Alain Vaucher a****r 1
Akihiro Kishimoto a****o@g****m 1
Helena Montenegro 3****o 1
Nicolai Ree 5****e 1
YoelShoshan y****n@g****m 1
edux300 e****0@g****m 1
federicozipoli 5****i 1
mirunacrt 8****t 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 51
  • Total pull requests: 68
  • Average time to close issues: 4 months
  • Average time to close pull requests: 9 days
  • Total issue authors: 24
  • Total pull request authors: 9
  • Average comments per issue: 2.43
  • Average comments per pull request: 0.94
  • Merged pull requests: 52
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 5
  • Average time to close issues: 7 days
  • Average time to close pull requests: about 12 hours
  • Issue authors: 5
  • Pull request authors: 2
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jannisborn (12)
  • drugilsberg (8)
  • fiskrt (4)
  • davidegraff (3)
  • itaijj (3)
  • AsyaOrlova (2)
  • palazzef (2)
  • elzinga-ibm-research (2)
  • Rwpnd (1)
  • helderlopes97 (1)
  • bransom960 (1)
  • SabariKumar (1)
  • zw-SIMM (1)
  • Jenonone (1)
  • mirunacrt (1)
Pull Request Authors
  • jannisborn (35)
  • yvesnana (18)
  • drugilsberg (12)
  • christofid (10)
  • NicolaiRee (3)
  • fiskrt (2)
  • mirunacrt (1)
  • elzinga-ibm-research (1)
  • helderlopes97 (1)
Top Labels
Issue Labels
enhancement (17) bug (13) cla-signing (9) cla-signed (6) help wanted (5) good first issue (2) documentation (2) refactoring (1) question (1) torchdrug-stalled (1) wontfix (1)
Pull Request Labels
cla-signed (68) enhancement (9) invalid (4) bug (3) ci (1)

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 1,175 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 7
    (may contain duplicates)
  • Total versions: 259
  • Total maintainers: 2
proxy.golang.org: github.com/gt4sd/gt4sd-core
  • Versions: 86
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 6.9%
Last synced: 6 months ago
proxy.golang.org: github.com/GT4SD/gt4sd-core
  • Versions: 86
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 6.9%
Last synced: 6 months ago
pypi.org: gt4sd

Generative Toolkit for Scientific Discovery (GT4SD).

  • Versions: 87
  • Dependent Packages: 0
  • Dependent Repositories: 7
  • Downloads: 1,175 Last month
Rankings
Dependent repos count: 5.5%
Average: 7.9%
Downloads: 8.1%
Dependent packages count: 10.1%
Maintainers (2)
Last synced: 6 months ago

Dependencies

.github/workflows/cla-signature.yaml actions
  • actions/checkout v2 composite
  • ad-m/github-push-action master composite
  • andymckay/labeler e6c4322d0397f3240f0e7e30a33b5c5df2d39e90 composite
  • peter-evans/close-issue v1 composite
.github/workflows/docs.yaml actions
  • actions/checkout v2 composite
  • ad-m/github-push-action master composite
  • conda-incubator/setup-miniconda v2 composite
.github/workflows/pypi.yaml actions
  • actions/checkout master composite
  • actions/setup-python v1 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/tests.yaml actions
  • actions/checkout v2 composite
  • conda-incubator/setup-miniconda v2 composite
Dockerfile docker
  • drugilsberg/gt4sd-base v1.0.0-cpu build
cpu_requirements.txt pypi
  • torch >=1.0,<=1.12.1
  • torch-cluster <=1.6.0
  • torch-geometric <=2.0.4
  • torch-sparse <=0.6.15
dev_requirements.txt pypi
  • better-apidoc ==0.3.1 development
  • black ==22.3.0 development
  • docutils ==0.17.1 development
  • flake8 ==3.8.4 development
  • flask ==1.1.2 development
  • flask_login ==0.5.0 development
  • isort ==5.7.0 development
  • jinja2 <3.1.0 development
  • licenseheaders ==0.8.8 development
  • mypy ==0.950 development
  • myst-parser ==0.13.3 development
  • pytest ==6.1.1 development
  • pytest-cov ==2.10.1 development
  • sphinx ==3.4.3 development
  • sphinx-autodoc-typehints ==1.11.1 development
  • sphinx_rtd_theme ==0.5.1 development
extras_requirements.txt pypi
  • cogmol_inference ==0.6.0
gpu_requirements.txt pypi
  • torch >=1.0,<=1.12.1
  • torch-cluster <=1.6.0
  • torch-geometric <=2.0.4
  • torch-sparse <=0.6.15
notebooks/requirements.txt pypi
  • gt4sd >=0.49.0
  • ipywidgets <8
  • jupyter ==1.0.0
  • mols2grid ==0.2.0
  • pandas >=1.0.0
  • polling2 ==0.5.0
  • rxn4chemistry ==1.6.1
  • seaborn ==0.11.2
  • tqdm ==4.62.3
requirements.txt pypi
  • PyTDC ==0.3.7
  • accelerate >=0.12
  • datasets >=1.11.0
  • diffusers <=0.6.0
  • importlib-resources >=5.10.0
  • ipaddress >=1.0.23
  • joblib >=1.1.0
  • keras >=2.3.1,<2.11.0
  • keybert >=0.7.0
  • minio ==7.0.1
  • modlamp >=4.0.0
  • molecule-generation >=0.3.0
  • molgx >=0.22.0a1
  • nglview >=3.0.3
  • numpy >=1.16.5,<1.24.0
  • protobuf <3.20
  • pyarrow <=6.0.1
  • pydantic >=1.7.3
  • pytorch_lightning <=1.7.7
  • pyyaml >=5.4.1
  • rdkit-pypi >=2020.9.5.2,<=2021.9.4
  • regex >=2.5.91
  • reinvent-chemistry ==0.0.38
  • sacremoses >=0.0.41
  • scikit-learn >=1.0.0
  • scikit-optimize >=0.8.1
  • scipy >=1.0.0
  • sentencepiece >=0.1.95
  • sympy >=1.10.1
  • tables >=3.7.0
  • tape-proteins >=0.4
  • tensorboard *
  • tensorflow >=2.1.0
  • torchdrug >=0.2.0
  • torchmetrics >=0.7.0
  • torchvision >=0.12.0
  • transformers >=4.22.0
  • typing_extensions >=3.7.4.3
  • wheel >=0.26
pyproject.toml pypi
setup.py pypi
vcs_requirements.txt pypi