chemical-dataset-comparator

ChemIcal DatasEt comparatoR (CIDER) is a Python package and ready-to-use Jupyter Notebook workflow which primarily utilizes RDKit to compare two or more chemical structure datasets (SD files) in different aspects, e.g. size, overlap, molecular descriptor distributions, chemical space clustering, etc., most of which can be visually inspected.

https://github.com/steinbeck-lab/chemical-dataset-comparator

Science Score: 75.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
    Organization steinbeck-lab has institutional domain (cheminf.uni-jena.de)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: Steinbeck-Lab
  • License: other
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 19.2 MB
Statistics
  • Stars: 9
  • Watchers: 4
  • Forks: 4
  • Open Issues: 3
  • Releases: 1
Created about 4 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Overview of ChemIcal DatasEt comparatoR (CIDER) :globe_with_meridians:

  • ChemIcal DatasEt comparatoR (CIDER) is a Python package and ready-to-use Jupyter Notebook workflow which primarily utilizes RDKit to compare two or more chemical structure datasets (SD files) in different aspects, e.g. size, overlap, molecular descriptor distributions, chemical space clustering, etc., most of which can be visually inspected in the notebook.
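The size and overlap comparison mentioned above can be illustrated without any chemistry toolkit. The following is a minimal sketch, not CIDER's implementation: it assumes each dataset has already been reduced to a collection of identifier strings (for example InChIKeys), and the function name `compare_identifier_sets` is hypothetical.

```python
def compare_identifier_sets(datasets):
    """Compare datasets given as {name: iterable of identifier strings}.

    Returns a tuple (sizes, shared):
      sizes  - number of unique identifiers per dataset
      shared - identifiers present in every dataset
    Hypothetical helper for illustration; not part of the CIDER API.
    """
    unique = {name: set(ids) for name, ids in datasets.items()}
    sizes = {name: len(ids) for name, ids in unique.items()}
    shared = set.intersection(*unique.values()) if unique else set()
    return sizes, shared


if __name__ == "__main__":
    # Placeholder identifiers; real input would be e.g. InChIKey strings.
    example = {"set_a": ["X", "Y", "Y"], "set_b": ["Y", "Z"]}
    sizes, shared = compare_identifier_sets(example)
    print(sizes, shared)
```

Reducing each structure to a canonical identifier first means the overlap becomes a plain set intersection, which is why duplicates within one dataset do not inflate its size here.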

Usage

  • To use CIDER, clone the repository to your local disk and make sure you install all the necessary requirements.

We recommend using CIDER inside a Conda environment to simplify the installation of the dependencies.

  • Conda is available as part of the Anaconda or Miniconda distributions (Python 3.10). We recommend installing Miniconda3. On Linux, you can get it with:

```shell
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
```

Installation

```shell
$ git clone https://github.com/Steinbeck-Lab/ChemIcal_DatasEt_compaRator.git
$ cd ChemIcal_DatasEt_compaRator
$ conda create --name cider_chem python=3.10
$ conda activate cider_chem
$ conda install pip
$ python -m pip install -U pip  # Upgrade pip
$ pip install .
```

  • Note: Make sure all installations are working correctly by running the tests. You can do this by running the pytest command in the repository root folder.
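Before running the test suite, it can help to confirm that the main dependencies are importable in the active environment. The sketch below uses only the standard library. The import names are assumptions derived from the packages listed in setup.py (for example, `rdkit-pypi` is imported as `rdkit` and `FPDF` as `fpdf`), and `missing_modules` is a hypothetical helper, not part of CIDER.

```python
from importlib.util import find_spec

# Top-level import names assumed from the dependency list in setup.py;
# adjust them if your installation differs.
CIDER_IMPORTS = ["rdkit", "matplotlib", "matplotlib_venn", "seaborn", "chemplot", "fpdf"]


def missing_modules(names):
    """Return the top-level module names the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(CIDER_IMPORTS)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be found, not that their versions match the pinned ones, so it complements rather than replaces the pytest run.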

Alternative

```shell
$ python -m pip install -U pip  # Upgrade pip
$ pip install git+https://github.com/Steinbeck-Lab/ChemIcal_DatasEt_compaRator.git
```

Install from PyPI

```shell
$ pip install cider-chem
```

Basic usage:

```python
from CIDER import ChemicalDatasetComparator

cider = ChemicalDatasetComparator()

datadir = './data/'  # directory with SD files containing molecules
testdict = cider.import_as_data_dict(datadir)
cider.get_number_of_molecules(testdict)
```
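As a quick sanity check on the input directory, the number of records in each SD file can be estimated without RDKit by counting the `$$$$` lines that terminate each record in the SD format. This is a minimal sketch under stated assumptions, not how CIDER counts molecules: the helper names are hypothetical, files are assumed to use the `.sdf` extension, and the result may differ from an RDKit-based import if some records fail to parse.

```python
from pathlib import Path


def count_sdf_records(path):
    """Count records in one SD file by counting '$$$$' delimiter lines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.rstrip() == "$$$$")


def count_records_per_file(directory):
    """Map each *.sdf file name in `directory` to its record count.

    Hypothetical helper for illustration; not part of the CIDER API.
    """
    files = sorted(Path(directory).glob("*.sdf"))
    return {sdf.name: count_sdf_records(sdf) for sdf in files}


if __name__ == "__main__":
    # './data/' mirrors the directory used in the basic usage example.
    for name, count in count_records_per_file("./data/").items():
        print(f"{name}: {count} records")
```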

Documentation

  • The documentation for the CIDER package can be found here.

Cite us

  • Busch, H., Schaub, J., Brinkhaus, H. O., Rajan, K., & Steinbeck, C. (2022). ChemIcal DatasEt comparatoR CIDER (Version 0.0.1-dev) [Computer software]. https://doi.org/10.5281/zenodo.6630494

Maintained by :wrench:

ChemIcal DatasEt comparatoR is developed and maintained by the Steinbeck group at the Friedrich Schiller University Jena, Germany. The code for this package is released under the MIT license. Copyright © CC-BY-SA 2024

Owner

  • Name: Steinbeck-Lab
  • Login: Steinbeck-Lab
  • Kind: organization

Official GitHub organization for the Steinbeck group Jena, Germany

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Busch"
    given-names: "Hannah"
    orcid: "https://orcid.org/0000-0002-9674-6450"
  - family-names: "Schaub"
    given-names: "Jonas"
    orcid: "https://orcid.org/0000-0003-1554-6666"
  - family-names: "Brinkhaus"
    given-names: "Henning Otto"
    orcid: "https://orcid.org/0000-0002-6664-2183"
  - family-names: "Rajan"
    given-names: "Kohulan"
    orcid: "https://orcid.org/0000-0003-1066-7792"
  - family-names: "Steinbeck"
    given-names: "Christoph"
    orcid: "https://orcid.org/0000-0001-6966-0814"
title: "ChemIcal DatasEt comparatoR CIDER"
version: 1.0.0
doi: 10.5281/??
date-released: 2022.06.09
url: "https://github.com/hannbus/ChemIcal_DatasEt_compaRator"
license: MIT

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Dependencies

setup.py pypi
  • FPDF ==1.7.2
  • chemplot ==1.2.0
  • matplotlib ==3.5.1
  • matplotlib_venn ==0.11.6
  • notebook *
  • rdkit-pypi *
  • seaborn ==0.11.2
.github/workflows/main.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
docs/requirements.txt pypi
  • FPDF ==1.7.2
  • IPython *
  • bokeh ==2.4.3
  • chemplot ==1.2.0
  • matplotlib ==3.5.1
  • matplotlib_venn ==0.11.6
  • nbsphinx *
  • rdkit-pypi *
  • seaborn ==0.11.2
  • sphinx-autodoc-typehints *
  • sphinx_rtd_theme *