dimorphite-dl

Adds or removes hydrogen atoms to achieve the appropriate molecular protonation state for a user-specified pH range

https://github.com/durrantlab/dimorphite_dl

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 12 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.2%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
Statistics
  • Stars: 39
  • Watchers: 2
  • Forks: 10
  • Open Issues: 1
  • Releases: 10
Created almost 3 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License Code of conduct Codeowners Zenodo

README.md

dimorphite_dl

Adds hydrogen atoms to molecular representations as specified by pH

Archived DOI: https://doi.org/10.5281/zenodo.15486131

Dimorphite-DL is a fast, accurate, accessible, and modular open-source program designed for enumerating small-molecule ionization states. It specifically adds or removes hydrogen atoms from molecular representations to achieve the appropriate protonation state for a user-specified pH range.

Accurate protonation states are crucial in cheminformatics and computational drug discovery, as a molecule's ionization state significantly impacts its physicochemical properties, biological activity, and interactions with targets. Dimorphite-DL addresses this by providing a robust solution for preparing molecules for various downstream applications like docking, molecular dynamics, and virtual screening.

Installation

You can install the latest released version from PyPI with the following command.

```bash
pip install dimorphite_dl
```

Alternatively, install the latest development version from the main branch on GitHub. Note the `git+` prefix, which pip requires to install from a Git repository URL.

```bash
pip install git+https://github.com/durrantlab/dimorphite_dl.git
```

Usage

CLI

The command-line interface (dimorphite_dl) provides straightforward access to Dimorphite-DL's functionalities.

Positional Arguments:

  • SMI: SMILES string or path to a file containing SMILES strings to protonate.

Options:

  • --ph_min MIN: Minimum pH to consider (default: 6.4).
  • --ph_max MAX: Maximum pH to consider (default: 8.4).
  • --precision PRE: pKa precision factor, representing the number of standard deviations from the mean pKa to consider when determining ionization states (default: 1.0).
  • --output_file FILE: Optional path to a file to write the protonated SMILES results.
  • --max_variants MXV: Limits the number of protonation variants generated per input compound (default: 128).
  • --label_states: If set, output SMILES will be labeled with their target ionization state ("DEPROTONATED", "PROTONATED", or "BOTH").
  • --log_level: Enable logging and set the level. Can be none, debug, info, warning, error, or critical. Defaults to no logging.
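To make the interaction between `--ph_min`, `--ph_max`, and `--precision` concrete, here is a minimal sketch of how a single ionizable site could be assigned one of the three labels listed above. It assumes each site has a mean pKa and a standard deviation, and that `--precision` scales that standard deviation into a window around the mean. The function name `classify_site` and the pKa values in the usage note are hypothetical illustrations, not Dimorphite-DL's actual code or data.

```python
def classify_site(
    pka_mean: float,
    pka_std: float,
    ph_min: float = 6.4,
    ph_max: float = 8.4,
    precision: float = 1.0,
) -> str:
    """Return a target-state label for one ionizable site (illustrative only)."""
    # Widen the mean pKa into a window of +/- `precision` standard deviations.
    pka_low = pka_mean - precision * pka_std
    pka_high = pka_mean + precision * pka_std

    # The pKa window overlaps the pH range: both forms are plausible.
    if pka_low <= ph_max and ph_min <= pka_high:
        return "BOTH"
    # The pKa window lies entirely above the pH range: the site keeps its proton.
    if pka_low > ph_max:
        return "PROTONATED"
    # Otherwise the pKa window lies entirely below the pH range.
    return "DEPROTONATED"
```

With the default pH range, a hypothetical site with mean pKa 9.0 and standard deviation 0.5 is labeled "PROTONATED" at `precision=1.0` but "BOTH" at `precision=2.0`, because the wider window (8.0 to 10.0) now overlaps the range 6.4 to 8.4. This is why raising the precision factor tends to produce more variants per molecule.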

Examples

Protonate molecules from a file:

```bash
dimorphite_dl sample_molecules.smi
```

Protonate a single SMILES string within a specific pH range:

```bash
dimorphite_dl --ph_min -3.0 --ph_max -2.0 "CCC(=O)O"
```

Protonate a SMILES string and save output to a file:

```bash
dimorphite_dl --ph_min -3.0 --ph_max -2.0 --output_file output.smi "CCCN"
```

Protonate molecules from a file with increased pKa precision and state labels:

```bash
dimorphite_dl --precision 2.0 --label_states sample_molecules.smi
```
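If you need to drive the command-line tool from another program, the following sketch assembles the argument list from the options documented above and runs it with `subprocess`. The helper names `build_command` and `run_cli` are hypothetical, and the sketch assumes that results are printed to standard output when `--output_file` is omitted.

```python
import subprocess
from typing import Optional


def build_command(
    smi: str,
    ph_min: float = 6.4,
    ph_max: float = 8.4,
    precision: float = 1.0,
    max_variants: int = 128,
    label_states: bool = False,
    output_file: Optional[str] = None,
) -> list[str]:
    """Assemble a dimorphite_dl argument list from the documented options."""
    cmd = [
        "dimorphite_dl",
        "--ph_min", str(ph_min),
        "--ph_max", str(ph_max),
        "--precision", str(precision),
        "--max_variants", str(max_variants),
    ]
    if label_states:
        cmd.append("--label_states")
    if output_file is not None:
        cmd += ["--output_file", output_file]
    # The positional SMI argument (SMILES string or file path) goes last.
    cmd.append(smi)
    return cmd


def run_cli(smi: str, **options) -> str:
    """Run the CLI and return whatever it wrote to standard output."""
    result = subprocess.run(
        build_command(smi, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, `run_cli("CCC(=O)O", ph_min=6.8, ph_max=7.9)` would invoke the installed `dimorphite_dl` executable. Passing the arguments as a list rather than a shell string avoids quoting problems with parentheses and other special characters in SMILES.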

Scripting

Dimorphite-DL can be easily integrated into your Python scripts. The primary function for this is protonate_smiles from dimorphite_dl.protonate.

```python
from dimorphite_dl import protonate_smiles

# Protonate a single SMILES string with custom pH range and precision
protonated_mol_1: list[str] = protonate_smiles(
    "CCC(=O)O", ph_min=6.8, ph_max=7.9, precision=0.5
)
print(f"Protonated 'CCC(=O)O': {protonated_mol_1}")

# Protonate a list of SMILES strings
protonated_mol_list: list[str] = protonate_smiles(["CCC(=O)O", "CCCN"])
print(f"Protonated list: {protonated_mol_list}")

# Protonate molecules from a SMILES file
# Make sure '~/example.smi' exists and contains SMILES strings
protonated_from_file: list[str] = protonate_smiles("~/example.smi")
print(f"Protonated from file: {protonated_from_file}")

# Example with labeling states and limiting variants
protonated_labeled: list[str] = protonate_smiles(
    "C1CCCCC1C(=O)O", ph_min=7.0, ph_max=7.4, label_states=True, max_variants=5
)
print(f"Protonated with labels: {protonated_labeled}")
```

Known issues

Dimorphite-DL is designed to handle the vast majority of ionizable functional groups accurately, but in some edge cases the current SMARTS patterns and pKa assignments do not behave as expected. Be aware of the following known limitations when working with these substructures:

  • Tertiary Amides: Tertiary amides (e.g., N-acetylpiperidine, CC(=O)N1CCCCC1) are incorrectly treated as basic amines (pKa ~8) instead of neutral species, because the current amide SMARTS patterns require an N-H bond.
  • Indoles and Pyrroles: These heterocycles are correctly deprotonated around pH 14.5, but they are not protonated under extremely acidic conditions (pH around -3.5), where protonation would be expected.

Development

We use pixi to manage Python environments and simplify the developer workflow. Once you have pixi installed, move into the dimorphite_dl directory (e.g., cd dimorphite_dl) and install the environment with

```bash
pixi install
```

Now you can activate the new virtual environment using

```sh
pixi shell
```

Citation

If you use Dimorphite-DL in your research, please cite:

Ropp PJ, Kaminsky JC, Yablonski S, Durrant JD (2019) Dimorphite-DL: An open-source program for enumerating the ionization states of drug-like small molecules. J Cheminform 11:14. doi: 10.1186/s13321-019-0336-9.

License

This project is released under the Apache-2.0 License as specified in LICENSE.md.

Owner

  • Name: Jacob D. Durrant
  • Login: durrantlab
  • Kind: user
  • Company: University of Pittsburgh

Dr. Jacob D. Durrant, PhD, is an associate professor of Biological Sciences at the University of Pittsburgh.

GitHub Events

Total
  • Create event: 6
  • Release event: 2
  • Issues event: 7
  • Watch event: 21
  • Delete event: 9
  • Member event: 2
  • Issue comment event: 12
  • Push event: 28
  • Pull request event: 3
  • Fork event: 3
Last Year
  • Create event: 6
  • Release event: 2
  • Issues event: 7
  • Watch event: 21
  • Delete event: 9
  • Member event: 2
  • Issue comment event: 12
  • Push event: 28
  • Pull request event: 3
  • Fork event: 3

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 5
  • Total pull requests: 2
  • Average time to close issues: 10 months
  • Average time to close pull requests: 7 months
  • Total issue authors: 5
  • Total pull request authors: 2
  • Average comments per issue: 2.2
  • Average comments per pull request: 2.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 1
  • Average time to close issues: 5 months
  • Average time to close pull requests: 4 minutes
  • Issue authors: 4
  • Pull request authors: 1
  • Average comments per issue: 2.5
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • willfinnigan (1)
  • phonglam3103 (1)
  • lucyraven (1)
  • mabdulhameed (1)
  • redaschi (1)
Pull Request Authors
  • aalexmmaldonado (1)
  • amorehead (1)
Top Labels
Issue Labels
enhancement (1) bug (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 2,923 last-month
  • Total dependent packages: 1
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 12
  • Total maintainers: 1
pypi.org: dimorphite-dl

Adds hydrogen atoms to molecular representations as specified by pH

  • Versions: 11
  • Dependent Packages: 1
  • Dependent Repositories: 1
  • Downloads: 2,923 Last month
Rankings
Dependent packages count: 3.2%
Downloads: 17.7%
Average: 18.0%
Dependent repos count: 21.6%
Forks count: 22.6%
Stargazers count: 25.0%
Maintainers (1)
Last synced: 10 months ago
conda-forge.org: dimorphite-dl
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 34.0%
Dependent packages count: 51.2%
Average: 51.6%
Stargazers count: 60.1%
Forks count: 61.1%
Last synced: 10 months ago

Dependencies

.github/workflows/codecov.yml actions
  • actions/checkout v4 composite
  • codecov/codecov-action v5 composite
  • prefix-dev/setup-pixi v0.8.8 composite
.github/workflows/docs.yml actions
  • actions/checkout v4 composite
  • actions/configure-pages v5 composite
  • actions/deploy-pages v4 composite
  • actions/upload-pages-artifact v3 composite
  • prefix-dev/setup-pixi v0.8.8 composite
.github/workflows/tests.yml actions
  • actions/checkout v4 composite
  • prefix-dev/setup-pixi v0.8.8 composite
pyproject.toml pypi
  • loguru >=0.7.2,<0.8
  • rdkit >=2020.3.3,<2026