Bacting
Bacting: a next generation, command line version of Bioclipse - Published in JOSS (2021)
Project RACCOON
Project RACCOON: Automated construction of PDB files for polymers and polymer peptide conjugates - Published in JOSS (2024)
cheminformatics-microservice
A set of microservices, accessed via API calls, that support cheminformatics.
lindemann
lindemann is a Python package to calculate the Lindemann index of a LAMMPS trajectory
onglai-classify-homologues
OngLai: A cheminformatics algorithm to classify homologous chemical series
https://github.com/lukasturcani/stk
A Python library which allows construction and manipulation of complex molecules, as well as automatic molecular design and the creation of molecular databases.
thermo
Thermodynamics and Phase Equilibrium component of Chemical Engineering Design Library (ChEDL)
chromatographr
Toolset for the reproducible analysis of chromatography data in R (HPLC-DAD/UV, GC-FID).
patentchem
Downloads USPTO patents and finds molecules related to keyword queries
structure-seer
The implementation, training and evaluation of a Structure Seer machine learning model designed for reconstruction of adjacency of a molecular graph from the labelling of its nodes.
https://github.com/althonos/pubchem.rs
Rust data structures and client for the PubChem REST API
moldrug
moldrug (AKA mouse) is a Python package for drug-oriented optimization on the chemical space
https://github.com/biopragmatics/biolookup
🔍 The Biolookup Service retrieves metadata and ontological information about biomedical entities.
https://github.com/chiang-yuan/llamp
A web app and Python API for a multi-modal RAG framework that grounds LLMs on high-fidelity materials informatics. An agentic materials scientist powered by @materialsproject, @langchain-ai, and @openai
https://github.com/awslabs/dgl-lifesci
Python package for graph neural networks in chemistry and biology
https://github.com/aspuru-guzik-group/dionysus
For analysis of calibration, performance, and generalizability of probabilistic models on small molecular datasets. Paper on RSC Digital Discovery: https://pubs.rsc.org/en/content/articlehtml/2023/dd/d2dd00146b
https://github.com/cdk/cdk-paper-3
Repository with the LaTeX source code for the CDK III paper.
alinemol
Exploring the performance of machine learning models on out-of-distribution data in the chemical domain
https://github.com/labimotion/labimotion
LabIMotion: Where Adaptability Meets Innovation in Scientific Solutions.
mpox-kg
Source code and data repository for the paper titled "Monkeypox Knowledge Graph: A comprehensive representation embedding chemical entities and associated biology of Monkeypox"
https://github.com/a-r-j/drugtranslator
R package for translating between drug identifiers using the Chemical Translation Service (CTS)
coconut
COCONUT (COlleCtion of Open Natural prodUcTs): A comprehensive platform facilitating natural product research by providing data, tools, and services for deposition, curation, and reuse.
https://github.com/cbouy/molhighlighter
Multicolored substructure highlights made easy
https://github.com/cdk/cdkbook
Groovy Cheminformatics with the Chemistry Development Kit
https://github.com/pierrehirel/atomsk
Atomsk: A Tool For Manipulating And Converting Atomic Data Files
dftbonddependency
Repository to calculate bond-based corrections to reaction energies from low-level DFT
https://github.com/cvigilv/simspread
De novo target prediction by chemical similarity-guided network-based inference
cloud-surge
Massively parallel computation of chemical spaces in cloud environments with surge
SimSpread
SimSpread is a novel approach for predicting interactions between two distinct sets of nodes, query and target nodes, using a similarity measure vector between query nodes as a meta-description in combination with network-based inference for link prediction.
https://github.com/biocore/q2-qemistree
Hierarchical orderings for mass spectrometry data. Canonically pronounced "chemis-tree".
https://github.com/cdk/cdk-paper-2
The green Open Access version of the second CDK paper.
global-chem
A Knowledge Graph of Common Chemical Names to their Molecular Definition
rna-ligand-based
Evaluation of ligand-based methods applied to RNA and DNA targets
BESMARTS
BESMARTS: A toolkit for data-driven force field design using binary-encoded SMARTS - Published in JOSS (2025)
https://github.com/cthoyt/cthoyt.github.io
My personal website, served at https://cthoyt.com