hdbscan
hdbscan: Hierarchical density based clustering - Published in JOSS (2017)
Talisman
Talisman: a JavaScript archive of fuzzy matching, information retrieval and record linkage building blocks - Published in JOSS (2020)
PyClustering
PyClustering: Data Mining Library - Published in JOSS (2019)
UralicNLP
UralicNLP: An NLP Library for Uralic Languages - Published in JOSS (2019)
TopoPyScale
TopoPyScale: A Python Package for Hillslope Climate Downscaling - Published in JOSS (2023)
BetaML
BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)
ClusterValidityIndices.jl
ClusterValidityIndices.jl: Batch and Incremental Metrics for Unsupervised Learning - Published in JOSS (2022)
OpenRepGrid.ic
OpenRepGrid.ic: A software for Interpretive Clustering - Published in JOSS (2023)
HiPart
HiPart: Hierarchical Divisive Clustering Toolbox - Published in JOSS (2023)
geocmeans
geocmeans: An R package for spatial fuzzy c-means - Published in JOSS (2023)
Persistable
Persistable: persistent and stable clustering - Published in JOSS (2023)
usearch
Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
TimeSeriesClustering
TimeSeriesClustering: An extensible framework in Julia - Published in JOSS (2019)
Local clustering
Local clustering - Published in JOSS (2018)
partycls
partycls: A Python package for structural clustering - Published in JOSS (2021)
visxhclust
visxhclust: An R Shiny package for visual exploration of hierarchical clustering - Published in JOSS (2022)
seg1d
seg1d: A Python package for Automated segmentation of one-dimensional (1D) data - Published in JOSS (2020)
dtaidistance
Time series distances: Dynamic Time Warping (fast DTW implementation in C)
pypots
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
napari-clusters-plotter
A napari plugin for clustering objects according to their properties.
PyCVI
PyCVI: A Python package for internal Cluster Validity Indices, compatible with time-series data - Published in JOSS (2024)
ALPPACA - A tooL for Prokaryotic Phylogeny And Clustering Analysis
ALPPACA - A tooL for Prokaryotic Phylogeny And Clustering Analysis - Published in JOSS (2022)
gmm_diag and gmm_full
gmm_diag and gmm_full: C++ classes for multi-threaded Gaussian mixture models and Expectation-Maximisation - Published in JOSS (2017)
networkanalysis-ts
TypeScript port of the Java networkanalysis package that provides data structures and algorithms for network analysis.
mlni
Machine Learning in NeuroImaging (MLNI) is a Python package that performs various tasks using neuroimaging data.
dedupe
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
publicationclassification
Java package for creating a multi-level classification of scientific publications based on citation links between publications.
uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
vosviewer-online
VOSviewer Online is a tool for network visualization. It is a web-based version of VOSviewer, a popular tool for constructing and visualizing bibliometric networks.
clusttraj
Python script that receives a molecular dynamics or Monte Carlo trajectory and performs agglomerative clustering to classify similar structures.
work-set-clustering
A Python script to perform a clustering based on descriptive keys.
tsam
Time series aggregation module (tsam). Determines typical operation periods or decreases the temporal resolution. Accelerates model or experiment runs.
radius_clustering
Source code repository of the Radius clustering Python package.
osmogrid
OSMoGrid is a tool to generate lifelike electrical grid models based on publicly available data, mainly OpenStreetMap. The tool puts a special focus on low voltage grids.
clustimage
clustimage is a Python package for unsupervised clustering of images.
minisom
🔴 MiniSom is a minimalistic implementation of the Self Organizing Maps
https://github.com/barahona-research-group/pygenstability
PyGenStability: Multiscale community detection with generalized Markov Stability
d3heatmap
d3heatmap is a Python package to create interactive heatmaps based on d3js.
ClusterAnalysis
Cluster Algorithms from Scratch with Julia Lang. (K-Means and DBSCAN)
https://github.com/alleninstitute/scrattch.hicat
Hierarchical, iterative clustering for analysis of transcriptomics data in R
k-means-constrained
K-Means clustering - constrained with minimum and maximum cluster size. Documentation: https://joshlk.github.io/k-means-constrained
mixmod
Supervised, unsupervised and semi-supervised classification with mixture modelling
SGCP
SGCP: a spectral self-learning method for clustering genes in co-expression networks
genieclust
Genie: Fast and Robust Hierarchical Clustering with Noise Point Detection - in Python and R
iSEE
R/shiny interface for interactive visualization of data in SummarizedExperiment objects
MLJ
MLJ: A Julia package for composable machine learning - Published in JOSS (2020)
unsupervised_analysis
A general purpose Snakemake workflow and MrBiomics module to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.
ExpectationMaximization
A simple but generic implementation of Expectation Maximization algorithms to fit mixture models.
publicationclassificationlabeling
Java package for obtaining labels for clusters of scientific publications.
mob-suite
MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies
ddPCRclust
A package for automated quantification of non-orthogonal, multiplexed ddPCR data
aggmap
Jigsaw-like AggMap: A Robust and Explainable Multi-Channel Omics Deep Learning Tool
dbscan
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms - R package
digitalcellsorter
Digital Cell Sorter (DCS): single cell RNA-seq analysis toolkit. Documentation:
deepgmm
This package provides a mixture-model-based approach to deep learning.
https://github.com/matrix-profile-foundation/matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
icellr
Single (i) Cell R package (iCellR) is an interactive R package to work with high-throughput single cell sequencing technologies (i.e., scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST)).
aricode
R package for computation of the (adjusted) Rand index and other such scores
https://github.com/cbg-ethz/jnotype
Probabilistic modeling of high-dimensional binary data in JAX
cica
Code repository of the R package Clusterwise Independent Component Analysis
GMCM
Unsupervised Clustering and Meta-analysis using Gaussian Mixture Copula Models
gap-stat
Dynamically get the suggested clusters in the data for unsupervised learning.
apple-ocr
Easy-to-Use Apple Vision wrapper for text extraction, scalar representation and clustering using K-means.