PyCM
PyCM: Multiclass confusion matrix library in Python - Published in JOSS (2018)
PerMetrics
PerMetrics: A Framework of Performance Metrics for Machine Learning Models - Published in JOSS (2024)
mlr3
mlr3: A modern object-oriented machine learning framework in R - Published in JOSS (2019)
TAXPASTA
TAXPASTA: TAXonomic Profile Aggregation and STAndardisation - Published in JOSS (2023)
seesus
seesus: a social, environmental, and economic sustainability classifier for Python - Published in JOSS (2024)
mcboost
mcboost: Multi-Calibration Boosting for R - Published in JOSS (2021)
pactus
pactus: A Python framework for trajectory classification - Published in JOSS (2023)
BetaML
BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)
Scikit-Longitudinal
Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python - Published in JOSS (2025)
Annotate-Lab
Annotate-Lab: Simplifying Image Annotation - Published in JOSS (2024)
Imbalance
Imbalance: A comprehensive multi-interface Julia toolbox to address class imbalance - Published in JOSS (2024)
treemaker
treemaker: A Python tool for constructing a Newick formatted tree from a set of classifications. - Published in JOSS (2018)
pypots
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
RENT
RENT: A Python Package for Repeated Elastic Net Feature Selection - Published in JOSS (2021)
LightTwinSVM
LightTwinSVM: A Simple and Fast Implementation of Standard Twin Support Vector Machine Classifier - Published in JOSS (2019)
flaml
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
proglearn
NeuroData's package for exploring and using progressive learning algorithms
mlni
Machine Learning in NeuroImaging (MLNI) is a Python package that performs various tasks using neuroimaging data.
ammico
AI-based Media and Misinformation Content Analysis Tool: Analyze text and images
reflame
reflame: Revolutionizing Functional Link Neural Network by Metaheuristic Optimization
sdtf
Exploring streaming options for decision trees and random forests. Based on scikit-learn fork.
inference
Turn any computer or edge device into a command center for your computer vision projects.
stars-carla-experiments
This repository analyzes driving data recorded with the Carla Simulator using the STARS framework.
mapie
A scikit-learn-compatible library for estimating prediction intervals and controlling risks, based on conformal predictions.
rastervision
An open source library and framework for deep learning on satellite and aerial imagery.
annif
Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
metacluster
MetaCluster: An Open-Source Python Library for Metaheuristic-based Clustering Problems
labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
immuneml
immuneML is a platform for machine learning analysis of adaptive immune receptor repertoire data.
triton
Scripps Whale Acoustics Lab and Scripps Acoustic Ecology Lab - Triton with remoras in development
https://github.com/timeseriesai/tsai
State-of-the-art deep learning library for time series and sequences in PyTorch / fastai
pyani
Application and Python module for average nucleotide identity analyses of microbes.
pytextclassifier
pytextclassifier is a toolkit for text classification. It implements classification models such as LR, XGBoost, TextCNN, FastText, TextRNN, and BERT, ready to use out of the box.
taxprofiler
Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
@stdlib/utils-native-class
Determine the specification-defined classification of an object.
llm-lct-sequencing
AI Semantic Insights: LLM Toolkit for Analysing Educational Practices and Knowledge Building.
classeval
Evaluation of supervised predictions for two-class and multi-class classifiers
@stdlib/ml-incr-binary-classification
Incrementally perform binary classification using stochastic gradient descent (SGD).
x-anylabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
https://github.com/cair/pytsetlinmachine
Implements the Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, Weighted Tsetlin Machine, and Embedding Tsetlin Machine, with support for continuous features, multigranularity, clause indexing, and literal budget
WaveBreaking
Detect, classify, and track Rossby Wave Breaking (RWB) in weather and climate data.
taxtriage
TaxTriage is a Nextflow workflow designed to agnostically identify and classify microbial organisms within short- or long-read metagenomic NGS data. This flexible tool was developed with various use-cases of mNGS in mind.
MLJ
MLJ: A Julia package for composable machine learning - Published in JOSS (2020)
https://github.com/csinva/tree-prompt
Tree prompting: easy-to-use scikit-learn interface for improved prompting.
psoanalysis
Numerical analysis of Particle Swarm Optimization (PSO) and numerical experiments demonstrating the practicability of the method
automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
https://github.com/bluegreen-labs/foto
The FOTO (Fourier Transform Textural Ordination) R package.
projection-pursuit
An implementation of multivariate projection pursuit regression and univariate classification
stylest
R package for estimating speaker style distinctiveness in texts. Install it from CRAN!
catsim
A structural similarity index for binary or categorical images that works in 2D or 3D.
3d-forest
Visualization, processing and analysis of Lidar point clouds, mainly focused on forest environment. New version of 3D Forest. Process files with terabytes of data. Edit new point attributes. Simple addition of new features by plugins.
https://github.com/cair/pytsetlinmachinecuda
Massively Parallel and Asynchronous Architecture for Logic-based AI
https://github.com/gperdrizet/ensembleset
Ensemble dataset generator for tabular data prediction and modeling projects.
https://github.com/cair/pytsetlinmachineparallel
Multi-threaded implementation of the Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, and Weighted Tsetlin Machine, with support for continuous features and multigranularity.
https://github.com/brucewlee/lama-music-genre-dataset
.wav files, training dataset (MFCC), and graph plots (FFTs, MFCCs, waveforms) from Latin America, Asia, the Middle East, and Africa
https://github.com/alibaba/alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
https://github.com/alok-ai-lab/mrep-deepinsight
Multiple Representation DeepInsight technique
plasflow
Software for prediction of plasmid sequences in metagenomic assemblies
rfUtilities
R package for random forests model selection, inference, evaluation and validation
pycoal
Python toolkit for characterizing Coal and Open-pit surface mining impacts on American Lands
oasis
A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling).
https://github.com/ai-forever/ner-bert
BERT-NER (ner-bert) with Google BERT https://github.com/google-research.
https://github.com/alok-ai-lab/deepinsight3d_pkg
DeepInsight3D package to deal with multi-omics or multi-layered data
https://github.com/crypdick/precision-recall-gain
Precision-recall-gain for Python
https://github.com/bblodfon/ml-course-2022
Benchmarking ML classification models on spam dataset
PAGFL
Simultaneously identify latent group structures and estimate group-specific coefficients in (time-varying) panel data models
asaca-automatic-speech-analysis-for-cognitive-assessment
Transform speech into cognitive assessments with ASACA. Achieve accurate predictions and low error rates using our end-to-end toolkit. 🚀🔧
pyntbci
The Python Noise-Tagging Brain-Computer interface (PyntBCI) library is a Python toolbox for the noise-tagging brain-computer interfacing (BCI) project developed at the Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
https://github.com/fasttrees/fasttrees
A fast and frugal tree classifier for sklearn