BetaML
BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)
pypots
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
synthpred
A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
mifa
An R package providing multiple imputation of covariance matrices in order to perform factor analysis.
JointAI
Joint Analysis and Imputation of generalized linear models and linear mixed models with missing values
HIBAG
R package – HLA Genotype Imputation with Attribute Bagging (development version only)
icellr
Single (i) Cell R package (iCellR) is an interactive R package to work with high-throughput single cell sequencing technologies (i.e., scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST)).
ncdssm
PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series".
disc
Highly scalable and accurate inference of gene expression and structure for single-cell transcriptomes using semi-supervised deep learning.
saits
The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516
https://github.com/exascaleinfolab/imputegap
ImputeGAP: A library of Imputation Techniques for Time Series Data
tsdb
A Python toolbox that loads 172 public time series datasets for machine/deep learning with a single line of code. Datasets span multiple domains, including healthcare, finance, power, traffic, and weather.
pygrinder
PyGrinder: a Python toolkit for grinding data beans into the incomplete for real-world data simulation by introducing missing values with different missingness patterns, including MCAR (missing completely at random), MAR (missing at random), MNAR (missing not at random), subsequence missing, and block missing
localFDA
Localization processes for functional data analysis. Software companion for the paper “Localization processes for functional data analysis” by Elías, A., Jiménez, R., and Yukich, J. (2020)
multimput
multimput is an R package that assists with analysing datasets with missing values using multiple imputation.
swansf-datapreprocessing-sampling-notebooks
These notebooks provide a comprehensive workflow, from start to finish, for processing and analyzing the SWAN-SF dataset. They include detailed steps for reading the dataset files, performing full preprocessing, and executing classification.
jamie
Joint variational Autoencoders for Multimodal Imputation and Embedding (JAMIE)
https://github.com/arvkevi/openhumansimputer
Imputation pipeline for Open Humans
nf-gwas
A Nextflow pipeline to perform state-of-the-art genome-wide association studies.