Scientific Software
Updated 6 months ago

BetaML — Peer-reviewed • Rank 10.5 • Science 95%

BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)

Updated 6 months ago

pypots • Rank 21.6 • Science 77%

A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values

Updated 6 months ago

synthpred • Rank 0.7 • Science 67%

A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.

Updated 6 months ago

birdie • Rank 8.6 • Science 59%

Bayesian Instrumental Regression for Disparity Estimation

Updated 6 months ago

mifa • Rank 8.0 • Science 49%

An R package providing multiple imputation of covariance matrices in order to perform factor analysis.

Updated 6 months ago

missRanger • Rank 17.1 • Science 39%

Fast multivariate imputation by random forests.

Updated 6 months ago

mitml • Rank 15.8 • Science 36%

Tools for multiple imputation in multilevel modeling

Updated 6 months ago

HIBAG • Rank 14.7 • Science 26%

R package – HLA Genotype Imputation with Attribute Bagging (development version only)

Updated 6 months ago

icellr • Rank 16.2 • Science 23%

Single (i) Cell R package (iCellR) is an interactive R package to work with high-throughput single cell sequencing technologies (e.g., scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST)).

Updated 6 months ago

datawig • Rank 13.8 • Science 23%

Imputation of missing values in tables.

Updated 6 months ago

ncdssm • Rank 4.0 • Science 28%

PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series".

Updated 6 months ago

disc • Rank 8.3 • Science 23%

Highly scalable and accurate inference of gene expression and structure for single-cell transcriptomes using semi-supervised deep learning.

Updated 6 months ago

yaImpute • Rank 12.4 • Science 13%

Nearest neighbor-based imputation on multivariate data

Updated 6 months ago

RfEmpImp • Rank 7.5 • Science 10%

Multiple Imputation using Chained Random Forests

Updated 6 months ago

saits • Science 67%

The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516

Updated 6 months ago

https://github.com/exascaleinfolab/imputegap • Science 36%

ImputeGAP: A library of Imputation Techniques for Time Series Data

Updated 6 months ago

tsdb • Science 67%

A Python toolbox that loads 172 public time series datasets for machine/deep learning with a single line of code. The datasets come from multiple domains, including healthcare, finance, power, traffic, and weather, among others.

Updated 6 months ago

pygrinder • Science 67%

PyGrinder: a Python toolkit for grinding data beans into the incomplete for real-world data simulation by introducing missing values with different missingness patterns, including MCAR (missing completely at random), MAR (missing at random), MNAR (missing not at random), subsequence missing, and block missing.

Updated 6 months ago

localFDA • Science 10%

Localization processes for functional data analysis. Software companion for the paper “Localization processes for functional data analysis” by Elías, A., Jiménez, R., and Yukich, J. (2020)

Updated 6 months ago

multimput • Science 67%

multimput is an R package that assists with analysing datasets with missing values using multiple imputation.

Updated 6 months ago

swansf-datapreprocessing-sampling-notebooks • Science 57%

These notebooks provide a comprehensive workflow, from start to finish, for processing and analyzing the SWAN-SF dataset. They include detailed steps for reading the dataset files, performing full preprocessing, and executing classification.

Updated 6 months ago

jamie • Science 36%

Joint variational Autoencoders for Multimodal Imputation and Embedding (JAMIE)

Updated 6 months ago

phaseimpute • Science 44%

A bioinformatics pipeline to phase and impute genetic data

Updated 6 months ago

trefle • Science 23%

Imputing the mammalian virome with the LF-SVD model

Updated 6 months ago

nf-gwas • Science 65%

A Nextflow pipeline to perform state-of-the-art genome-wide association studies.