Scientific Software
Updated 10 months ago

BetaML — Peer-reviewed • Rank 10.5 • Science 95%

BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)

Updated 10 months ago

pypots • Rank 21.6 • Science 77%

A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values

Updated 10 months ago

synthpred • Rank 0.7 • Science 67%

A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.

Updated 9 months ago

birdie • Rank 8.6 • Science 59%

Bayesian Instrumental Regression for Disparity Estimation

Updated 9 months ago

mifa • Rank 8.0 • Science 49%

An R package providing multiple imputation of covariance matrices in order to perform factor analysis.

Updated 9 months ago

missRanger • Rank 17.1 • Science 39%

Fast multivariate imputation by random forests.

Updated 9 months ago

mitml • Rank 15.8 • Science 36%

Tools for multiple imputation in multilevel modeling

Updated 9 months ago

HIBAG • Rank 14.7 • Science 26%

R package – HLA Genotype Imputation with Attribute Bagging (development version only)

Updated 9 months ago

icellr • Rank 16.2 • Science 23%

Single (i) Cell R package (iCellR) is an interactive R package for working with high-throughput single-cell sequencing technologies (i.e., scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST)).

Updated 9 months ago

datawig • Rank 13.8 • Science 23%

Imputation of missing values in tables.

Updated 10 months ago

ncdssm • Rank 4.0 • Science 28%

PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series".

Updated 9 months ago

disc • Rank 8.3 • Science 23%

A highly scalable and accurate inference of gene expression and structure for single-cell transcriptomes using semi-supervised deep learning.

Updated 9 months ago

yaImpute • Rank 12.4 • Science 13%

Nearest neighbor-based imputation on multivariate data

Updated 9 months ago

RfEmpImp • Rank 7.5 • Science 10%

Multiple Imputation using Chained Random Forests

Updated 10 months ago

trefle • Science 23%

Imputing the mammalian virome with the LF-SVD model

Updated 9 months ago

localFDA • Science 10%

Localization processes for functional data analysis. Software companion for the paper “Localization processes for functional data analysis” by Elías, A., Jiménez, R., and Yukich, J. (2020)

Updated 10 months ago

jamie • Science 36%

Joint variational Autoencoders for Multimodal Imputation and Embedding (JAMIE)

Updated 9 months ago

https://github.com/exascaleinfolab/imputegap • Science 36%

ImputeGAP: A library of Imputation Techniques for Time Series Data

Updated 10 months ago

tsdb • Science 67%

A Python toolbox that loads 172 public time series datasets for machine/deep learning with a single line of code. The datasets come from multiple domains, including healthcare, finance, power, traffic, and weather.

Updated 10 months ago

phaseimpute • Science 44%

A bioinformatics pipeline to phase and impute genetic data

Updated 10 months ago

pygrinder • Science 67%

PyGrinder: a Python toolkit for grinding data beans into the incomplete for real-world data simulation by introducing missing values with different missingness patterns, including MCAR (missing completely at random), MAR (missing at random), MNAR (missing not at random), subsequence missing, and block missing
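
To illustrate two of the missingness patterns named above, here is a minimal NumPy sketch. It does not use PyGrinder's own API; the function names `mcar_mask` and `block_missing` and their parameters are hypothetical, chosen only for this example. It assumes missing values are represented as NaN in a float array.

```python
import numpy as np


def mcar_mask(x, rate, seed=0):
    """Return a float copy of x in which each entry is independently
    set to NaN with probability `rate` (missing completely at random).

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    rng = np.random.default_rng(seed)
    out = np.asarray(x, dtype=float).copy()
    # Each entry is dropped independently of its value and position.
    drop = rng.random(out.shape) < rate
    out[drop] = np.nan
    return out


def block_missing(x, start, length):
    """Return a float copy of x with rows start..start+length-1 set to NaN,
    imitating a contiguous outage (block missing).

    Hypothetical helper for illustration; not part of any library.
    """
    out = np.asarray(x, dtype=float).copy()
    out[start:start + length] = np.nan
    return out


if __name__ == "__main__":
    data = np.arange(20.0).reshape(10, 2)  # 10 time steps, 2 features
    print(mcar_mask(data, rate=0.3, seed=1))
    print(block_missing(data, start=2, length=3))
```

The MCAR helper draws one uniform random number per entry, so the realized missing fraction only approximates `rate` on small arrays. MAR and MNAR patterns would additionally condition the drop probability on observed or unobserved values, which this sketch does not attempt.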

Updated 10 months ago

multimput • Science 67%

multimput is an R package that assists with analysing datasets with missing values using multiple imputation.

Updated 10 months ago

swansf-datapreprocessing-sampling-notebooks • Science 57%

These notebooks provide a comprehensive workflow, from start to finish, for processing and analyzing the SWAN-SF dataset. They include detailed steps for reading the dataset files, performing full preprocessing, and executing classification.

Updated 10 months ago

nf-gwas • Science 65%

A Nextflow pipeline to perform state-of-the-art genome-wide association studies.

Updated 10 months ago

saits • Science 67%

The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516