PyCM
PyCM: Multiclass confusion matrix library in Python - Published in JOSS (2018)
PyAutoFit
PyAutoFit: A Classy Probabilistic Programming Language for Model Composition and Fitting - Published in JOSS (2021)
scikit-posthocs
scikit-posthocs: Pairwise multiple comparison tests in Python - Published in JOSS (2019)
Visualizations with statistical details
Visualizations with statistical details: The 'ggstatsplot' approach - Published in JOSS (2021)
kalepy
kalepy: a Python package for kernel density estimation, sampling and plotting - Published in JOSS (2021)
Statmanager-kr
Statmanager-kr: A User-friendly Statistical Package for Python in Pandas - Published in JOSS (2024)
POSSA
POSSA: Power simulation for sequential analyses and multiple hypotheses - Published in JOSS (2022)
pyerrors
Error propagation and statistical analysis for Monte Carlo simulations in lattice QCD and statistical mechanics using autograd.
infomeasure
Python package for calculating various information measures, including entropy, mutual information, transfer entropy, and more, with support for both discrete and continuous variables.
cmstatr
cmstatr: An R Package for Statistical Analysis of Composite Material Data - Published in JOSS (2020)
prospector
Inspects Python source files and provides information about the type and location of classes, methods, etc.
git-quick-stats
▁▅▆▃▅ Git quick statistics is a simple and efficient way to access various statistics in a git repository.
mastercurves
Python package for automatically superimposing data sets to create a master curve, using Gaussian process regression and maximum a posteriori estimation.
micompm
micompm: A MATLAB/Octave toolbox for multivariate independent comparison of observations - Published in JOSS (2018)
https://github.com/bayer-group/pybalance
A library for minimizing the effects of confounding covariates
https://github.com/johnkerl/miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
https://github.com/thomaswong2022/thor-public
AutoML tools for solving Time-Varying High-Dimensional Ordinal Regression Problems
https://github.com/alan-turing-institute/thermodynamicanalyticstoolkit
Sampling-based approach to analyse neural networks using TensorFlow
cognitivemodels
Cognitive Models: Computational Modeling of Cognitive Processes with Bayesian Mixed Models in Julia
appendix_potential_covid-19_test_fraud_detection
The methods and results of the publication "Potential COVID-19 test fraud detection: Findings from a pilot study comparing conventional and statistical approaches" are described in more detail in this appendix. The R-syntax for the calculation is provided, as well as a pseudo data set with which the syntax can also be tested.
epoc-aki
📈 Evaluation of the Predictive value of short-term Oliguria and minor Creatinine increases for Acute Kidney Injury in ICU
https://github.com/atharvapathak/market_basket_analysis
This project implements Market Basket Analysis (MBA), using data mining techniques to uncover relationships between products purchased together. By analyzing transaction data, we aim to provide actionable insights to optimize marketing strategies and enhance customer experience.
jointanalysisfastingpostprandialglucoseinsulin
Code of the paper "Joint analysis of fasting and postprandial plasma glucose and insulin concentrations in Venezuelan women"
https://github.com/durgeshrathod/dataframe-statistical-analyzer
The key functionalities include summary statistics calculation, percentage change computation, outlier detection, trend analysis, moving average calculation, correlation analysis, and seasonal pattern interpretation.
remotePARTS
remotePARTS: Spatiotemporal autoregression analyses for large data sets - Published in JOSS (2025)
stata-mcp
Let an LLM help you run your regression with Stata: AI-Powered Stata Code Generation & Regression Analysis.
scantec
Community System for the Evaluation of Numerical Weather and Climate Models (SCANTEC, Sistema Comunitário de Avaliação de modelos Numéricos de Tempo e Clima)
https://github.com/hosseinmoein/dataframe
C++ DataFrame for statistical, financial, and ML analysis in modern C++
data-and-design-core
Code developed by the EFDC Data and Design Core team to support mental health research.