PyCM
PyCM: Multiclass confusion matrix library in Python - Published in JOSS (2018)
uravu
uravu: Making Bayesian modelling easy(er) - Published in JOSS (2020)
THzTools
THzTools: data analysis software for terahertz time-domain spectroscopy - Published in JOSS (2024)
PyVBMC
PyVBMC: Efficient Bayesian inference in Python - Published in JOSS (2023)
ChainoPy
ChainoPy: A Python Library for Discrete Time Markov Chain Based Stochastic Analysis - Published in JOSS (2024)
PII-Codex
PII-Codex: a Python library for PII detection, categorization, and severity assessment - Published in JOSS (2023)
spiketools
spiketools: a Python package for analyzing single-unit neural activity - Published in JOSS (2023)
nimCSO
nimCSO: A Nim package for Compositional Space Optimization - Published in JOSS (2024)
pypots
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
statsmodels
Statsmodels: statistical modeling and econometrics in Python
pygwalker
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
iris
A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
pyerrors
Error propagation and statistical analysis for Monte Carlo simulations in lattice QCD and statistical mechanics using autograd.
pydata-wrangler
Wrangle messy numerical, image, and text data into consistent well-organized formats
collapse
Advanced and Fast Data Transformation in R
chip-atlas
ChIP-Atlas: Browse and analyze all public ChIP/DNase-seq data on your browser
spectrochempy
SpectroChemPy is a framework for processing, analyzing and modeling spectroscopic data for chemistry with Python
dimensio
Multivariate Data Analysis - This is a read-only mirror of https://codeberg.org/tesselle/dimensio
pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
DIVAnd
DIVAnd performs an n-dimensional variational analysis of arbitrarily located observations
pygmmis
Gaussian mixture model for incomplete (missing or truncated) and noisy data
aspecd
Python framework for handling spectroscopic data focussing on reproducibility
cloupy
CLOUPY IS NO LONGER SUPPORTED. PLEASE SEE THE README. cloupy is a Python library for downloading, processing, and visualizing climatological data. The main goal of the library is to help its author write a BA thesis. The library is well adapted to academic work: the data sources used are reliable and the graphs are easy to modify.
pyprobables
Probabilistic data structures in Python: http://pyprobables.readthedocs.io/en/latest/index.html
interactive_data_editor
Software for interactively editing data in a graphical interface
public_open_source_data_science
A repository of open source data science projects for social good
qpcr
A Python package to analyse qPCR data for single-use or high-throughput applications
mastercurves
Python package for automatically superimposing data sets to create a master curve, using Gaussian process regression and maximum a posteriori estimation.
epiphyte
Python toolkit for working with high-dimensional neural data recorded during naturalistic, continuous stimuli @a-darcher @rachrapp
bluecloud-plankton
Spatial interpolation of plankton data using a neural network
vulpes
Vulpes: Test many classification, regression models and clustering algorithms to see which one is most suitable for your dataset
fast_dash
Turn your Python functions into interactive apps! Fast Dash is an innovative way to deploy your Python code as interactive web apps with minimal changes.
credit-card-prediction
💳 This repository focuses on building a predictive model to assess the likelihood of credit card defaults. The project includes data analysis, feature engineering, and machine learning to provide accurate default predictions.
astraeus-io
Astraeus is a general-purpose tool that manages your data using Xarray structures and reads/writes data from/to your exoplanet data reduction pipeline.
quantitext
Official repository for QuantiText applications in the .NET ecosystem.
loris
Loris: Database and Analysis application for a Drosophila Lab (or any lab)
scikit-plots
An intuitive library that seamlessly adds plotting capabilities and functionality to any model objects or outputs, compatible with tools like scikit-learn, XGBoost, TensorFlow, and more.
sus-analysis-toolkit
A web-based analysis toolkit for the System Usability Scale providing calculation, plotting, interpretation and contextualization utility.
Pore2Chip
Pore2Chip: All-in-one python tool for soil microstructure analysis and micromodel design - Published in JOSS (2025)
conezen
A Python toolkit for computational chemists to analyze and visualize conical intersection topology. It transforms raw quantum chemistry output into intuitive 3D potential energy surfaces to help predict the outcomes of photochemical reactions. Licensed under GPL-3.0.
orange-story-navigator
Add-on to the Orange3 data mining toolkit with text processing widgets from the project Navigating Stories
rotational-ksz-macsis
Repository for supplementary material from my publication on the rotational kinetic SZ effect in MACSIS
cholera-disease-analysis
This repository presents a data analysis, written in the R programming language, of cholera, a disease that has been killing people for the last two centuries.
eye-tracking-tools-influence
Replication package, supplementary materials, and analysis pipeline for our ESEM'24 study.
stacie
Stable AutoCorrelation Integral Estimator
https://github.com/benweare/em_scripts
Electron microscopy scripts not associated with any specific publications.
ai-commercial-decisionmaking
AI-Driven Large Dataset Analysis & Commercial Decision-Making: Research on predictive analytics, machine learning strategies, and real-world business applications [Python, TensorFlow, PyTorch] 🤖📊
active-time-hypothesis
Reimagining Quantum Non-Locality: Simulating Entangled Systems with the Active Time Hypothesis
extra-dimensional-muon-anomalies
Quantum Simulation of Extra-Dimensional Contributions to Muon Anomalies
emri_signal_detection
Detecting GW signals from extreme mass ratio inspirals using convolutional autoencoders
ziggy
Ziggy, a portable, scalable infrastructure for science data processing pipelines, is the child of the Transiting Exoplanet Survey Satellite (TESS) pipeline and the grandchild of the Kepler Pipeline.
azimuth
Helping AI practitioners better understand their datasets and models in text classification. From ServiceNow.
fieldstack
Reusable R notebooks, scripts, and tools for applied data work and evaluation — built for use in the field across health, gender, climate, and education programs.
data-and-design-core
Code developed by the EFDC Data and Design Core team to support mental health research.
rath
Next generation of automated exploratory data analysis and visualization platform.
2023-10-22-carpentry-social-science
Go to https://dcs-training.github.io/2023-10-22-Carpentry-Social-Science/ to follow along with the material
metaboheatmap
An R/Shiny-based app for visualizing metabolomics data through heatmaps
ramanspy
RamanSPy: An open-source Python package for integrative Raman spectroscopy data analysis
port-of-mars-analysis
R analysis pipeline for Port of Mars "Mars Madness" tournaments
sp2024-election-analysis
📊 An analysis of voting patterns in São Paulo's 2024 elections, focusing on voter behavior, absenteeism, and geographic trends.
nasdaq-data-link-mcp-os
A Nasdaq Data Link MCP (Model Context Protocol) Server
na-memd-and-memd
(Noise-Assisted) Multivariate Empirical Mode Decomposition (NA-MEMD/MEMD) algorithm in Python
asc-analysis
DH Scraping and Analyzing the ASC database with Jupyter Notebooks
james-s-eve-dimming-index-jedi
Detect and characterize coronal dimming in the Solar Dynamics Observatory Extreme Ultraviolet Variability Experiment data