VeridicalFlow
VeridicalFlow: a Python package for building trustworthy data science pipelines with PCS - Published in JOSS (2022)
toughio
toughio: Pre- and post-processing Python library for TOUGH - Published in JOSS (2020)
contextualspellcheck
✔️Contextual word checker for better suggestions (not actively maintained)
pyprep
PyPREP: A Python implementation of the Preprocessing Pipeline (PREP) for EEG data
boxsers
Python package that provides a full range of functionality to process and analyze vibrational spectra (Raman, SERS, FTIR, etc.).
english-text-normalization
Command-line interface (CLI) and library to normalize English texts.
cleaned-swansf-dataset
The SWAN-SF dataset is now fully preprocessed, optimized, and ready for binary classification tasks. Our team is excited to release the enhanced version of the SWAN-SF dataset across all five partitions.
semantic-outlier-removal
Code and data for SORE (ACL 2025), a semantic boilerplate remover.
ea34568e-d86e-4720-be2f-3f826f66a26c
Describes a pipeline to preprocess the NOAA gridded rainfall reanalysis dataset
portagetextprocessing
Text processing tools that came out of the Portage SMT project
qub-hri
Preprocessing repository for the QUB-Perception of Human Engagement in Assembly Operations dataset
nifreeze
A flexible framework for volume-wise artifact estimation and correction across multiple 4D neuroimaging modalities (diffusion MRI, functional MRI, and PET)
pybear
pybear is a Python computing library that augments data analytics functionality found in popular packages that use the scikit-learn API, such as scikit-learn and xgboost.
tubecleanr
(Mini) R package for preprocessing YouTube comment data collected with tuber or vosonSML