Scientific Software
Updated 6 months ago

cuallee — Peer-reviewed • Rank 18.9 • Science 98%

cuallee: A Python package for data quality checks across multiple DataFrame APIs - Published in JOSS (2024)

Scientific Software
Updated 6 months ago

daiquiri — Peer-reviewed • Rank 11.5 • Science 95%

daiquiri: Data Quality Reporting for Temporal Datasets - Published in JOSS (2022)

Updated 6 months ago

data-imputation-paper • Rank 3.5 • Science 54%

Research code for the paper "A Benchmark for Data Imputation Methods".

Updated 6 months ago

thetis • Rank 8.3 • Science 44%

Service to examine data processing pipelines (e.g., machine learning or deep learning pipelines) for uncertainty consistency (calibration), fairness, and other safety-relevant aspects.

Updated 5 months ago

eHDPrep • Rank 8.3 • Science 36%

Electronic Health Data Preparation (eHDPrep) R package

Updated 5 months ago

https://github.com/anerv/bikedna_analysis • Rank 0.7 • Science 36%

Code for analyzing the results from running BikeDNA BIG (https://github.com/anerv/BikeDNA_BIG) on bicycle infrastructure data from Denmark.

Updated 5 months ago

https://github.com/arbaznazir/datalineagepy • Rank 5.0 • Science 26%

86% faster data lineage tracking for pandas DataFrames with zero infrastructure. Real-time monitoring, ML anomaly detection, and enterprise compliance features.

Updated 5 months ago

https://github.com/whylabs/whylogs • Rank 13.4 • Science 13%

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈

Updated 6 months ago

comp-unstructured-data • Science 44%

Scripts to explore the conditions that determine the reliability of models, trends, and status by comparing aggregated cubes with structured monitoring schemes.

Updated 6 months ago

3dcap-md-gen • Science 49%

Scripts for exporting scanning metadata as described in the publication "Metadata Schema and Ontology for Archaeological Object Documentation including 3D Imaging (AOD-3DI)"

Updated 5 months ago

https://github.com/nagapv/edexplore • Science 13%

A simple widget for interactive EDA / QA. Works on top of Pandas [in Jupyter Notebook] using IPyWidgets with a sprinkle of Regex.

Updated 6 months ago

pydvl • Science 36%

pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation.

Updated 5 months ago

https://github.com/calgo-lab/tab_err • Science 26%

Fully-controlled realistic error generation for tabular data.

Updated 6 months ago

cvddqchecker • Science 44%

CvdDqChecker: A Software Solution for Explainable and Traceable Assessments of Cardiovascular Disease Data Quality

Updated 6 months ago

syndat • Science 75%

Synthetic data quality evaluation & visualization

Updated 6 months ago

rsa-unstructured-data-comp • Science 26%

Scripts that compare aggregated cubes with structured monitoring schemes in South Africa