cuallee
cuallee: A Python package for data quality checks across multiple DataFrame APIs - Published in JOSS (2024)
Scientific Software · Peer-reviewed
Updated 6 months ago
https://github.com/awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Updated 6 months ago
https://github.com/datafold/data-diff
Compare tables within or across databases
Updated 6 months ago
https://github.com/datacleaner/datacleaner
The premier open source Data Quality solution
Updated 6 months ago
https://github.com/autoviml/pandas_dq
Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.
Updated 6 months ago
rqssframework
The main code repository for the Referencing Quality Scoring System (RQSS) metrics. Paper: https://www.semantic-web-journal.net/system/files/swj3593.pdf