Recent Releases of cuallee

cuallee - v0.15.2

  • Addition of is_custom support for pandas. Thank you @jkkronk

Scientific Software - Peer-reviewed - Python
Published by canimus about 1 year ago

cuallee - v0.15.1

  • Fix test cases for snowpark
  • Bump versions from dependabot

Published by canimus about 1 year ago

cuallee - v0.15.0

  • New Check.ok() method that returns true or false depending on whether all rules in the check ended with PASS status
  • Upgraded to support duckdb==1.1.1
  • Better error handling for duckdb.DuckDBPyRelation objects, instructing the user to register their relation first
  • Introduction of pre-commit on the repository as a guardrail for dev work
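
As a rough illustration of the Check.ok() semantics described above, the sketch below reduces a list of per-rule statuses to a single boolean. It does not call cuallee; the `all_rules_passed` helper and the status list are hypothetical stand-ins for the status column of a validation result.

```python
def all_rules_passed(statuses):
    """Return True only when every rule status equals "PASS"."""
    return all(status == "PASS" for status in statuses)


# Hypothetical statuses, one entry per rule in a check
print(all_rules_passed(["PASS", "PASS"]))  # True
print(all_rules_passed(["PASS", "FAIL"]))  # False
```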

Published by canimus over 1 year ago

cuallee - v0.14.1

  • Bump to python >=3.10. Thanks @runkelcorey
  • Lightweight verification of compute engines

Published by canimus over 1 year ago

cuallee - v0.14.0

  • Refactored to reduce isinstance validation
  • Reduced code complexity by 32%
  • Suggested fix for is_custom on pyspark-connect

Published by canimus over 1 year ago

cuallee - v0.13.2

  • Fixes on pyspark validation dataframe. Thanks @runkelcorey
  • Bumped versions of tooling and added pre-commit as aid to development and standards

Published by canimus over 1 year ago

cuallee - v0.13.1

  • Formatting of is_custom validation. Thanks @marrov for the pointers in the discussions.

Published by canimus over 1 year ago

cuallee - v0.13.0

  • For polars, is_unique now accepts an ignore_nulls flag so that nulls can be ignored during validation; it defaults to false
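
To make the effect of the flag concrete, here is a minimal pure-Python sketch of a uniqueness test with an optional null filter. It is not the polars-based implementation; the function is hypothetical, and treating repeated nulls as duplicates when the flag is off is an assumption.

```python
def is_unique(values, ignore_nulls=False):
    """Return True when no value repeats; optionally skip None entries."""
    values = list(values)
    if ignore_nulls:
        # Drop nulls before comparing, so repeated None values do not count
        values = [v for v in values if v is not None]
    return len(values) == len(set(values))


data = [1, 2, None, None]
print(is_unique(data))                     # False
print(is_unique(data, ignore_nulls=True))  # True
```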

Published by canimus over 1 year ago

cuallee - v0.12.5

  • Added check.bio.is_cds which validates codons on sequence
  • Added check.bio.is_protein support for polars and duckdb
  • Added check.bio.is_dna support for polars and duckdb
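
The kind of validation these bio checks perform can be sketched in plain Python. The rules below (DNA alphabet of A, C, G, T; a coding sequence that starts with ATG, has a length divisible by three, and ends with a stop codon) follow a common textbook definition and are an assumption, not necessarily the exact rules cuallee applies.

```python
DNA_ALPHABET = set("ACGT")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_dna(sequence):
    """True when the sequence is non-empty and uses only A, C, G, T."""
    return bool(sequence) and set(sequence.upper()) <= DNA_ALPHABET


def is_cds(sequence):
    """True when the sequence splits into codons with a start and a stop."""
    seq = sequence.upper()
    if not is_dna(seq) or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return codons[0] == "ATG" and codons[-1] in STOP_CODONS


print(is_dna("ACGU"))       # False
print(is_cds("ATGAAATAA"))  # True
```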

Published by canimus over 1 year ago

cuallee - v0.12.4

  • Fixed error in polars validation data frame when data had combined values. Thanks @vestalisvirginis

Published by canimus over 1 year ago

cuallee - v0.12.3

  • Quick fix to include MANIFEST.in for bio checks. Thank you @vestalisvirginis

Published by canimus over 1 year ago

cuallee - v0.12.2

  • Inclusion of bio checks
  • Added kwargs to add_rule on Check
  • Idea for LogicCheck to create test results combination and result dependency via propositional logic

Published by canimus over 1 year ago

cuallee - v0.12.1

  • Fixed dagster check utilities
  • Upgraded polars to v1.0.0, with breaking changes due to the deprecation of count_match in favor of count_matches
  • Mapping between Dagster AssetCheckSeverity and Cuallee CheckLevel for Warning and Error
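
A severity mapping of this kind could look roughly like the sketch below. Both enums are local stand-ins so the snippet runs without Dagster or cuallee installed; their member names and values are assumed to mirror the real classes, and `to_severity` is a hypothetical helper.

```python
from enum import Enum


class CheckLevel(Enum):
    """Stand-in for cuallee's CheckLevel (assumed members)."""
    WARNING = 0
    ERROR = 1


class AssetCheckSeverity(Enum):
    """Stand-in for Dagster's AssetCheckSeverity (assumed members)."""
    WARN = "WARN"
    ERROR = "ERROR"


# One-to-one lookup from check level to asset check severity
LEVEL_TO_SEVERITY = {
    CheckLevel.WARNING: AssetCheckSeverity.WARN,
    CheckLevel.ERROR: AssetCheckSeverity.ERROR,
}


def to_severity(level):
    """Translate a check level into the matching severity."""
    return LEVEL_TO_SEVERITY[level]


print(to_severity(CheckLevel.WARNING).name)  # WARN
```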

Published by canimus over 1 year ago

cuallee - v0.11.1

  • Added is_custom check for row-level validations on pyspark based on a custom function defined by the user
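
Conceptually, a row-level check driven by a user-defined function can be sketched as follows. This is plain Python over dictionaries rather than a pyspark DataFrame, and `count_violations` and `price_is_positive` are hypothetical names; the signature that is_custom actually expects is not shown here.

```python
def count_violations(rows, predicate):
    """Count the rows for which the user-supplied predicate is False."""
    return sum(1 for row in rows if not predicate(row))


def price_is_positive(row):
    """Example user-defined rule: the price must be above zero."""
    return row["price"] > 0


rows = [{"price": 10}, {"price": 0}, {"price": 3}]
violations = count_violations(rows, price_is_positive)
print(violations)                              # 1
print("PASS" if violations == 0 else "FAIL")   # FAIL
```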

Published by canimus over 1 year ago

cuallee - v0.11.0

  • Release promoted for zenodo

Published by canimus over 1 year ago

cuallee - v0.10.5

  • Preparation for Zenodo
  • Fix BQ unit test cases

Published by canimus over 1 year ago

cuallee - v0.10.4

  • Added is_empty rule. Thanks @minzastro
  • Bump versions from duckdb and dagster

Published by canimus over 1 year ago

cuallee - v0.10.3

  • Added an approximate flag to the is_complete implementation for pyspark, to run a comparison with pydeequ
  • Resolved JOSS issues for documentation and references against other data quality frameworks
  • Updated the test/performance folder with recent versions of all frameworks and accurate docker containers for each test

Published by canimus over 1 year ago

cuallee - v0.10.2

  • Added documentation to main classes Check and Rule
  • Changed the has_entropy implementation for pyspark to base=2, as it reflects common usage
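
For reference, Shannon entropy with a base-2 logarithm (measured in bits) can be computed as in the sketch below. This is a generic standard-library illustration, not the pyspark expression cuallee uses; the `entropy` function is hypothetical.

```python
import math
from collections import Counter


def entropy(values, base=2):
    """Shannon entropy of the value distribution, in the given log base."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(
        (count / total) * math.log(count / total, base)
        for count in counts.values()
    )


# Two equally likely values carry exactly one bit of information
print(entropy(["a", "b", "a", "b"]))  # 1.0
```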

Published by canimus over 1 year ago

cuallee - v0.10.1

  • Upgrade to duckdb==0.10.2
  • Community guidelines in README. Thanks @devarops
  • Fix pipeline with new SF account

Published by canimus over 1 year ago

cuallee - v0.10.0

  • Addition of daft data frame support. Attribution to @dsaad68 👏
  • @dsaad68 largest contribution to the project ever! 🏆
  • Thanks for covering all: test, docs and code 💯

Published by canimus almost 2 years ago

cuallee - v0.9.2

  • Removal of deprecated sum(axis=1) in polars in favor of sum_horizontal()
  • Thanks @StuffbyYuki

Published by canimus almost 2 years ago

cuallee - v0.9.1

  • Added support for spark-connect via SPARK_REMOTE environment variable
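
Detecting a Spark Connect setup from the environment might look like the sketch below. How cuallee itself reads the variable is not documented here, so the `uses_spark_connect` helper and the example URL are assumptions for illustration only.

```python
import os


def uses_spark_connect(env=None):
    """True when SPARK_REMOTE is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("SPARK_REMOTE"))


# Example with an explicit mapping instead of the real process environment
print(uses_spark_connect({"SPARK_REMOTE": "sc://localhost:15002"}))  # True
print(uses_spark_connect({}))                                        # False
```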

Published by canimus almost 2 years ago

cuallee - v0.9.0

  • Fixed an important issue when working with datasets of more than 1 billion rows, where violations were present but the status was marked as PASS
  • Inclusion of new Controls
  • Structure for PDF report added

Published by canimus almost 2 years ago

cuallee - v0.8.8

  • JOSS submission

Published by canimus almost 2 years ago

cuallee -

  • Hot fix for pyspark on reconciliation of results. It was returning only last rule

Published by canimus almost 2 years ago

cuallee - v0.8.6

  • Added percentage_fill for Control class and pyspark dataframes
  • Added percentage_empty for Control class and pyspark dataframes
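
The two metrics are complementary, as the following sketch suggests. It works on a list of dictionaries instead of a pyspark dataframe, and computing the ratio over all cells of the dataset is an assumption about how the Control methods aggregate.

```python
def percentage_fill(rows):
    """Fraction of cells that are not None across all rows and columns."""
    cells = [value for row in rows for value in row.values()]
    if not cells:
        return 0.0
    filled = sum(1 for value in cells if value is not None)
    return filled / len(cells)


def percentage_empty(rows):
    """Fraction of cells that are None; the complement of percentage_fill."""
    return 1.0 - percentage_fill(rows)


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
print(percentage_fill(rows))   # 0.75
print(percentage_empty(rows))  # 0.25
```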

Published by canimus almost 2 years ago

cuallee - v0.8.0

  • Addition of cloud module to publish test results in TBD repository
  • Ideas of starting a cloud service to publish results on a regular basis
  • Evaluation of tinybird.co and keen.io as metric collectors
  • msgpack added on [cloud] dependency to pack result payloads

NOTICE

  • CUALLEE_CLOUD_HOST is the complete url to publish the test result only, including PASS/FAIL and rows. No data transmitted anywhere, simply the output of the check.validate
  • CUALLEE_CLOUD_TOKEN environment variable pointing to the publish service. This is in BETA and disabled as there is no service listener implemented

Published by canimus almost 2 years ago

cuallee - v0.7.8

  • New check not_in. Thanks @maltzsama
  • Fixed test cases on new check

Published by canimus almost 2 years ago

cuallee - v0.7.7

  • Added has_cardinality check

Published by canimus almost 2 years ago

cuallee - v0.7.5

  • Disable debugger by default on pyspark dataframe validation
  • Reactivate snowflake account for integration tests

Published by canimus almost 2 years ago

cuallee - v0.7.4

  • Remove lxml dependency
  • Remove country list dependency handled by i18n-iso-countries library

Published by canimus almost 2 years ago

cuallee - v0.7.0

  • Added Control for grouping checks or with full dataframe scope
  • Control.completeness implemented
  • Added docs for Control

Published by canimus about 2 years ago

cuallee - v0.6.0

  • Completed test cases for polars and upgraded to version 0.19.6
  • Added t-minus functions
  • Implement has_workflow on polars correctly and without SQL context
  • Added a new alias for is_unique as is_primary_key
  • Added a new alias for are_unique as is_composite_key
  • Upgraded documentation

Published by canimus about 2 years ago

cuallee - v0.5.2

  • Fix quoting on bigquery queries

Published by canimus over 2 years ago

cuallee - v0.5.1

  • Added the is_t_minus_n core check
  • Added the is_today core check
  • Added the is_yesterday core check
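
The relationship between the three date checks can be sketched with the standard library: is_today and is_yesterday are the n=0 and n=1 cases of a t-minus-n comparison. The functions below are hypothetical plain-Python versions, and the optional `today` parameter is added only to make the example deterministic.

```python
from datetime import date, timedelta


def is_t_minus_n(values, n, today=None):
    """True when every date equals the reference day minus n days."""
    today = today or date.today()
    target = today - timedelta(days=n)
    return all(value == target for value in values)


def is_today(values, today=None):
    return is_t_minus_n(values, 0, today)


def is_yesterday(values, today=None):
    return is_t_minus_n(values, 1, today)


reference = date(2024, 1, 10)
print(is_yesterday([date(2024, 1, 9)], today=reference))     # True
print(is_t_minus_n([date(2024, 1, 7)], 3, today=reference))  # True
```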

Published by canimus over 2 years ago

cuallee - v0.5.0

  • Fixed Big Query connection in CI/CD 🥳. Thanks @vestalisvirginis
  • Reactivate SnowFlake test account for CI/CD
  • Fix Google Auth GitHub action to allow Google Cloud default credentials
  • 838 test cases! 🏆

Published by canimus over 2 years ago

cuallee - v0.4.9

  • Fixes wrong number of violations in are_complete check. Thanks @GeorgelPreput for reporting.

Published by canimus over 2 years ago

cuallee - v0.4.8

  • Added flag for case sensitive column name matching. Thanks @runkelcorey
  • Added substring verification for PySpark version. Thanks @dCodeYL

Published by canimus over 2 years ago

cuallee - v0.4.7

  • Fix polars validations

Published by canimus over 2 years ago

cuallee - v0.4.5

  • Polars dataframe validation incorporated
  • Correction of pandas and requests requirements for iso checks

Published by canimus over 2 years ago

cuallee - ISO Checks

  • Upgraded to support new pyspark 3.4.0 and pandas 2.0.1
  • Included a new module inside the Check to enable iso checks for countries and currencies

Published by canimus over 2 years ago

cuallee - 634 test cases

  • Full implementation of duckdb==0.6.0
  • Increase test coverage >90%
  • Refactor the pyproject.toml for better installation

Published by canimus about 3 years ago

cuallee - New checks

  • has_workflow check on pyspark
  • New check ideas on time-series on roadmap
  • Added new docs and test cases

Published by canimus about 3 years ago

cuallee - Duck DB + Test Coverage

  • Added DuckDB support
  • 100% test coverage
  • pyspark==3.3.1
  • pyarrow==10.0.0

Published by canimus about 3 years ago

cuallee - Snowpark + PySpark + Pandas

  • Support for snowflake-snowpark-python==0.12.0
  • Support for pyspark==3.3.0
  • Support for pandas==1.5.1
  • 200+ test cases
  • pydeequ performance test comparison
  • Added docs

Published by canimus about 3 years ago

cuallee - v0.0.3 Welcome Cuallee!

  • First round of completeness and uniqueness rules
  • Unit test cases
  • Documentation
  • README

Published by canimus over 3 years ago