Recent Releases of https://github.com/awslabs/deequ
https://github.com/awslabs/deequ - 2.0.12
What's Changed
Added Implementation of DQDL Rules and Execution
- add implementation of DQDL rule execution by @happy-coral in https://github.com/awslabs/deequ/pull/620
- Add implementation of outcome mapping in DeequOutcomeTranslator by @happy-coral in https://github.com/awslabs/deequ/pull/621
- Add implementation for DQDL rules: CompletenessRule, IsCompleteRule, UniquenessRule, IsUniqueRule, ColumnCorrelationRule by @happy-coral in https://github.com/awslabs/deequ/pull/622
- Add implementation for DQDL rules: DistinctValuesCount, Entropy, Mean, StandardDeviation, Sum, UniqueValueRatio by @happy-coral in https://github.com/awslabs/deequ/pull/624
- Update README to describe DQDL support and add Java & Scala DQDL examples by @happy-coral in https://github.com/awslabs/deequ/pull/634
- Add support for DQDL IsPrimaryKey rule by @happy-coral in https://github.com/awslabs/deequ/pull/635
- Add support for DQDL ColumnLength rule by @eycho-am in https://github.com/awslabs/deequ/pull/636
- Modify Histogram to be in descending frequency by @kyraman in https://github.com/awslabs/deequ/pull/630
- Introduce HistogramBase for common histogram behavior by @kyraman in https://github.com/awslabs/deequ/pull/631
- Modify maven publishing to use central portal by @eycho-am in https://github.com/awslabs/deequ/pull/633
- Add support for DQDL CustomSql rule & Deequ CustomSql check by @happy-coral in https://github.com/awslabs/deequ/pull/632
- fix(kll): Add SerDe Implementation for KLLSketch by @mdrakiburrahman in https://github.com/awslabs/deequ/pull/628
- Updated version in pom.xml to 2.0.12-spark-3.5 by @eycho-am in https://github.com/awslabs/deequ/pull/637
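To try this release from an sbt build, a dependency declaration might look like the sketch below. The version string `2.0.12-spark-3.5` is taken from the pom.xml update listed above. The `com.amazon.deequ` group and `deequ` artifact name, and the choice of sbt over Maven, are assumptions for illustration and should be checked against Maven Central.

```scala
// build.sbt (sketch, not the project's official instructions)
// Version string comes from the pom.xml change in this release (pull/637).
// Group and artifact IDs are assumed; plain % is used because the artifact
// name is assumed to carry no Scala binary suffix.
libraryDependencies += "com.amazon.deequ" % "deequ" % "2.0.12-spark-3.5"
```

The `-spark-3.5` suffix suggests the build targets Spark 3.5, so the Spark version on the classpath should presumably match.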
New Contributors
- @kyraman made their first contribution in https://github.com/awslabs/deequ/pull/630
- @mdrakiburrahman made their first contribution in https://github.com/awslabs/deequ/pull/628
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.11...2.0.12
- Scala
Published by eycho-am 6 months ago
https://github.com/awslabs/deequ - 2.0.11
What's Changed
- Add AnalyzerOptions to Analyzer serialize / deserialize logic by @kchaturvedi in https://github.com/awslabs/deequ/pull/597
- Refine row count retrieval to skip redundant Size() scans by @lawofcycles in https://github.com/awslabs/deequ/pull/605
- Updated version in pom.xml to 2.0.11-spark-3.5 by @eycho-am in https://github.com/awslabs/deequ/pull/615
New Contributors
- @kchaturvedi made their first contribution in https://github.com/awslabs/deequ/pull/597
- @lawofcycles made their first contribution in https://github.com/awslabs/deequ/pull/605
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.10...2.0.11
Published by eycho-am 6 months ago
https://github.com/awslabs/deequ - 2.0.10
New Features
- Are unique check by @eycho-am in https://github.com/awslabs/deequ/pull/599
- add DQDL parser dependency by @happy-coral in https://github.com/awslabs/deequ/pull/603
- scaffolding for checking data quality against DQDL rulesets by @happy-coral in https://github.com/awslabs/deequ/pull/604
- Implement translation of rules and add converter for RowCount rule by @happy-coral in https://github.com/awslabs/deequ/pull/606
Maintenance / Fixes
- feature/replace-rdd by @shriyavanvari in https://github.com/awslabs/deequ/pull/586
- Adds a test to verify that Deequ's isContainedIn constraint correctly handles string values containing single quotes in the verification process. by @D-Minor in https://github.com/awslabs/deequ/pull/602
New Contributors
- @shriyavanvari made their first contribution in https://github.com/awslabs/deequ/pull/586
- @D-Minor made their first contribution in https://github.com/awslabs/deequ/pull/602
- @happy-coral made their first contribution in https://github.com/awslabs/deequ/pull/603
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.9...2.0.10
Published by eycho-am 10 months ago
https://github.com/awslabs/deequ - 2.0.9
Maintenance / Fixes
- Fix row level bug when composing outcome https://github.com/awslabs/deequ/pull/594
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.8...2.0.9
Published by eycho-am 10 months ago
https://github.com/awslabs/deequ - 2.0.8
New Features
- Configurable RetainCompletenessRule by @zeotuan in https://github.com/awslabs/deequ/pull/564
- Optional specification of instance name in CustomSQL analyzer metric. by @tylermcdaniel0 in https://github.com/awslabs/deequ/pull/569
- Adding Wilson Score Confidence Interval Strategy by @zeotuan in https://github.com/awslabs/deequ/pull/567
- CustomAggregator by @joshuazexter in https://github.com/awslabs/deequ/pull/572
- Add commits from master branch to release/2.0.8-spark-3.5 by @eycho-am in https://github.com/awslabs/deequ/pull/587
Maintenance / Fixes
- fix typo by @bojackli in https://github.com/awslabs/deequ/pull/574
- Fix performance of building row-level results by @marcantony in https://github.com/awslabs/deequ/pull/577
New Contributors
- @joshuazexter made their first contribution in https://github.com/awslabs/deequ/pull/572
- @bojackli made their first contribution in https://github.com/awslabs/deequ/pull/574
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.7...2.0.8
Published by eycho-am 10 months ago
https://github.com/awslabs/deequ - 2.0.7
What's Changed
Upgrades
- Add Spark 3.5 support by @jhchee in https://github.com/awslabs/deequ/pull/514
New Features
- New type of MetricsRepository by @VenkataKarthikP:
  - Using Spark tables as the data source in https://github.com/awslabs/deequ/pull/518
- Row Level Result Treatment Options by @eycho-am:
  - Uniqueness and Completeness in https://github.com/awslabs/deequ/pull/532
  - Minimum and Maximum in https://github.com/awslabs/deequ/pull/535
- Anomaly Detection Changes by @zeotuan:
  - Add Daily Season with Hourly Interval to HoltWinter in https://github.com/awslabs/deequ/pull/546
- New analyzers:
  - RatioOfSums by @scott-gunn in https://github.com/awslabs/deequ/pull/552
  - Column Count Analyzer and Check by @mentekid in https://github.com/awslabs/deequ/pull/555
Maintenance/Fixes
- Fix Breeze dependency conflict in Anomaly Detection Spark 3.4+ by @zeotuan in https://github.com/awslabs/deequ/pull/545
- Data Sync / DatasetMatch changes by @VenkataKarthikP:
  - add data synchronization test to verification Suite in https://github.com/awslabs/deequ/pull/526
  - support col match and change to DatasetMatch in https://github.com/awslabs/deequ/pull/529
- Row level results fixes:
  - Add analyzerOption to add filteredRowOutcome for isPrimaryKey Check by @eycho-am in https://github.com/awslabs/deequ/pull/537
  - Fix bug in MinLength and MaxLength when NullBehavior.EmptyString by @eycho-am in https://github.com/awslabs/deequ/pull/538
  - [Min/Max] Apply filtered row behavior at the row level evaluation by @rdsharma26 in https://github.com/awslabs/deequ/pull/543
  - [MinLength/MaxLength] Apply filtered row behavior at the row level evaluation by @rdsharma26 in https://github.com/awslabs/deequ/pull/547
  - Fix for satisfies row level results bug by @rdsharma26 in https://github.com/awslabs/deequ/pull/553
New Contributors
- @VenkataKarthikP made their first contribution in https://github.com/awslabs/deequ/pull/518
- @scott-gunn made their first contribution in https://github.com/awslabs/deequ/pull/552
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.6...2.0.7
Published by rdsharma26 over 1 year ago
https://github.com/awslabs/deequ - 2.0.6
What's Changed
- NEW: Exact Quantile Check
  - Creation of Exact Quantile Check by @jmilis2000 in https://github.com/awslabs/deequ/pull/512
- Data Synchronization/Matching fixes
  - Delegate to Spark for checking existence of columns in the given dataframes by @rdsharma26 in https://github.com/awslabs/deequ/pull/515
  - Verify that non key columns exist in each dataset by @rdsharma26 in https://github.com/awslabs/deequ/pull/517
- Addition of tests
  - Test that exceptions within a check's constraints do not affect other… by @tylermcdaniel0 in https://github.com/awslabs/deequ/pull/516
New Contributors
- @jmilis2000 made their first contribution in https://github.com/awslabs/deequ/pull/512
- @tylermcdaniel0 made their first contribution in https://github.com/awslabs/deequ/pull/516
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.5...2.0.6
Published by rdsharma26 over 2 years ago
https://github.com/awslabs/deequ - 2.0.5
What's Changed
- Spark 3.4 Update
  - Add Spark 3.4 support by @jhchee in https://github.com/awslabs/deequ/pull/505
  - Update minor version for Spark 3.4 maven release by @eycho-am in https://github.com/awslabs/deequ/pull/513
- NEW: Custom SQL analyzer
  - Custom SQL Analyzer by @mentekid in https://github.com/awslabs/deequ/pull/509
  - Fail when CustomSql has syntax errors by @mentekid in https://github.com/awslabs/deequ/pull/510
  - Fix CustomSQL test syntax by @eycho-am in https://github.com/awslabs/deequ/pull/511
- Analyzer Improvements
  - Allow all DQ constraints to be generated from an Analyzer by @mentekid in https://github.com/awslabs/deequ/pull/508
New Contributors
- @jhchee made their first contribution in https://github.com/awslabs/deequ/pull/505
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.4...2.0.5
Published by rdsharma26 over 2 years ago
https://github.com/awslabs/deequ - 2.0.4
What's Changed
- Row-Level Results:
  - MinLength by @eycho-am in https://github.com/awslabs/deequ/pull/465
  - Uniqueness by @eycho-am in https://github.com/awslabs/deequ/pull/471
  - ColumnValues by @zixianzh1 in https://github.com/awslabs/deequ/pull/476
  - ReferentialIntegrity by @rdsharma26 in https://github.com/awslabs/deequ/pull/466
  - [Experimental] DataSynchronization by @rdsharma26 in https://github.com/awslabs/deequ/pull/473
- Referential Integrity:
  - Updated Referential Integrity to support multiple columns by @rdsharma26 in https://github.com/awslabs/deequ/pull/463
- Constraints and Condition Changes:
  - Add population stability index (PSI) to distance methods by @bevhanno in https://github.com/awslabs/deequ/pull/480
  - Fix chi-square test conditions by @bevhanno in https://github.com/awslabs/deequ/pull/482
  - Missing Column Precondition for Compliance Check - issue fix 467 by @samarth-c1 in https://github.com/awslabs/deequ/pull/478
  - Addition of HasMax/HasMin/HasStandardDeviation/HasMean constraint suggestions by @rdsharma26 in https://github.com/awslabs/deequ/pull/489
  - Alternative aggregate functions to calculate histogram values. by @akalotkin in https://github.com/awslabs/deequ/pull/475
New Contributors
- @zixianzh1 made their first contribution in https://github.com/awslabs/deequ/pull/476
- @samarth-c1 made their first contribution in https://github.com/awslabs/deequ/pull/478
- @akalotkin made their first contribution in https://github.com/awslabs/deequ/pull/475
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.3...2.0.4
Published by eycho-am over 2 years ago
https://github.com/awslabs/deequ - 2.0.3
What's Changed
- Adding chi-square distance method for categorical variables by @bevhanno in https://github.com/awslabs/deequ/pull/444
- [WIP] Row Level Results by @mentekid in https://github.com/awslabs/deequ/pull/451
- [Experimental] Addition of dataset comparison utilities by @rdsharma26 in https://github.com/awslabs/deequ/pull/449
New Contributors
- @rdsharma26 made their first contribution in https://github.com/awslabs/deequ/pull/447
- @bevhanno made their first contribution in https://github.com/awslabs/deequ/pull/444
- @mentekid made their first contribution in https://github.com/awslabs/deequ/pull/451
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.2...2.0.3
Published by eycho-am almost 3 years ago
https://github.com/awslabs/deequ - 2.0.2
Adds Spark 3.3 compatibility.
What's Changed
- Upgrade to Spark 3.3.0 by @eycho-am in https://github.com/awslabs/deequ/pull/442
New Contributors
- @eycho-am made their first contribution in https://github.com/awslabs/deequ/pull/442
Full Changelog: https://github.com/awslabs/deequ/compare/2.0.1...2.0.2
Published by shehzad-qureshi about 3 years ago
https://github.com/awslabs/deequ - 2.0.1
Adds Spark 3.2 compatibility.
Published by TammoR about 4 years ago
https://github.com/awslabs/deequ - 2.0.0
Add Spark 3.1 compatibility.
Note: this version is no longer compatible with Spark <=3.0. Use previous versions and branch legacy-spark-3.0 instead.
Published by lange-labs over 4 years ago
https://github.com/awslabs/deequ - Fix build setup to make artefact importable with maven/sbt
This release updates the build setup (i.e. the pom.xml and the publishing process) so that the artefacts published to maven can now be imported using maven or sbt. There are four branches associated with this new release:
- for spark 2.2: https://github.com/awslabs/deequ/tree/release/1.2.2-spark-2.2
- for spark 2.3: https://github.com/awslabs/deequ/tree/release/1.2.2-spark-2.3
- for spark 2.4: https://github.com/awslabs/deequ/tree/release/1.2.2-spark-2.4
- for spark 2.5: https://github.com/awslabs/deequ/tree/release/1.2.2-spark-2.5
Published by twollnik almost 5 years ago
https://github.com/awslabs/deequ - 1.1.0
Changes to the build setup to support Spark 2.2.x to 2.4.x and 3.0.x. There now is one maven release available per Spark version:
- spark-3.0-scala-2.12
- spark-2.4-scala-2.11
- spark-2.3-scala-2.11
- spark-2.2-scala-2.11
Published by twollnik about 5 years ago
https://github.com/awslabs/deequ - 1.0.4
Correct version in pom.xml
Published by tdhd over 5 years ago
https://github.com/awslabs/deequ - 1.0.3
- Histogram metrics backwards compatibility
- support for Spark SQL case sensitivity
- several bug fixes
- added documentation
Published by tdhd over 5 years ago
https://github.com/awslabs/deequ - 1.0.1
- Spark 2.4 compatibility
Published by iamsteps almost 7 years ago
https://github.com/awslabs/deequ - 1.0.0-rc5
- Check-applicability result now contains all constraints and their applicabilities
- Include metric in ConstraintResult https://github.com/awslabs/deequ/pull/76
Published by tdhd over 7 years ago
https://github.com/awslabs/deequ - 1.0.0-rc4
- Anomaly detection with seasonal Holt Winters method
- Check applicability support for additional data types
Published by tdhd over 7 years ago
https://github.com/awslabs/deequ - 1.0.0-rc3
Column profiling handles boolean histograms correctly
Published by tdhd over 7 years ago
https://github.com/awslabs/deequ - 1.0.0-rc2
Spark 2.3 compatibility
Published by sscdotopen over 7 years ago
https://github.com/awslabs/deequ - 1.0.0-rc1
A few additional convenience functions for our API.
Published by sscdotopen over 7 years ago
https://github.com/awslabs/deequ - 1.0.0-RC0
Release candidate for deequ 1.0.
Published by sscdotopen over 7 years ago
https://github.com/awslabs/deequ -
Test release for validating maven publishing. DO NOT USE IN PRODUCTION.
Published by sscdotopen over 7 years ago