Recent Releases of sdnist

sdnist - SDNist v2.4

What's Changed

SDNist Repo:

Replaced the old sample reports with ones generated with sdnist version 2.3. Sample reports are available in the repo at sdnist/report/sample-reports.

SDNist Library:

  • Fixed an issue with deleting rows when there are unknown values.
  • Fixed an sdnist crash when the deidentified data has fewer than 3 features.
  • Fixed an issue with the pca-pair plot when the data has fewer than 5 features.

Assets from previous release - toydeidentifieddata.zip

Published by kbtriangulum over 2 years ago

sdnist - SDNist v2.3

What's Changed

Data:

  • Updated INDP feature details in data_description.json file.

SDNist Library:

  • Added explanatory text to inconsistencies and UEM (Unique Exact Matches) metric, fixed typographic errors, improved formatting in privacy section, renamed 'k-marginal breakdown' to 'worst-performing PUMA breakdown' and adjusted json structure accordingly.
  • Added 1% and 5% sampling error on the k-marginal.
  • Improved readability of propensity image.
  • Fixed feature space size in UEM.
  • Fixed deidentified data percentage in UEM metric.
  • Fixed PCA plot scaling issues.

Assets from previous release - toydeidentifieddata.zip

Published by kbtriangulum almost 3 years ago

sdnist - SDNist v2.2

What's Changed

Data:

  • Updated description of the feature PINCPDECILE in the datadictionary.json

SDNist Library:

  • Added new privacy metric unique-exact-matches to compute the number of unique target data records that exactly match records in the deidentified data.
  • Reduced report bundle size.
  • Added deidentified data labels to the generated json report.
  • Updated to show only the worst three univariates in the k-marginal breakdown section.
  • Fixes to the regression metric.
  • A few other cosmetic changes to the report UI.
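
The unique-exact-matches idea described above can be illustrated with a minimal sketch. This is not the SDNist implementation: the function name `unique_exact_matches` is hypothetical, records are plain tuples, and "unique" is assumed here to mean a record that occurs exactly once in the target data.

```python
from collections import Counter

def unique_exact_matches(target_rows, deid_rows):
    """Count target records that occur exactly once in the target data
    and also appear verbatim in the deidentified data.

    Hypothetical helper for illustration only; rows are sequences of
    feature values in the same column order for both datasets.
    """
    # Count how often each full record appears in the target data.
    counts = Counter(tuple(row) for row in target_rows)
    # Assumption: a "unique" record is one seen exactly once in the target.
    unique_target = {record for record, n in counts.items() if n == 1}
    # Distinct records present in the deidentified data.
    deid_records = {tuple(row) for row in deid_rows}
    # A match requires every feature value to be identical.
    return len(unique_target & deid_records)

target = [(1, 0), (1, 0), (2, 5), (3, 6)]
deid = [(2, 5), (2, 5), (1, 0), (9, 9)]
matches = unique_exact_matches(target, deid)
```

In this toy example only `(2, 5)` is both unique in the target and present in the deidentified data; `(1, 0)` is matched but occurs twice in the target, so it is not counted.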

Assets from previous release - toydeidentifieddata.zip

Published by kbtriangulum about 3 years ago

sdnist - SDNist v2.1.1

What's Changed

  • Updated toy data
  • Added PCA metric highlight queries.
  • Improved stability of k-marginal scoring, and added k-marginal sub-sampling baseline for each PUMA.
  • Added support for including metadata labels in the evaluation report.
  • Enhanced data validation: deidentified data features that contain out-of-bound values are not evaluated.
  • Added to the report the list of features that are not included in the evaluation due to out-of-bound values.
  • Removed incorrect child_NOC inconsistency check.
  • Changed feature ordering of pair-wise correlation charts to match data dictionary ordering.
  • Added colors to the console output for better readability.
  • A few other cosmetic changes to the report UI.
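
The out-of-bound validation mentioned in the list above could work roughly as in the following sketch. It is an illustration under stated assumptions, not the library's code: `split_features_by_validity` is a hypothetical name, the allowed values are assumed to come from a data dictionary as a mapping from feature name to a set of valid codes, and the feature names and codes in the example are illustrative.

```python
def split_features_by_validity(deid_rows, allowed_values):
    """Separate features into those safe to evaluate and those to skip.

    deid_rows: list of dicts mapping feature name -> value.
    allowed_values: dict mapping feature name -> set of valid values
    (assumed to be derived from a data dictionary).
    Returns (kept, dropped), where dropped lists every feature with at
    least one out-of-bound value.
    """
    dropped = sorted(
        feature
        for feature, allowed in allowed_values.items()
        if any(row[feature] not in allowed for row in deid_rows)
    )
    kept = [feature for feature in allowed_values if feature not in dropped]
    return kept, dropped

# Illustrative codes only; real valid values come from the data dictionary.
allowed = {"SEX": {"1", "2"}, "MSP": {"N", "1", "2"}}
rows = [{"SEX": "1", "MSP": "1"}, {"SEX": "3", "MSP": "N"}]
kept, dropped = split_features_by_validity(rows, allowed)
```

Here `SEX` contains the value `"3"`, which is outside its allowed set, so it is excluded from evaluation and reported, while `MSP` is kept.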

Published by garyhowarth about 3 years ago

sdnist - SDNist v2.0

This is the stable release of the SDNist Synthetic Report Generator and the SDNist version intended for use in the 2023 Collaborative Research cycle.

What's Changed

  • Added new metrics
  • Included multiple sample results
  • Added some more READMEs (report generator tool and sample report)
  • Added better documentation in the READMEs
  • Streamlined the repo by removing materials specific to the Temporal Map Challenge.

Published by garyhowarth over 3 years ago

sdnist - v1.4.1-b.1: Beta release

This release uses the same data resource files as release v1.4.0-b.1.

Code and Readme changes:

  • Fix synthetic data binning.
  • Fix PCA plot axes.

Published by kbtriangulum over 3 years ago

sdnist - v1.4.0-b.2: Beta release

This release uses the same resource files as release v1.4.0-b.1.

Code and Readme changes:

  • Fixed synthetic data validation.
  • Fixed apparent match metric when no matches occur.
  • Fixed sampling error computation to make sub-samples with the same size as the target data.
  • Updated propensity metric to use tree depth 6.
  • Updated pearson and kendall correlation to have a minimum upper range of 0.15 on colorbars.
  • Updated k-marginal to perform 100 permutations of 3-marginal selection.
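
As a rough illustration of the k-marginal change above, the sketch below draws 100 random 3-feature marginals and compares the target and deidentified densities on each. It is not the SDNist code: the function names are hypothetical, rows are plain dicts, non-empty datasets are assumed, and the final scaling to a 0 to 1000 score is an assumption made for the example.

```python
import random
from collections import Counter

def marginal_density(rows, cols):
    """Relative frequency of each value combination over the given columns."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    total = len(rows)
    return {key: n / total for key, n in counts.items()}

def k_marginal_score(target, deid, features, k=3, n_marginals=100, seed=0):
    """Average density difference over randomly selected k-marginals.

    Hypothetical sketch: the 0..1000 scaling is an assumption, and both
    datasets are assumed to be non-empty lists of dicts.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_marginals):
        cols = rng.sample(features, k)  # one random k-feature marginal
        t = marginal_density(target, cols)
        d = marginal_density(deid, cols)
        # Sum of absolute density differences, between 0 and 2.
        diffs.append(
            sum(abs(t.get(key, 0.0) - d.get(key, 0.0)) for key in t.keys() | d.keys())
        )
    mean_diff = sum(diffs) / len(diffs)
    # Assumed scaling: identical data -> 1000, fully disjoint data -> 0.
    return (2.0 - mean_diff) / 2.0 * 1000.0

features = ["A", "B", "C", "D"]
same = [{"A": 0, "B": 1, "C": 0, "D": 1}, {"A": 1, "B": 1, "C": 0, "D": 0}]
score_identical = k_marginal_score(same, same, features)
```

Averaging over many randomly chosen marginals reduces the dependence of the score on any single feature triple, which is consistent with the stability motivation in these notes.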

Published by kbtriangulum almost 4 years ago

sdnist - v1.4.0-b.1: Beta test release with all data resources

Published by kbtriangulum almost 4 years ago

sdnist - v1.3.0: Stable release with all data resources

This release is the v1.3.0 stable release and contains all data assets.

Data files in parquet/json and csv are included. Use pip install from the unzipped package to install sdnist. Download SDNist-data-1.3.0 and run sdnist from the unzipped directory to prevent automatic attempts to download the data from the SDNist github release 1.3.0.

SDNist-data-1.3.0.zip contains all data tables in parquet/json and csv and also geojson files for census data.

v1.3.0 Changelog: https://github.com/usnistgov/SDNist/blob/main/CHANGELOG.md

Full Changelog: https://github.com/usnistgov/SDNist/compare/v0.1.2...v1.3.0

Published by kbtriangulum about 4 years ago

sdnist - Stable release with all data resources

This release is the first stable release and contains all data assets.

Data files in parquet/json and csv are included. Use pip install from the unzipped package to install sdnist. Download SDNist-data-1.2.01 and run sdnist from the unzipped directory to prevent attempts to download the data from NIST servers.

SDNist-data-1.2.0.zip contains all data tables in parquet/json and csv.

Full Changelog: https://github.com/usnistgov/SDNist/compare/v0.1.2...v1.2.0

Published by garyhowarth over 4 years ago

sdnist - Basic testing complete - pre-public

The code in this release is undergoing active testing and is in pre-public status.

Note the data assets are still in v0.1.1

This release:

  • updates submission.py to import all necessary packages
  • adds loguru as a module requirement during setup

Full Changelog: https://github.com/usnistgov/SDNist/compare/v0.1.1...v0.1.2

Published by garyhowarth over 4 years ago

sdnist - Initial Release

This is the initial release of SDNist.

This is a fully operational Python implementation of benchmarks for data synthesizers derived from the 2020 NIST PSCR Differential Privacy Temporal Map Challenge.

This release includes:

  • SDNist: Benchmarks for data synthesizers (zip and tarball)
  • Public and test datasets
  • OpenPGP signature

Requirements to run: Python >= 3.6

Please contact gary.howarth@nist.gov with questions.

Published by garyhowarth over 4 years ago