v1.8.0 - 2023-12-05

@amontanez24 released this 05 Dec 18:42

This release adds support for the new Diagnostic Report from SDMetrics. The report calculates scores for three basic but important properties of your data: data validity, data structure and, in the multi-table case, relationship validity. Data validity checks that the columns of your data are valid (e.g. values fall in the correct range or set of categories). Data structure checks that the synthetic data has the correct columns. Relationship validity checks that key references are correct and that the cardinality is within the ranges seen in the real data.
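The three properties can be sketched in plain Python as follows. This is a simplified illustration only, not the SDMetrics implementation: the function names, the input lists and the scoring formulas (fractions between 0 and 1) are assumptions made for the example.

```python
def structure_score(real_columns, synthetic_columns):
    """Share of column names common to both tables (1.0 means identical sets)."""
    real, synth = set(real_columns), set(synthetic_columns)
    union = real | synth
    return len(real & synth) / len(union) if union else 1.0


def range_validity_score(real_values, synthetic_values):
    """Share of synthetic values inside the [min, max] range of the real data."""
    if not synthetic_values:
        return 1.0
    low, high = min(real_values), max(real_values)
    inside = sum(1 for value in synthetic_values if low <= value <= high)
    return inside / len(synthetic_values)


def key_validity_score(parent_keys, child_foreign_keys):
    """Share of child foreign keys that reference an existing parent key."""
    if not child_foreign_keys:
        return 1.0
    known = set(parent_keys)
    valid = sum(1 for key in child_foreign_keys if key in known)
    return valid / len(child_foreign_keys)


# Hypothetical toy data, used only to exercise the three checks.
real_ages = [18, 25, 40, 63]
synthetic_ages = [20, 35, 70, 50]  # 70 falls outside the real range

print(structure_score(["id", "age"], ["id", "age"]))
print(range_validity_score(real_ages, synthetic_ages))
print(key_validity_score([1, 2, 3], [1, 1, 3, 9]))  # 9 has no parent row
```

A score below 1.0 in any of these checks points at a basic problem (out-of-range values, missing columns or dangling foreign keys) rather than a subtle statistical difference.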

Additionally, a few bugs were fixed and synthesizer functionality was improved. Loss values for the TVAESynthesizer and CTGANSynthesizer can now be accessed with the get_loss_values method. The get_parameters method is now more detailed and returns all the parameters used to make a synthesizer. The metadata can now detect some common PII sdtypes. Finally, a bug that made every parent row generated by the HMASynthesizer have at least one child row was patched, which should bring the cardinality of the synthetic data closer to that of the real data.

Maintenance

New Features

  • Allow me to access loss values for GAN-based synthesizers - Issue #1671 by @frances-h
  • Create a unified get_parameters method for all multi-table synthesizers - Issue #1674 by @frances-h
  • Set credentials key as variables - Issue #1680 by @R-Palazzo
  • Identifying PII Sdtypes in Metadata - Issue #1683 by @R-Palazzo
  • Make SDV compatible with the latest SDMetrics - Issue #1687 by @fealho
  • SingleTablePreset uses FrequencyEncoder - Issue #1695 by @fealho

Bugs Fixed

  • HMASynthesizer creates too much synthetic data (always creates a child for every parent row) - Issue #1673 by @frances-h
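The effect of the HMASynthesizer fix on cardinality can be illustrated with a small simulation. This is a sketch under stated assumptions, not the HMASynthesizer's actual modeling code: the `sample_child_counts` helper, the `force_at_least_one` flag and the list of real child counts are all hypothetical.

```python
import random


def sample_child_counts(real_counts, num_parents, force_at_least_one, seed=0):
    """Draw a child-row count for each synthetic parent from the real counts.

    With force_at_least_one=True, every parent gets at least one child,
    which mimics the behavior described in the bug report.
    """
    rng = random.Random(seed)
    counts = [rng.choice(real_counts) for _ in range(num_parents)]
    if force_at_least_one:
        counts = [max(1, count) for count in counts]
    return counts


# Hypothetical real data: two of every five parents have no children.
real_counts = [0, 0, 1, 2, 3]

buggy = sample_child_counts(real_counts, 1000, force_at_least_one=True)
fixed = sample_child_counts(real_counts, 1000, force_at_least_one=False)

# Both runs share a seed, so they differ only where a zero was raised to one.
print("child rows with forced minimum:", sum(buggy))
print("child rows without forced minimum:", sum(fixed))
```

Because childless parents are common in the toy data, forcing a minimum of one child inflates the child table, which matches the "creates too much synthetic data" symptom in the issue title.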