make release-tag: Merge branch 'main' into stable
amontanez24 committed Oct 13, 2023
2 parents 2d11113 + 52b5de0 commit fa1da68
Showing 40 changed files with 3,431 additions and 1,354 deletions.
6 changes: 3 additions & 3 deletions CONTRIBUTING.rst
@@ -173,17 +173,17 @@ Release Workflow
The process of releasing a new version involves several steps combining both ``git`` and
``bumpversion`` which, briefly:

1. Merge what is in ``master`` branch into ``stable`` branch.
1. Merge what is in ``main`` branch into ``stable`` branch.
2. Update the version in ``setup.cfg``, ``sdv/__init__.py`` and
``HISTORY.md`` files.
3. Create a new git tag pointing at the corresponding commit in ``stable`` branch.
4. Merge the new commit from ``stable`` into ``master``.
4. Merge the new commit from ``stable`` into ``main``.
5. Update the version in ``setup.cfg`` and ``sdv/__init__.py``
to open the next development iteration.

.. note:: Before starting the process, make sure that ``HISTORY.md`` has been updated with a new
entry that explains the changes that will be included in the new version.
Normally this is just a list of the Pull Requests that have been merged to master
Normally this is just a list of the Pull Requests that have been merged to main
since the last release.

Once this is done, run of the following commands:
33 changes: 33 additions & 0 deletions HISTORY.md
@@ -1,5 +1,38 @@
# Release Notes

## 1.5.0 - 2023-10-13

Several improvements and bug fixes were made in this release. Most notably, the metadata detection was substantially improved. Support for the 'unknown' sdtype was added, providing more flexibility in data representation. The software now attempts to intelligently detect primary keys and identify parent-child relationships in the metadata, streamlining the metadata creation process.

Additionally, issues related to conditional sampling with negative float values, the inability to update transformers for columns created by constraints, and compatibility with numpy version 1.25 and higher were addressed. The default branch was also switched from 'master' to 'main' for better development practices. Various bugs and errors, including those involving HMA and datetime format detection, were also resolved.

### New Features

* Improve metadata detection - Issue [#1515](https://github.com/sdv-dev/SDV/issues/1515) by @R-Palazzo
* Support 'unknown' sdtype - Issue [#1516](https://github.com/sdv-dev/SDV/issues/1516) by @R-Palazzo
* Detect primary keys in metadata - Issue [#1521](https://github.com/sdv-dev/SDV/issues/1521) by @frances-h
* Detect relationships in MultiTableMetadata - Issue [#1522](https://github.com/sdv-dev/SDV/issues/1522) by @frances-h
* Make function to estimate number of columns HMA produces. - Issue [#1572](https://github.com/sdv-dev/SDV/issues/1572) by @fealho
* Add wrapper for get_cardinalty_plot - Issue [#1573](https://github.com/sdv-dev/SDV/issues/1573) by @frances-h
* [Metadata detection] Add a cardinality cap when choosing between categorical vs. numerical - Issue [#1584](https://github.com/sdv-dev/SDV/issues/1584) by @pvk-developer
* [Metadata Detection] Only make primary/foreign keys sdtype `id` (leave others as `unknown`) - Issue [#1598](https://github.com/sdv-dev/SDV/issues/1598) by @amontanez24
* Check and supply a more descriptive error when trying to use `'gaussian_kde'` with HMA - Issue [#1604](https://github.com/sdv-dev/SDV/issues/1604) by @frances-h

### Bugs Fixed

* Conditional sampling with negative float values doesn't work - Issue [#1161](https://github.com/sdv-dev/SDV/issues/1161) by @fealho
* Cannot update transformers for columns that get created by constraints (`KeyError`) - Issue [#1454](https://github.com/sdv-dev/SDV/issues/1454) by @frances-h
* HMA produces KeyError for a schema with 3+ levels of depth - Issue [#1558](https://github.com/sdv-dev/SDV/issues/1558) by @fealho
* Columns consisting of only Nones are being detected as datetime - Issue [#1589](https://github.com/sdv-dev/SDV/issues/1589) by @pvk-developer
* HMASynthesizer throws an error when sampling multi table models with three levels of depths - Issue [#1600](https://github.com/sdv-dev/SDV/issues/1600) by @amontanez24
* `ValueError: Invalid distribution specification` when setting numerical_distributions on child table (HMA) - Issue [#1605](https://github.com/sdv-dev/SDV/issues/1605) by @fealho
* Bug: updating transformers in DataProcessor resets warning filters - Issue [#1618](https://github.com/sdv-dev/SDV/issues/1618) by @rwedge

### Maintenance

* Investigate how to get numpy >1.25 to pass - Issue [#1501](https://github.com/sdv-dev/SDV/issues/1501) by @rwedge
* Switch default branch from master to main - Issue [#1550](https://github.com/sdv-dev/SDV/issues/1550) by @amontanez24

## 1.4.0 - 2023-08-23

This release makes multiple improvements to the metadata. Both the single and multi table metadata classes now have a `validate_data` method. This method runs checks to validate the data against the current specifications in the metadata. The `SingleTableMetadata.visualize` is also improved. The sequence index is now shown in the same section as the sequence key. It also now shows all key and index information (eg. sequence key, primary key, sequence index) in one section.
28 changes: 14 additions & 14 deletions Makefile
@@ -187,22 +187,22 @@ publish: dist publish-confirm ## package and upload a release
twine upload dist/*

.PHONY: bumpversion-release
bumpversion-release: ## Merge master to stable and bumpversion release
bumpversion-release: ## Merge main to stable and bumpversion release
git checkout stable || git checkout -b stable
git merge --no-ff master -m"make release-tag: Merge branch 'master' into stable"
git merge --no-ff main -m"make release-tag: Merge branch 'main' into stable"
bumpversion release
git push --tags origin stable

.PHONY: bumpversion-release-test
bumpversion-release-test: ## Merge master to stable and bumpversion release
bumpversion-release-test: ## Merge main to stable and bumpversion release
git checkout stable || git checkout -b stable
git merge --no-ff master -m"make release-tag: Merge branch 'master' into stable"
git merge --no-ff main -m"make release-tag: Merge branch 'main' into stable"
bumpversion release --no-tag
@echo git push --tags origin stable

.PHONY: bumpversion-patch
bumpversion-patch: ## Merge stable to master and bumpversion patch
git checkout master
bumpversion-patch: ## Merge stable to main and bumpversion patch
git checkout main
git merge stable
bumpversion --no-tag patch
git push
@@ -221,7 +221,7 @@ bumpversion-major: ## Bump the version the next major skipping the release

.PHONY: bumpversion-revert
bumpversion-revert: ## Undo a previous bumpversion-release
git checkout master
git checkout main
git branch -D stable

CLEAN_DIR := $(shell git status --short | grep -v ??)
@@ -234,10 +234,10 @@ ifneq ($(CLEAN_DIR),)
$(error There are uncommitted changes)
endif

.PHONY: check-master
check-master: ## Check if we are in master branch
ifneq ($(CURRENT_BRANCH),master)
$(error Please make the release from master branch\n)
.PHONY: check-main
check-main: ## Check if we are in main branch
ifneq ($(CURRENT_BRANCH),main)
$(error Please make the release from main branch\n)
endif

.PHONY: check-history
@@ -247,7 +247,7 @@ ifeq ($(CHANGELOG_LINES),0)
endif

.PHONY: check-release
check-release: check-clean check-master check-history ## Check if the release can be made
check-release: check-clean check-main check-history ## Check if the release can be made
@echo "A new release can be made"

.PHONY: release
@@ -257,10 +257,10 @@ release: check-release bumpversion-release publish bumpversion-patch
release-test: check-release bumpversion-release-test publish-test bumpversion-revert

.PHONY: release-candidate
release-candidate: check-master publish bumpversion-candidate
release-candidate: check-main publish bumpversion-candidate

.PHONY: release-candidate-test
release-candidate-test: check-clean check-master publish-test
release-candidate-test: check-clean check-main publish-test

.PHONY: release-minor
release-minor: check-release bumpversion-minor release
24 changes: 12 additions & 12 deletions README.md
@@ -6,9 +6,9 @@

[![Dev Status](https://img.shields.io/badge/Dev%20Status-5%20--%20Production%2fStable-green)](https://pypi.org/search/?c=Development+Status+%3A%3A+5+-+Production%2FStable)
[![PyPi Shield](https://img.shields.io/pypi/v/SDV.svg)](https://pypi.python.org/pypi/SDV)
[![Unit Tests](https://github.com/sdv-dev/SDV/actions/workflows/unit.yml/badge.svg?branch=master)](https://github.com/sdv-dev/SDV/actions/workflows/unit.yml?query=branch%3Amaster)
[![Integration Tests](https://github.com/sdv-dev/SDV/actions/workflows/integration.yml/badge.svg?branch=master)](https://github.com/sdv-dev/SDV/actions/workflows/integration.yml?query=branch%3Amaster)
[![Coverage Status](https://codecov.io/gh/sdv-dev/SDV/branch/master/graph/badge.svg)](https://codecov.io/gh/sdv-dev/SDV)
[![Unit Tests](https://github.com/sdv-dev/SDV/actions/workflows/unit.yml/badge.svg?branch=main)](https://github.com/sdv-dev/SDV/actions/workflows/unit.yml?query=branch%3Amain)
[![Integration Tests](https://github.com/sdv-dev/SDV/actions/workflows/integration.yml/badge.svg?branch=main)](https://github.com/sdv-dev/SDV/actions/workflows/integration.yml?query=branch%3Amain)
[![Coverage Status](https://codecov.io/gh/sdv-dev/SDV/branch/main/graph/badge.svg)](https://codecov.io/gh/sdv-dev/SDV)
[![Downloads](https://static.pepy.tech/personalized-badge/sdv?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/sdv)
[![Colab](https://img.shields.io/badge/Tutorials-Try%20now!-orange?logo=googlecolab)](https://docs.sdv.dev/sdv/demos)
[![Slack](https://img.shields.io/badge/Slack-Join%20now!-36C5F0?logo=slack)](https://bit.ly/sdv-slack-invite)
@@ -17,7 +17,7 @@
<br/>
<p align="center">
<a href="https://github.com/sdv-dev/SDV">
<img align="center" width=40% src="https://github.com/sdv-dev/SDV/blob/master/docs/images/SDV-logo.png"></img>
<img align="center" width=40% src="https://github.com/sdv-dev/SDV/blob/stable/docs/images/SDV-logo.png"></img>
</a>
</p>
</div>
@@ -54,15 +54,15 @@ and define business rules in the form of logical constraints.
[Blog]: https://datacebo.com/blog
[Docs]: https://bit.ly/sdv-docs
[Repository]: https://github.com/sdv-dev/SDV
[License]: https://github.com/sdv-dev/SDV/blob/master/LICENSE
[License]: https://github.com/sdv-dev/SDV/blob/main/LICENSE
[Development Status]: https://pypi.org/search/?c=Development+Status+%3A%3A+5+-+Production%2FStable
[Slack Logo]: https://github.com/sdv-dev/SDV/blob/master/docs/images/slack.png
[Slack Logo]: https://github.com/sdv-dev/SDV/blob/stable/docs/images/slack.png
[Community]: https://bit.ly/sdv-slack-invite
[Colab Logo]: https://github.com/sdv-dev/SDV/blob/master/docs/images/google_colab.png
[Colab Logo]: https://github.com/sdv-dev/SDV/blob/stable/docs/images/google_colab.png
[Tutorials]: https://docs.sdv.dev/sdv/demos

# Install
The SDV is publicly available under the [Business Source License](https://github.com/sdv-dev/SDV/blob/master/LICENSE).
The SDV is publicly available under the [Business Source License](https://github.com/sdv-dev/SDV/blob/main/LICENSE).
Install SDV using pip or conda. We recommend using a virtual environment to avoid conflicts with
other software on your device.

@@ -86,7 +86,7 @@ real_data, metadata = download_demo(
dataset_name='fake_hotel_guests')
```

![Single Table Metadata Example](https://github.com/sdv-dev/SDV/blob/master/docs/images/Single-Table-Metadata-Example.png)
![Single Table Metadata Example](https://github.com/sdv-dev/SDV/blob/stable/docs/images/Single-Table-Metadata-Example.png)

The demo also includes **metadata**, a description of the dataset, including the data types in each
column and the primary key (`guest_email`).
@@ -154,7 +154,7 @@ fig = get_column_plot(
fig.show()
```

![Real vs. Synthetic Data](https://github.com/sdv-dev/SDV/blob/master/docs/images/Real-vs-Synthetic-Evaluation.png)
![Real vs. Synthetic Data](https://github.com/sdv-dev/SDV/blob/stable/docs/images/Real-vs-Synthetic-Evaluation.png)

# What's Next?
Using the SDV library, you can synthesize single table, multi table and sequential data. You can
@@ -192,8 +192,8 @@ If you use SDV for your research, please cite the following paper:

<div align="center">
<a href="https://datacebo.com"><picture>
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/sdv-dev/SDV/blob/master/docs/images/datacebo-logo-dark-mode.png">
<img align="center" width=40% src="https://github.com/sdv-dev/SDV/blob/master/docs/images/datacebo-logo.png"></img>
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/sdv-dev/SDV/blob/stable/docs/images/datacebo-logo-dark-mode.png">
<img align="center" width=40% src="https://github.com/sdv-dev/SDV/blob/stable/docs/images/datacebo-logo.png"></img>
</picture></a>
</div>
<br/>
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -134,7 +134,7 @@
'display_github': True,
'github_user': user,
'github_repo': project,
'github_version': 'master',
'github_version': 'main',
'conf_py_path': '/docs/',
}

6 changes: 3 additions & 3 deletions docs/developer_guides/contributing.rst
@@ -170,17 +170,17 @@ Release Workflow
The process of releasing a new version involves several steps combining both ``git`` and
``bumpversion`` which, briefly:

1. Merge what is in ``master`` branch into ``stable`` branch.
1. Merge what is in ``main`` branch into ``stable`` branch.
2. Update the version in ``setup.cfg``, ``sdv/__init__.py`` and
``HISTORY.md`` files.
3. Create a new git tag pointing at the corresponding commit in ``stable`` branch.
4. Merge the new commit from ``stable`` into ``master``.
4. Merge the new commit from ``stable`` into ``main``.
5. Update the version in ``setup.cfg`` and ``sdv/__init__.py``
to open the next development iteration.

.. note:: Before starting the process, make sure that ``HISTORY.md`` has been updated with a new
entry that explains the changes that will be included in the new version.
Normally this is just a list of the Pull Requests that have been merged to master
Normally this is just a list of the Pull Requests that have been merged to main
since the last release.

Once this is done, run of the following commands:
2 changes: 1 addition & 1 deletion docs/getting_started/install.rst
@@ -51,7 +51,7 @@ You can clone the repository and install it from source by running ``make instal
git checkout stable
make install
.. note:: The ``master`` branch of the SDV repository contains the latest development version.
.. note:: The ``main`` branch of the SDV repository contains the latest development version.
If you want to install the latest stable version, make sure not to omit the
``git checkout stable`` indicated above.

6 changes: 3 additions & 3 deletions docs/index.rst
@@ -130,13 +130,13 @@ for specific needs.
.. |PyPi Shield| image:: https://img.shields.io/pypi/v/SDV.svg
:target: https://pypi.python.org/pypi/SDV
.. |Run Tests| image:: https://github.com/sdv-dev/SDV/workflows/Run%20Tests/badge.svg
:target: https://github.com/sdv-dev/SDV/actions?query=workflow%3A%22Run+Tests%22+branch%3Amaster
.. |Coverage Status| image:: https://codecov.io/gh/sdv-dev/SDV/branch/master/graph/badge.svg
:target: https://github.com/sdv-dev/SDV/actions?query=workflow%3A%22Run+Tests%22+branch%3Amain
.. |Coverage Status| image:: https://codecov.io/gh/sdv-dev/SDV/branch/main/graph/badge.svg
:target: https://codecov.io/gh/sdv-dev/SDV
.. |Downloads| image:: https://pepy.tech/badge/sdv
:target: https://pepy.tech/project/sdv
.. |Binder| image:: https://mybinder.org/badge_logo.svg
:target: https://mybinder.org/v2/gh/sdv-dev/SDV/master?filepath=tutorials
:target: https://mybinder.org/v2/gh/sdv-dev/SDV/main?filepath=tutorials
.. |Slack| image:: https://img.shields.io/badge/Slack%20Workspace-Join%20now!-36C5F0?logo=slack
:target: https://bit.ly/sdv-slack-invite

2 changes: 1 addition & 1 deletion docs/user_guides/benchmarking/docker.rst
@@ -25,7 +25,7 @@ from DockerHub by running:
where ``<tag>`` is a qualifier for the desired version. You can use:

* ``stable`` to get the latest release
* ``latest`` (equivalent to not giving any tag at all) to get the latest development version from the master branch
* ``latest`` (equivalent to not giving any tag at all) to get the latest development version from the main branch
* ``<any other tag>`` to get the corresponding version

Run SDGym
4 changes: 2 additions & 2 deletions docs/user_guides/benchmarking/install.rst
@@ -50,13 +50,13 @@ Install for development
If you intend to modify the source code or contribute to the project you
will need to install it from the source using the
``make install-develop`` command. In this case, we recommend you to
branch from ``master`` first:
branch from ``main`` first:

.. code:: bash
git clone git@github.com:sdv-dev/SDGym
cd SDGym
git checkout master
git checkout main
git checkout -b <your-branch-name>
make install-develp
2 changes: 1 addition & 1 deletion sdv/__init__.py
@@ -6,7 +6,7 @@

__author__ = 'DataCebo, Inc.'
__email__ = 'info@sdv.dev'
__version__ = '1.4.0'
__version__ = '1.5.0.dev1'


import sys
15 changes: 12 additions & 3 deletions sdv/data_processing/data_processor.py
@@ -483,6 +483,15 @@ def _create_config(self, data, columns_created_by_constraints):
)
sdtypes[column] = 'pii'

elif sdtype == 'unknown':
transformers[column] = AnonymizedFaker(
function_name='bothify',
)
transformers[column].function_kwargs = {
'text': 'sdv-pii-?????',
'letters': '0123456789abcdefghijklmnopqrstuvwxyz'
}

elif pii:
enforce_uniqueness = bool(column in self._keys)
transformers[column] = self.create_anonymized_transformer(
@@ -528,9 +537,9 @@ def update_transformers(self, column_name_to_transformer):
"'RegexGenerator' instead."
)

warnings.filterwarnings('ignore', module='rdt')
self._hyper_transformer.update_transformers(column_name_to_transformer)
warnings.resetwarnings()
with warnings.catch_warnings():
warnings.filterwarnings('ignore', module='rdt.hyper_transformer')
self._hyper_transformer.update_transformers(column_name_to_transformer)

def _fit_hyper_transformer(self, data):
"""Create and return a new ``rdt.HyperTransformer`` instance.
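The `update_transformers` hunk above swaps a global `warnings.filterwarnings` / `warnings.resetwarnings` pair for a `warnings.catch_warnings()` block (Issue #1618). The following is a minimal standard-library sketch of why that matters, not the SDV code itself. The helpers `noisy`, `update_with_reset`, `update_with_context` and `filters_survive` are hypothetical, and the filter matches on `category` instead of the `rdt.hyper_transformer` module so the example has no dependencies.

```python
import warnings


def noisy():
    """Stand-in for a dependency call that emits a warning."""
    warnings.warn('from a dependency', UserWarning)
    return 1


def update_with_reset():
    """Old pattern: add a global filter, then wipe every filter."""
    warnings.filterwarnings('ignore', category=UserWarning)
    noisy()
    warnings.resetwarnings()  # also discards filters set by the caller


def update_with_context():
    """New pattern: the filter only lives inside the with block."""
    with warnings.catch_warnings():
        warnings.filterwarnings('ignore', category=UserWarning)
        noisy()


def filters_survive(update):
    """Return True if a caller-defined filter is still in place after update()."""
    with warnings.catch_warnings():
        # Simulate a filter configured by the user before calling the library.
        warnings.simplefilter('error', DeprecationWarning)
        before = list(warnings.filters)
        update()
        return list(warnings.filters) == before
```

Calling `filters_survive(update_with_reset)` reports that the caller's filter was lost, while `filters_survive(update_with_context)` reports that it was preserved, which is the behavior the fix is after.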
27 changes: 27 additions & 0 deletions sdv/evaluation/multi_table.py
@@ -98,3 +98,30 @@ def get_column_pair_plot(real_data, synthetic_data, metadata, table_name, column
    real_data = real_data[table_name]
    synthetic_data = synthetic_data[table_name]
    return report.get_column_pair_plot(real_data, synthetic_data, column_names, metadata)


def get_cardinality_plot(real_data, synthetic_data, child_table_name, parent_table_name,
                         child_foreign_key, metadata):
    """Get a plot of the cardinality of the parent-child relationship.

    Args:
        real_data (dict):
            The real data.
        synthetic_data (dict):
            The synthetic data.
        child_table_name (string):
            The name of the child table.
        parent_table_name (string):
            The name of the parent table.
        child_foreign_key (string):
            The name of the foreign key column in the child table.
        metadata (MultiTableMetadata):
            Metadata describing the data

    Returns:
        plotly.graph_objects._figure.Figure
    """
    metadata = metadata.to_dict()
    return report.get_cardinality_plot(
        real_data, synthetic_data, child_table_name, parent_table_name,
        child_foreign_key, metadata)
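
As a rough illustration of the quantity the new wrapper plots, assuming "cardinality" here means the number of child rows that reference each parent row: the `cardinality` helper below is hypothetical and is not how `report.get_cardinality_plot` builds its figure. It only shows the counting idea on plain Python lists.

```python
from collections import Counter


def cardinality(parent_keys, child_foreign_keys):
    """Count how many child rows point at each parent key.

    Parents with no children are reported with a count of 0.
    """
    counts = Counter(child_foreign_keys)
    return {key: counts.get(key, 0) for key in parent_keys}


# Three parents; the child table references 'a' twice and 'c' once.
example = cardinality(['a', 'b', 'c'], ['a', 'a', 'c'])
```

Comparing the distribution of these counts between real and synthetic tables is, under the assumption above, what a cardinality plot visualizes.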
