Avoid generating the conditional column #292

Open
saart opened this issue May 8, 2023 · 0 comments
Labels
  • pending review: This issue needs to be further reviewed, so work cannot be started
  • question: General question about the software

Comments

saart commented May 8, 2023

Environment details

  • CTGAN version: 0.7.1 (latest)
  • Python version: 3.10.11
  • Operating System: Mac/Unix

Problem description

I want to generate data conditionally, but I don't want to include the conditioned column in the output of the generator.

What I already tried

Currently, I just drop this column from the generated output.
Intuitively, this is wasteful: the network still has to model the column, so it is bigger (and thus slower), and the saved model is bigger too.
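
For reference, the workaround described above can be sketched roughly as follows. This is only an illustration: `sample_without_condition_column` is a hypothetical helper name, and `model` is assumed to be an already fitted `ctgan.CTGAN` whose `sample` method accepts `condition_column` and `condition_value` and returns a pandas DataFrame.

```python
def sample_without_condition_column(model, n, column, value):
    """Sample n rows conditioned on `column == value`, then drop that column.

    Hypothetical helper. `model` is assumed to be a fitted ctgan.CTGAN, or
    any object with the same sample(n, condition_column=..., condition_value=...)
    method returning a pandas DataFrame.
    """
    rows = model.sample(n, condition_column=column, condition_value=value)
    # The network still generates the conditioned column; it is only
    # discarded here, after sampling.
    return rows.drop(columns=[column])


# Usage (assumes `model` was fitted on data with a discrete "hospital" column):
# ages_only = sample_without_condition_column(model, 1000, "hospital", "Hospital A")
```

Note that the trimming happens entirely after generation, so it does nothing to reduce the network or model size, which is the concern raised here.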

Example:

Consider a dataset with two columns: hospital name and patient's age.
Let's assume that there are 100 different hospitals, and that my sole use of the generative model is to generate new rows for a given hospital.
Currently, the model will create 101 input features: 100 one-hot features (for the hospital names) and one continuous feature (for age).
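
The arithmetic behind that count, as a minimal sketch. It follows the simplified counting used in this example (one feature per continuous column); CTGAN's actual encoding of continuous columns may add extra dimensions, so treat the numbers as illustrative. `encoded_width` is a hypothetical helper, not part of CTGAN.

```python
def encoded_width(categories_per_discrete_column, n_continuous_columns):
    """Width of the encoded row under the simplified counting used above:
    one one-hot slot per category, plus one slot per continuous column."""
    return sum(categories_per_discrete_column) + n_continuous_columns


# Hospital name (100 categories) plus age (1 continuous column).
with_hospital = encoded_width([100], 1)
# Only age, if the conditioned column did not have to be generated.
without_hospital = encoded_width([], 1)

print(with_hospital, without_hospital)  # 101 1
```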
