Generating model summary #277

Open
9jaswag opened this issue Apr 5, 2023 · 7 comments
Labels
question (General question about the software), under discussion (Issue is currently being discussed)

Comments

9jaswag commented Apr 5, 2023

Environment details

If you are already running CTGAN, please indicate the following details about the environment in
which you are running it:

  • CTGAN version: SDV 1.0
  • Python version: 3.9
  • Operating System: Google Colab

Problem description

First off, great job with what has been done on the CTGAN and TVAE models. I'd like to find out if it's currently possible to generate the model summary of a trained CTGANSynthesizer or TVAESynthesizer?

What I already tried

I looked through the docs but couldn't find any mention of this.

9jaswag added the pending review (This issue needs to be further reviewed, so work cannot be started) and question (General question about the software) labels on Apr 5, 2023
Contributor

npatki commented Apr 6, 2023

Hi @9jaswag, nice to meet you and thanks for the kind words!

Curious what kind of information you'd want to see in a summary? Is there a particular usage or project you have in mind?

While there is no such summary available, you can sample synthetic data and then generate reports that compare the real vs. synthetic data. That should provide you some useful information to get started in evaluating the model.

For more information, check out our SDMetrics library. You can find reports, metrics and visualizations.
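To give a sense of what a real vs. synthetic comparison measures, here is a minimal standard-library sketch of a total variation distance complement for a single categorical column. It is a hand-rolled illustration, not the SDMetrics implementation; the function name `tv_complement` and the example values are made up for this sketch.

```python
from collections import Counter
from typing import Hashable, Sequence


def tv_complement(real: Sequence[Hashable], synthetic: Sequence[Hashable]) -> float:
    """Return 1 minus the total variation distance between category frequencies."""
    if not real or not synthetic:
        raise ValueError("both columns must be non-empty")
    real_freq = Counter(real)
    synth_freq = Counter(synthetic)
    # Compare relative frequencies over every category seen in either column.
    categories = set(real_freq) | set(synth_freq)
    tvd = 0.5 * sum(
        abs(real_freq[c] / len(real) - synth_freq[c] / len(synthetic))
        for c in categories
    )
    return 1.0 - tvd


# Hypothetical column values, for illustration only.
real_column = ["a", "a", "b", "c"]
synthetic_column = ["a", "b", "b", "c"]
score = tv_complement(real_column, synthetic_column)
print(f"TV complement: {score:.2f}")
```

A score of 1.0 means the category frequencies match exactly, and 0.0 means the two columns share no categories.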

npatki added the under discussion (Issue is currently being discussed) label and removed the pending review (This issue needs to be further reviewed, so work cannot be started) label on Apr 6, 2023
Author

9jaswag commented Apr 7, 2023

Thanks for your response @npatki. While tinkering with models in the past, I've been able to generate model summaries (e.g., with Keras model.summary()) whenever I needed them for "documentation purposes". I was hoping I'd be able to do the same for the synthesisers SDV offers.

Contributor

npatki commented Apr 7, 2023

Hi @9jaswag, I'm not as familiar with the keras library. What information would you like to see in the summary? How are you using the summaries?

Here are a few other things you can do:

  • For parametric models like GaussianCopulaSynthesizer, you can use the get_learned_parameters method to see what was learned.
  • Neural network models such as CTGAN are not parametric in this sense. A neural network's architecture and weights are not easily interpretable by humans, so I'm not sure how useful a summary of them would be.
  • For any model, you can use the save method to save all the learned values.

Author

9jaswag commented Apr 7, 2023

@npatki here's a model summary sample I got from a quick Google search.
[image: sample model summary with Layer, Output Shape, and Param # columns]

I'll check out your get_learned_parameters suggestion. Thanks!

Contributor

npatki commented Apr 7, 2023

Hi @9jaswag just curious, how are you using the Layer, Output Shape and Param # information for your project?

Author

9jaswag commented Apr 9, 2023

Mostly to write up a descriptive summary of the model, so that anyone who looks at it can get general information about the model. Doing a quick search, I noticed there's an attempt to create something similar for PyTorch.
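As a rough illustration of the kind of summary being asked for, the sketch below formats parameter names, shapes, and counts into a small table using only the standard library. The function name `summarize_parameters` and the example shapes are hypothetical; nothing here is part of SDV or CTGAN.

```python
from math import prod
from typing import Iterable, List, Tuple


def summarize_parameters(named_shapes: Iterable[Tuple[str, Tuple[int, ...]]]) -> str:
    """Format (name, shape) pairs as a table with per-tensor and total counts."""
    rows: List[Tuple[str, str, int]] = [
        (name, str(tuple(shape)), prod(shape)) for name, shape in named_shapes
    ]
    total = sum(count for _, _, count in rows)
    # Size each text column to fit its widest entry, including the header.
    name_w = max([len("Parameter")] + [len(r[0]) for r in rows])
    shape_w = max([len("Shape")] + [len(r[1]) for r in rows])
    lines = [f"{'Parameter':<{name_w}}  {'Shape':<{shape_w}}  {'Count':>10}"]
    lines += [f"{n:<{name_w}}  {s:<{shape_w}}  {c:>10,}" for n, s, c in rows]
    lines.append(f"Total parameters: {total:,}")
    return "\n".join(lines)


# Hypothetical shapes standing in for a real network's tensors.
example = [("fc1.weight", (256, 128)), ("fc1.bias", (256,)), ("out.weight", (10, 256))]
print(summarize_parameters(example))
```

With a PyTorch module, the input could be built from `named_parameters()`, for example `((name, tuple(p.shape)) for name, p in module.named_parameters())`. This thread does not say how to reach the underlying network of a CTGANSynthesizer or TVAESynthesizer, so that step is left open.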

@Deepam-Rai

@9jaswag Did it work?
