Deployment requirements based on libtorch or ONNX #358

wangrui9720 · 2024-04-09T10:41:03Z

After training CTGAN, we want to call the model from C++ so that it can run in real time. We tried, but CTGAN can't be exported to TorchScript or similar formats, because its inputs and outputs are pandas DataFrames, while libtorch requires tensors for both. We really need a C++-based deployment method, which would make our software run more efficiently. We look forward to your proposal!

sdv-team · 2024-04-16T16:51:05Z

Hi @wangrui9720! It’s great to see your interest in the SDV ecosystem. This comment is a reminder to consult your legal team before adopting the SDV into your project, as SDV (and most of the related libraries, such as CTGAN) has a source-available BSL license.

For more information, you can read through our license FAQs (not legal advice) or our blog. For any other questions, please refer to our Support Page. You can also inquire about a commercial license to allow additional use.

srinify · 2024-05-09T16:54:57Z

Hi there @wangrui9720, do you mind sharing a bit more about your use case? A few suggestions to consider:

GaussianCopulaSynthesizer, from SDV, is an alternative model that is significantly faster than our GAN-based models like CTGAN. SDV is our batteries-included framework that sits one level above CTGAN and offers a better user experience.
To speed up CTGAN model training, note that you can often get very good synthetic data quality with fewer rows than you think. You can read more about our thinking and advice here.

wangrui9720 · 2024-05-13T01:54:48Z

This is the code I use to call the trained CTGAN model:

from ctgan import CTGAN
import pandas as pd


def load_ctgan_model():
    # Load the pickled, already-trained CTGAN model from disk.
    model_path = 'Z:/project/pkl/ctgan-test.pkl'
    ctgan = CTGAN.load(model_path)
    return ctgan


def get_welding_parameters(ctgan, NG_piece, desired_rows=500, batch_size=100):
    conditioned_data_list = []

    # Keep sampling until enough rows match the requested 'slice' value.
    while len(conditioned_data_list) < desired_rows:
        generated_data = ctgan.sample(batch_size)
        new_data = generated_data[generated_data['slice'] == NG_piece]
        conditioned_data_list.extend(new_data.values)

    conditioned_data = pd.DataFrame(conditioned_data_list, columns=generated_data.columns)

    # The last batch may overshoot, so trim to the requested size.
    if len(conditioned_data) > desired_rows:
        conditioned_data = conditioned_data.iloc[:desired_rows]

    average_welding_time = conditioned_data['time（ms）'].mean()
    average_welding_temp = conditioned_data['temp（℃）'].mean()

    return average_welding_time, average_welding_temp
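
The loop above is a form of rejection sampling: draw a batch, keep the rows that match, repeat. If the requested value is rare or never generated, that loop may run for a very long time. A minimal sketch of the same pattern with an upper bound on the number of batches follows. Here `fake_sample` is a hypothetical stand-in for `ctgan.sample` (it returns a list of dicts instead of a DataFrame), the column names `time_ms` and `temp_c` are placeholders, and `max_batches` is an assumed parameter, not part of CTGAN.

```python
import random
from statistics import mean


def fake_sample(n, rng):
    # Hypothetical stand-in for ctgan.sample(n): returns n rows as dicts.
    return [
        {
            "slice": rng.choice(["A", "B", "C"]),
            "time_ms": rng.uniform(100.0, 200.0),
            "temp_c": rng.uniform(300.0, 400.0),
        }
        for _ in range(n)
    ]


def sample_matching(sample_fn, column, value, desired_rows=500,
                    batch_size=100, max_batches=1000):
    """Collect desired_rows rows where row[column] == value, or give up."""
    kept = []
    for _ in range(max_batches):
        batch = sample_fn(batch_size)
        kept.extend(row for row in batch if row[column] == value)
        if len(kept) >= desired_rows:
            return kept[:desired_rows]
    raise RuntimeError(
        f"only {len(kept)} matching rows after {max_batches} batches"
    )


rng = random.Random(0)
rows = sample_matching(lambda n: fake_sample(n, rng), "slice", "A",
                       desired_rows=50, batch_size=20)
avg_time = mean(r["time_ms"] for r in rows)
avg_temp = mean(r["temp_c"] for r in rows)
```

CTGAN's `sample` method also accepts `condition_column` and `condition_value` arguments that bias generation toward a discrete value, which may reduce the number of rejected rows; check the documentation for your installed version before relying on it.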

To deploy the trained CTGAN model for real-time output, my only option is to call this Python code from C++. The GaussianCopulaSynthesizer you mentioned is also Python code, so I would have to call it from C++ as well in order to train and use it, right? Looking forward to your reply!
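
One possible way to call Python from C++ without reloading the model on every request is to keep a long-running Python worker process and exchange one JSON object per line over its stdin and stdout. The sketch below shows only the Python side, under stated assumptions: `handle_request`, `serve`, the request field `NG_piece`, and the response fields are hypothetical names, and `FAKE_RESULTS` stands in for the loaded model plus `get_welding_parameters`. The C++ side would spawn the process and read and write its pipes.

```python
import io
import json

# Hypothetical stand-in for the loaded synthesizer. A real worker would load
# the model once at startup and reuse it for every request.
FAKE_RESULTS = {"A": (150.0, 350.0)}


def handle_request(request):
    """Turn one decoded request into a JSON-serializable response."""
    if not isinstance(request, dict):
        return {"ok": False, "error": "request must be a JSON object"}
    piece = request.get("NG_piece")
    if piece not in FAKE_RESULTS:
        return {"ok": False, "error": f"unknown NG_piece: {piece!r}"}
    time_ms, temp_c = FAKE_RESULTS[piece]
    return {"ok": True, "time_ms": time_ms, "temp_c": temp_c}


def serve(in_stream, out_stream):
    """Read one JSON request per line and write one JSON response per line."""
    for line in in_stream:
        line = line.strip()
        if not line:
            continue
        try:
            response = handle_request(json.loads(line))
        except json.JSONDecodeError as exc:
            response = {"ok": False, "error": f"bad JSON: {exc}"}
        out_stream.write(json.dumps(response) + "\n")
        out_stream.flush()  # the caller blocks until it sees the reply


# Demo with in-memory streams. A real worker would instead call
# serve(sys.stdin, sys.stdout) and be started by the C++ program.
requests = io.StringIO('{"NG_piece": "A"}\n{"NG_piece": "Z"}\nnot json\n')
replies = io.StringIO()
serve(requests, replies)
responses = [json.loads(text) for text in replies.getvalue().splitlines()]
```

This keeps the whole Python stack in the deployment, so it addresses startup cost but not the dependency on Python itself.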

srinify · 2024-05-21T13:29:36Z

Ah, now I understand, @wangrui9720. You're correct that CTGAN and SDV don't currently support exporting just the machine learning model. The pkl file also contains a lot of Python library context, because that context is usually needed to run the synthesizer and generate synthetic data.
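
The dependence on Python context follows from how pickle works in general: a pickle stores a reference to an object's class by module and name, plus the object's state, but not the class's code. A small sketch using the standard library's `Fraction` as a stand-in for a synthesizer object (the choice of `Fraction` is only for illustration):

```python
import pickle
from fractions import Fraction

payload = pickle.dumps(Fraction(1, 3))

# The payload names the module and class it needs, but carries no code.
names_module = b"fractions" in payload
names_class = b"Fraction" in payload

# Loading works only because the 'fractions' module is importable here.
restored = pickle.loads(payload)
```

A program without a Python interpreter and the same libraries, such as a pure C++ host, has no way to resolve those references, which is why the pkl file alone is not a portable model format.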

We have a feature request issue in SDV to enable the exporting of just the model weights: sdv-dev/SDV#1970

I'll close this issue off and will add your use case over there so we can collect more examples for the team to prioritize! Thanks!

wangrui9720 added the "new" label (Label applied to new issues) · Apr 9, 2024

srinify added the "under discussion" label (Issue is currently being discussed) and removed the "new" label · May 9, 2024

srinify changed the title from "Deployment requirements based on libtorch or onnx！" to "Deployment requirements based on libtorch or ONNX" · May 9, 2024

srinify closed this as completed · May 21, 2024

srinify added the "resolution:WAI" label (The software is working as intended) and removed the "under discussion" label · May 21, 2024

srinify mentioned this issue in sdv-dev/SDV#1970, "I want to be able to use synthesizer models in a standalone way (just the model weights)" (open) · May 21, 2024
