
Multi GPU support #290

Open

saswat0 opened this issue May 7, 2023 · 0 comments
Labels: feature request (Request for a new feature), pending review (This issue needs to be further reviewed, so work cannot be started)

Comments

saswat0 commented May 7, 2023

Problem Description

The current implementation doesn't consider servers with multiple GPUs. In scenarios where several cards are present, each with lower VRAM, running CTGAN throws an out-of-memory error.

The trace below is from a run triggered on a single T4 GPU (common in cloud servers). The real dataset had 26 columns and 20k rows.

OutOfMemoryError: CUDA out of memory. Tried to allocate 3.46 GiB (GPU 0; 14.76 GiB total capacity; 10.49 GiB already allocated; 621.75 MiB free; 13.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
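As the allocator message itself suggests, a single-GPU mitigation worth trying in the meantime is capping the split size of cached blocks to reduce fragmentation. This is only a sketch of that workaround; `train_ctgan.py` is a hypothetical training script, not part of the CTGAN repo:

```shell
# Suggested by the allocator message: limit cached-block splits to 128 MiB
# to reduce fragmentation before falling back to multi-GPU approaches.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# python train_ctgan.py   # hypothetical training entry point
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

This does not raise the total memory available, so it helps only when reserved memory is much larger than allocated memory, as in the trace above.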

Expected behavior

CTGAN should be able to leverage PyTorch's DataParallel module so that batches are split across the available GPUs, enabling larger batch sizes than a single card's VRAM allows.
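A minimal sketch of what this could look like, assuming CTGAN's generator is an ordinary `nn.Module` (the `Generator` class below is a stand-in for illustration, not CTGAN's actual implementation):

```python
import torch
import torch.nn as nn

# Stand-in generator; CTGAN's real generator differs, but any nn.Module
# can be wrapped in nn.DataParallel the same way.
class Generator(nn.Module):
    def __init__(self, embedding_dim=128, data_dim=26):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embedding_dim, 256),
            nn.ReLU(),
            nn.Linear(256, data_dim),
        )

    def forward(self, z):
        return self.net(z)

device = "cuda" if torch.cuda.is_available() else "cpu"
generator = Generator().to(device)

# nn.DataParallel splits each input batch across the visible GPUs and
# gathers the outputs on the primary device; with one (or zero) GPUs it
# simply runs the wrapped module unchanged.
if torch.cuda.device_count() > 1:
    generator = nn.DataParallel(generator)

z = torch.randn(512, 128, device=device)
fake = generator(z)
print(fake.shape)  # torch.Size([512, 26])
```

For multi-node or higher-throughput training, `torch.nn.parallel.DistributedDataParallel` is the approach PyTorch's own documentation recommends over `DataParallel`, at the cost of more setup (process groups, launchers).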

saswat0 added the feature request and pending review labels on May 7, 2023