can't fit with ddp_notebook on a Vertex AI Workbench instance (CUDA initialized) #19880

Open
jasonbrancazio opened this issue May 16, 2024 · 0 comments
Labels: bug (Something isn't working), needs triage (Waiting to be triaged by maintainers)

Bug description

Using this minimal code example:

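import torch
import lightning as L

# CUDA should not be initialized before the Trainer is created.
print(torch.cuda.is_initialized())
trainer = L.Trainer(
    accelerator="auto",
    strategy="ddp_notebook",
    devices="auto",
    max_epochs=1,
    # callbacks=callbacks,
    log_every_n_steps=1,
)
# Expected to still be False: constructing the Trainer should not touch CUDA.
print(torch.cuda.is_initialized())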

On Google Colab with a T4 attached, both print statements print "False" as expected.

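On a Vertex AI Workbench instance with a T4 attached, the second statement prints "True": merely instantiating the Trainer initializes CUDA. This prevents fitting with the ddp_notebook strategy, which cannot start its worker processes once CUDA has already been initialized in the notebook process.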

What could be causing this, and is there any way to work around it?
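
One way to narrow down the cause might be to record a stack trace whenever the CUDA initialization routine is called. The sketch below is only a generic helper, not part of Lightning or PyTorch: `trace_calls`, `fake_cuda`, and `calls` are hypothetical names, and the demo uses a stand-in object so it runs without a GPU.

```python
import functools
import traceback
from types import SimpleNamespace


def trace_calls(owner, attr_name, log):
    """Wrap owner.attr_name so each call appends the caller's stack to `log`."""
    original = getattr(owner, attr_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        # Capture who is calling before delegating to the real function.
        log.append("".join(traceback.format_stack(limit=15)))
        return original(*args, **kwargs)

    setattr(owner, attr_name, wrapper)
    return original  # keep a handle so the patch can be undone later


# Stand-in demo: `fake_cuda` is a hypothetical object, not torch.cuda.
fake_cuda = SimpleNamespace(_lazy_init=lambda: "initialized")
calls = []
trace_calls(fake_cuda, "_lazy_init", calls)
result = fake_cuda._lazy_init()
print(len(calls), "call(s) recorded")
```

On the Workbench instance, the same helper could presumably be applied as `trace_calls(torch.cuda, "_lazy_init", calls)` before constructing the Trainer, then `calls` printed afterwards. This assumes the initialization goes through `torch.cuda._lazy_init`, which is a private PyTorch function and may change between versions; code that imported the function directly before the patch would not be caught.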

What version are you seeing the problem on?

v2.2

How to reproduce the bug

No response

Error messages and logs

# Error messages and logs here please

Environment

Current environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):

More info

No response

jasonbrancazio added the bug and needs triage labels on May 16, 2024