Long initialization time when using GPU #26

Open
fkendlessly opened this issue Nov 3, 2022 · 5 comments

Comments

@fkendlessly

fkendlessly commented Nov 3, 2022

Hi @borchero,
I am using the GPU for k-means clustering and GMMs, and initialization takes much longer than on the CPU. Running the following code on an RTX 3090 takes about 4.1 seconds with the GPU, whereas the CPU takes only about 0.17 seconds. Any suggestions for solving this problem?

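# KMeans lives in pycave.clustering (pycave.bayes holds the GMM and Markov chain models)
from pycave.clustering import KMeans
import torch
import time

input_data = torch.randn(90000, 3)
start_time = time.time()
# Measures estimator construction plus the full fit, including any GPU startup
estimator = KMeans(3, trainer_params=dict(gpus=1, max_epochs=10))
estimator.fit(input_data)
end_time = time.time()
print('cost_time: %f seconds' % (end_time - start_time))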
@borchero
Owner

borchero commented Nov 3, 2022

There is likely nothing I can do to make this any faster. In general, a process needs some time to initialize the GPU.

I think you can try running torch.cuda.init() and you will probably see that this operation takes ~4 seconds.

@fkendlessly
Author

fkendlessly commented Nov 3, 2022

When I run torch.cuda.init(), it only takes 1e-5 seconds. Actually, I found that the slow initialization in the code above comes from estimator.py in the .../pycave/clustering/kmeans directory. I timed line 129 of estimator.py, self.trainer(max_epochs=num_epochs).fit(module, loader), and it took about 4 seconds.
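One way to narrow this down further is to check whether the roughly 4 seconds is paid once per process or on every fit. The following is a minimal sketch, not part of PyCave: `stopwatch` and `time_two_fits` are hypothetical helper names, and the only assumption about the estimator is that it exposes a `fit(data)` method.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def time_two_fits(make_estimator, data):
    """Fit two fresh estimators in the same process and time each fit."""
    results = {}
    for label in ("first fit", "second fit"):
        estimator = make_estimator()  # construction is excluded from the timing
        with stopwatch(label, results):
            estimator.fit(data)
    return results
```

With the code from the first comment, this could be called as `time_two_fits(lambda: KMeans(3, trainer_params=dict(gpus=1, max_epochs=10)), input_data)`. If the second fit is much faster than the first, the overhead is likely a one-time startup cost of the process. If both take about the same time, the cost is paid on every fit call.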

@borchero
Owner

borchero commented Nov 3, 2022

Can you also benchmark torch.empty(1).cuda()? I thought torch.cuda.init() was the culprit, but I'm still quite certain that the delay comes from the first interaction with the GPU (I just don't know for sure when it happens).
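A minimal sketch of such a benchmark, assuming PyTorch with CUDA support is installed. `timed` and `cuda_startup_timings` are hypothetical helper names, not PyTorch or PyCave functions. Because GPU work can be queued asynchronously, the sketch calls `torch.cuda.synchronize()` before stopping the clock. It should be run in a fresh Python process, since one-time GPU setup only happens on the first interaction.

```python
import time


def timed(fn, sync=None):
    """Return the wall-clock seconds taken by one call of fn()."""
    start = time.perf_counter()
    fn()
    if sync is not None:
        sync()  # wait for queued GPU work before stopping the clock
    return time.perf_counter() - start


def cuda_startup_timings():
    """Time the first GPU interactions; returns None if CUDA is unavailable."""
    import torch  # imported lazily so `timed` also works without PyTorch

    if not torch.cuda.is_available():
        return None
    sync = torch.cuda.synchronize
    # Dict values are evaluated top to bottom, so the order below is the call order.
    return {
        "torch.cuda.init()": timed(torch.cuda.init, sync),
        "first torch.empty(1).cuda()": timed(lambda: torch.empty(1).cuda(), sync),
        "second torch.empty(1).cuda()": timed(lambda: torch.empty(1).cuda(), sync),
    }
```

Calling `print(cuda_startup_timings())` at the top of a new script shows which of the three steps, if any, carries the startup delay. A large gap between the first and second transfer would point to one-time setup rather than the transfer itself.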

@fkendlessly
Author

torch.empty(1).cuda() takes about 0.4 milliseconds.

@borchero
Owner

borchero commented Nov 3, 2022

Mh ok, interesting. I don't think it has anything to do with PyCave but I will check again. Unfortunately, I don't have direct access to a GPU at the moment.
