using a gpu with ray locally. #14

Open
lorrp1 opened this issue Jul 2, 2020 · 2 comments

Comments

@lorrp1

lorrp1 commented Jul 2, 2020

Hello, I'm trying to understand how to use my NVIDIA GPU locally with the "distributed" version and Ray.

torch.cuda.get_device_name(0) returns the name of my 1070
torch.cuda.is_available() returns True
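
For reference, the two checks above can be bundled into a small script. This is only a sketch: `cuda_report` is a hypothetical helper name, and it returns early if PyTorch is not installed. These calls only confirm that the current process can see the GPU; they do not show that any model or tensor is actually placed on it.

```python
import importlib.util


def cuda_report():
    """Return a small dict describing what PyTorch can see in this process."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed, so there is nothing to query.
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        # Index 0 is the first GPU visible to this process.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


print(cuda_report())
```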

I tried modifying "num_gpus" in both "mayberay" and "dist".
Also, as described in the Ray documentation, I added ray.get_gpu_ids(), which correctly returns the number set in "num_gpus" of ray.init(num_gpus ..) in mayberay.py.
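
One way to see what a Ray worker is actually given is to run a small probe task. The sketch below assumes Ray and PyTorch are installed; `visible_gpu_ids` and `probe_ray_worker` are hypothetical helper names, not part of Ray or PokerRL. To my understanding, Ray reserves the GPU for the task and sets `CUDA_VISIBLE_DEVICES` for it, but it does not move any computation onto the GPU by itself.

```python
import os


def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of id strings (empty if unset)."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def probe_ray_worker(num_gpus=1):
    """Start a local Ray instance and report what one GPU task can see.

    Ray and PyTorch are imported lazily so the helper above stays usable
    without them.
    """
    import ray
    import torch

    ray.init(num_gpus=num_gpus)

    @ray.remote(num_gpus=1)
    def probe():
        return {
            "ray_gpu_ids": ray.get_gpu_ids(),
            "cuda_visible_devices": visible_gpu_ids(),
            "torch_cuda_available": torch.cuda.is_available(),
        }

    try:
        return ray.get(probe.remote())
    finally:
        ray.shutdown()
```

Calling `probe_ray_worker()` from a script should return a dict with the ids Ray assigned. If those look right and `nvidia-smi` still shows no load, the reservation is probably fine and the question becomes where the tensors are placed.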

The program apparently works fine, but when I check with "watch -n 2 nvidia-smi" it does not seem to use the GTX at all.

I'm using the Deep CFR example, which uses Leduc with a lower number of workers.

I can't find any solution in Ray's documentation.

@lorrp1 lorrp1 changed the title using a gpu on ray locally. using a gpu with ray locally. Jul 2, 2020
@EricSteinberger
Owner

Hi!

I fear I can't help with individual Ray or GPU issues; it works on my computer, so I'm not sure how to reproduce this. Are you perhaps launching more than one worker with a GPU? That could cause the program to crash. Otherwise, I'm not sure what the issue could be.

@lorrp1
Author

lorrp1 commented Jul 3, 2020

It has never crashed; it just ran without using the GPU, even when I set "@ray.remote(num_cpus=1, num_gpus=1)". I also just noticed that I had a newer version of Ray than the one required by PokerRL[distributed],
but the result is the same with either version.
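
A possible explanation, offered as an assumption rather than a confirmed diagnosis of this repository: `num_gpus=1` in `@ray.remote` only reserves a GPU for the actor, and the model and tensors still have to be moved to a CUDA device explicitly. The sketch below illustrates that pattern with a toy actor; `make_gpu_worker_class` and `GpuWorker` are hypothetical names, and the layer sizes are arbitrary.

```python
def make_gpu_worker_class(num_gpus=1):
    """Build a toy Ray actor class that places its network on the GPU if it can.

    Ray and PyTorch are imported lazily; both must be installed to call this.
    """
    import ray
    import torch

    @ray.remote(num_cpus=1, num_gpus=num_gpus)
    class GpuWorker:
        def __init__(self):
            # Reserving a GPU via num_gpus does not move anything onto it;
            # the device has to be chosen and used explicitly.
            self.device = torch.device(
                "cuda" if torch.cuda.is_available() else "cpu"
            )
            self.net = torch.nn.Linear(8, 2).to(self.device)

        def where(self):
            # Run one forward pass and report where the output lives.
            x = torch.randn(4, 8, device=self.device)
            return str(self.net(x).device)

    return GpuWorker
```

After `ray.init(num_gpus=1)`, something like `ray.get(make_gpu_worker_class().remote().where.remote())` should report a CUDA device on a machine where the actor can see the GPU, and `cpu` otherwise.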

I checked ray-project/ray#5940, and os.listdir(proc_gpus_path) returns the right value.
I have tried changing @ray.remote(...), and also using just 1 actor and setting num_cpus to lower values for the actor, but either it crashes or it works without the GPU.

Is there any other way to use the GPU without Ray?
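
On the general question: plain PyTorch can use the GPU without Ray by placing both the model and the data on a CUDA device. A minimal sketch, independent of PokerRL (the layer sizes and the function name `train_step_on_best_device` are made up for illustration, and PyTorch is assumed to be installed):

```python
def train_step_on_best_device():
    """Run one toy training step on the GPU if available, else on the CPU."""
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Both the model parameters and the input batch must live on the device.
    model = torch.nn.Linear(8, 2).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    batch = torch.randn(4, 8, device=device)

    loss = model(batch).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    return str(device), loss.item()
```

Running this in a loop while watching `nvidia-smi` is a quick way to check that the GPU is usable outside Ray.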
