exercise 8: why more experiment causes the get_logs to not return? #61

Open
arisliang opened this issue May 12, 2018 · 2 comments

@arisliang
Contributor

arisliang commented May 12, 2018

The exercise suggests: "You can also try running more of the experiment tasks and see what happens". However, after increasing the number of experiment runs, get_logs no longer returns results immediately.

Why does this happen? Is it related to the number of CPUs, i.e. the number of experiments can't be more than the number of CPUs minus 1?

@arisliang arisliang changed the title exercise 8: more experiment causes the get_logs to not return exercise 8: why more experiment causes the get_logs to not return? May 12, 2018
@robertnishihara
Contributor

Yes, I think that's what's happening. If all of the CPUs are running run_experiment tasks, then the actor method task never gets scheduled. However, you can give a "dedicated core" to the actor by changing the decorator to @ray.remote(num_cpus=1), in which case the actor will always own one core even if it is not doing anything.
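
To illustrate the slot-based reasoning above without depending on Ray itself, here is a minimal sketch that uses a standard-library thread pool as a stand-in for the pool of CPU slots. It is only an analogy under stated assumptions, not Ray's scheduler: the helper time_get_logs, the constant EXPERIMENT_SECS, and the bodies of run_experiment and get_logs are hypothetical, and the "dedicated core" is modelled by setting one slot aside for the actor.

```python
import time
from concurrent.futures import ThreadPoolExecutor

EXPERIMENT_SECS = 0.3  # hypothetical experiment duration, chosen for the demo


def run_experiment():
    # Sleeping uses almost no CPU, but the task still holds its slot.
    time.sleep(EXPERIMENT_SECS)


def get_logs():
    # Stand-in for the actor method from the exercise.
    return "logs"


def time_get_logs(num_slots, num_experiments, dedicated_actor_slot):
    """Return how many seconds get_logs takes to come back.

    num_slots plays the role of the number of CPUs. With
    dedicated_actor_slot=True, one slot is set aside for the actor
    (analogous to num_cpus=1 on the actor) and experiments share the rest.
    """
    experiment_slots = num_slots - 1 if dedicated_actor_slot else num_slots
    with ThreadPoolExecutor(max_workers=experiment_slots) as shared:
        # The actor either competes for the shared slots or has its own.
        actor_pool = (
            ThreadPoolExecutor(max_workers=1) if dedicated_actor_slot else shared
        )
        for _ in range(num_experiments):
            shared.submit(run_experiment)
        start = time.monotonic()
        actor_pool.submit(get_logs).result()
        waited = time.monotonic() - start
        if dedicated_actor_slot:
            actor_pool.shutdown()
    return waited


if __name__ == "__main__":
    print("shared slots, 2 experiments:", time_get_logs(2, 2, False))
    print("shared slots, 1 experiment: ", time_get_logs(2, 1, False))
    print("dedicated actor slot:       ", time_get_logs(2, 2, True))
```

In this sketch, get_logs waits for roughly one experiment duration when every slot is held by a sleeping experiment, and returns right away when a slot is free or reserved for the actor. With the reserved slot, the extra experiments simply queue for the remaining slot and run one after another.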

@arisliang
Contributor Author

arisliang commented May 13, 2018

While the experiments are running, CPU utilization is not at 100%, which makes sense: the only thing run_experiment does is sleep, which shouldn't keep the CPU busy.

  • So why does the actor method task still not get scheduled while the CPU isn't busy?
  • Does it mean that each run_experiment task also "owns one core" even if it is not doing anything, in the same way you described for the actor?
  • And if the actor does own one core, the number of run_experiment tasks (3) would now exceed the available cores (2). Does that mean some of the run_experiment tasks never get scheduled, as happened with the actor previously?
  • Does the driver process also "own one core"?
