How to measure RL sample efficiency #74

Open
rickyloynd-microsoft opened this issue Jun 17, 2019 · 1 comment
@rickyloynd-microsoft

Hi,

My question relates to the RL results in Table 3 of the paper. I’m trying to use the iclr19 branch to generate at least 10 such results for each level, to get a stable mean and variance. The train_rl.py script seems to do almost everything required. But at the bottom of that file, after the success rate is calculated over the test episodes (512 by default), it is not actually logged. The mean return is logged instead.

Adding the following line (right after the calculation of success_rate) seems to log the missing number:

logger.info("Success rate {: .4f} reached after {} training episodes".format(success_rate, status['num_episodes']))

Also, it seems that the default save_interval of 1000 is too large for some of the easier levels. For instance, to get sufficiently frequent tests on GoToRedBallGrey, I call the script like this:

python scripts/train_rl.py --env BabyAI-GoToRedBallGrey-v0 --save-interval 10

Then to obtain the sample efficiency, I just look in the log for the first success rate to exceed 0.99, and take the number of training episodes up to that point. For seed=1, it happens on this line:

main: 2019-06-17 01:27:36,671: Success rate 0.9922 reached after 30769 training episodes
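The log lookup described above can be automated. What follows is only a minimal sketch, assuming the log contains lines in exactly the format produced by the logger.info call I added; the function names episodes_to_threshold and summarize are my own and are not part of BabyAI.

```python
import re
import statistics
from typing import Iterable, Optional, Sequence, Tuple

# Matches the custom log line added above, for example:
# "Success rate 0.9922 reached after 30769 training episodes"
# \s+ tolerates the extra space emitted by the "{: .4f}" format spec.
_PATTERN = re.compile(
    r"Success rate\s+([0-9]*\.?[0-9]+)\s+reached after\s+(\d+)\s+training episodes"
)


def episodes_to_threshold(lines: Iterable[str], threshold: float = 0.99) -> Optional[int]:
    """Return the episode count on the first line whose success rate exceeds `threshold`.

    Returns None if no logged success rate exceeds the threshold.
    """
    for line in lines:
        match = _PATTERN.search(line)
        if match and float(match.group(1)) > threshold:
            return int(match.group(2))
    return None


def summarize(counts: Sequence[Optional[int]]) -> Tuple[float, float]:
    """Mean and sample variance over the runs that reached the threshold.

    Runs that never reached it (None) are skipped; at least two
    successful runs are needed for the variance.
    """
    reached = [c for c in counts if c is not None]
    return statistics.mean(reached), statistics.variance(reached)
```

With one log file per seed, each file can be opened and passed directly to episodes_to_threshold, and the per-seed results collected into a list for summarize. On the seed=1 line quoted above, episodes_to_threshold returns 30769.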

Is this the right way to generate more RL results like those in Table 3? Or is there an easier way?

Thank you for this excellent environment!

@maximecb maximecb added the question Further information is requested label Jun 17, 2019
@rizar
Collaborator

rizar commented Jun 18, 2019

Thanks for your question. In fact, we use the .csv logs to compute when a 99% success rate is reached. There is a PR underway that automates the process; you can try the rl_dataeff.py script from it:

https://github.com/mila-iqia/babyai/pull/72/files#diff-96cf5347366f1902ca3259f6a094df51
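For readers who only want the general idea, here is a rough sketch of a CSV-based lookup. It is not the rl_dataeff.py script from the PR, and the column names success_rate and episodes are assumptions made for illustration; check the header of your own .csv log and pass the actual names.

```python
import csv
from typing import Iterable, Optional


def first_count_reaching(
    csv_lines: Iterable[str],
    rate_column: str = "success_rate",  # hypothetical column name
    count_column: str = "episodes",     # hypothetical column name
    threshold: float = 0.99,
) -> Optional[int]:
    """Return the count from the first CSV row whose rate is at least `threshold`.

    `csv_lines` is any iterable of text lines with a header row,
    such as an open file object. Returns None if no row qualifies.
    """
    for row in csv.DictReader(csv_lines):
        if float(row[rate_column]) >= threshold:
            return int(float(row[count_column]))
    return None
```

An open file can be passed directly, for example `with open(path, newline="") as f: first_count_reaching(f)`. Note that this sketch uses "at least 0.99" as the criterion, whereas the log-based approach in the question uses "exceeds 0.99"; pick whichever matches the definition you want to report.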

I will be back from vacation on Thursday and will try to provide more help then.
