Pseudocode for "better" policy evaluation in CEM #399

Open
dniku opened this issue Apr 29, 2020 · 0 comments

Collaborator

dniku commented Apr 29, 2020

The end of the notebook suggests evaluating the policy in a "theoretically better" way: for each initial state, sample the first action uniformly at random, then follow the current policy until the end of the episode. A user on the Coursera forum reports that pseudocode would make the idea clearer.
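
One way the idea might be sketched in Python is shown below. This is only an illustration under stated assumptions, not the notebook's code: `ChainEnv` is a hypothetical toy environment standing in for the real one, `policy` is assumed to be a table of shape `[n_states, n_actions]` holding action probabilities per state, and the function names are made up for this example.

```python
import numpy as np


class ChainEnv:
    """Hypothetical toy environment (an assumption, not from the notebook).

    States 0..4 form a chain; action 1 moves right, action 0 moves left.
    Reaching state 4 ends the episode with reward 1.0; other steps cost 0.1.
    """
    n_states = 5
    n_actions = 2

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.state = 0

    def reset(self):
        # Random non-terminal initial state.
        self.state = int(self.rng.integers(self.n_states - 1))
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def generate_session_random_first_action(env, policy, rng, t_max=100):
    """Play one episode: first action uniform, later actions from policy[state]."""
    states, actions, total_reward = [], [], 0.0
    s = env.reset()
    for t in range(t_max):
        if t == 0:
            # Uniform over all actions, ignoring the policy.
            a = int(rng.integers(env.n_actions))
        else:
            # Follow the current policy from the second step onward.
            a = int(rng.choice(env.n_actions, p=policy[s]))
        new_s, r, done = env.step(a)
        states.append(s)
        actions.append(a)
        total_reward += r
        s = new_s
        if done:
            break
    return states, actions, total_reward


def evaluate_policy(env, policy, n_sessions=100, seed=0):
    """Average total reward over sessions that start with a uniform first action."""
    rng = np.random.default_rng(seed)
    rewards = [
        generate_session_random_first_action(env, policy, rng)[2]
        for _ in range(n_sessions)
    ]
    return float(np.mean(rewards))
```

The only difference from an ordinary rollout is the `t == 0` branch, so every (initial state, first action) pair has a chance of being visited regardless of how peaked the current policy is.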
