TD-Error not meaningful? #13

Open
pixelgruff opened this issue Nov 13, 2015 · 1 comment

Comments

@pixelgruff

We exposed the class member td_error in the TDControlAgent in an attempt to log algorithm progress, but the values are noisy and do not converge even when the agent finds an excellent representation. Is td_error hidden for a reason? Are we missing something by trying to track it?

@chrodan
Member

chrodan commented Nov 13, 2015

The variable is the td-error with respect to the last observed transition. This quantity does not converge except for MDPs with deterministic transitions. The expectation of the td-error does converge once the policy / Q-function has converged, so you can use that as an indicator of convergence. For example, you could compute a (rolling) average of this variable and check whether it stabilizes.
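
A rough sketch of the rolling-average idea, in case it helps. This is not part of the library: RollingMean and has_settled are hypothetical helper names, the window size and tolerance are arbitrary, and the demo feeds in synthetic noise rather than real td_error values. In practice you would call update() with the agent's td_error after each learning step.

```python
from collections import deque
import random


class RollingMean:
    """Mean of the most recent `window` values (hypothetical helper)."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        # deque with maxlen drops the oldest value automatically
        self.values = deque(maxlen=window)

    def update(self, x):
        """Add one sample and return the current rolling mean."""
        self.values.append(float(x))
        return sum(self.values) / len(self.values)


def has_settled(recent_means, tol):
    """True if the given rolling means all lie within a band of width `tol`."""
    return max(recent_means) - min(recent_means) < tol


if __name__ == "__main__":
    # Synthetic stand-in for per-transition TD errors: pure noise around a
    # fixed mean, which is roughly what a converged agent on a stochastic
    # MDP would produce. Real values would come from the agent instead.
    rng = random.Random(0)
    tracker = RollingMean(window=500)
    means = []
    for step in range(5000):
        td_error = rng.gauss(0.0, 1.0)  # assumption: replace with agent.td_error
        means.append(tracker.update(td_error))

    # Individual samples stay noisy, but the rolling mean moves far less.
    print("last raw sample:", td_error)
    print("last rolling mean:", means[-1])
    print("settled over last 1000 steps:", has_settled(means[-1000:], tol=0.5))
```

A larger window gives a smoother estimate but reacts more slowly to changes in the policy, so the window and tolerance would need tuning per domain.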
