TD-Error not meaningful? #13

pixelgruff · 2015-11-13T15:24:01Z

We exposed the class member td_error in the TDControlAgent in an attempt to log algorithm progress, but the values seem noisy and do not converge even when the agent finds an excellent representation. Is td_error hidden for a reason; are we missing something by trying to track it?

chrodan · 2015-11-13T15:57:59Z

The variable is the td-error with respect to the last observed transition. This quantity does not converge except for MDPs with deterministic transitions. However, the expectation of the td-error does converge once the policy / Q-function has converged, so you can use that as an indicator of convergence. For example, you could compute a (rolling) average of the variable and check that.
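A minimal sketch of the rolling-average idea, in case it helps. The class name `RollingTDError`, the `window` and `tol` parameters, and the convergence test are assumptions made up for illustration; none of them are part of the library.

```python
from collections import deque


class RollingTDError:
    """Hypothetical helper: rolling mean of the most recent td-errors."""

    def __init__(self, window=1000):
        # deque with maxlen drops the oldest value automatically when full
        self.values = deque(maxlen=window)

    def update(self, td_error):
        """Record one td-error sample and return the current rolling mean."""
        self.values.append(float(td_error))
        return sum(self.values) / len(self.values)

    def looks_converged(self, tol=1e-2):
        """Heuristic only: the window is full and the mean is close to zero."""
        if len(self.values) < self.values.maxlen:
            return False
        mean = sum(self.values) / len(self.values)
        return abs(mean) < tol
```

After each learning step you would call something like `tracker.update(agent.td_error)`, assuming `td_error` has been exposed on the agent as described above, and log the returned mean instead of the raw value. The window size and tolerance depend on the domain and would need tuning.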
