
Computation time for policy optimization #39

Open
dqxajzh opened this issue Jan 3, 2020 · 3 comments

@dqxajzh commented Jan 3, 2020

I find that the computation time for policy optimization gradually increases, and the run is eventually terminated by a TensorFlow ResourceExhaustedError.

@NicolayP

My guess is that this is inherent to the nature of Gaussian processes. A GP keeps all of its training data in memory and uses it for every prediction. The longer you run PILCO, the more samples are collected, and prediction time grows accordingly. In big-O terms, exact GP inference is O(n^3) in the number of samples n, since it requires factorizing the n × n kernel matrix. There is ongoing research on reducing this cost (sparse Gaussian processes, etc.), but overall, the more samples you have, the longer the policy optimization will take.
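Not from the original thread, but a standalone numpy sketch of where that cubic cost comes from: the n × n kernel solve at the heart of exact GP inference. All names here are illustrative, and timings will vary by machine.

```python
import time
import numpy as np

def gp_solve_time(n, d=2):
    """Time the linear solve against an n x n kernel matrix,
    the O(n^3) bottleneck of exact GP inference."""
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n, d))
    # Squared-exponential kernel matrix plus a small noise jitter
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * sq_dists) + 1e-6 * np.eye(n)
    y = rng.standard_normal((n, 1))
    t0 = time.perf_counter()
    np.linalg.solve(K, y)  # the cubic-cost step
    return time.perf_counter() - t0

for n in (250, 500, 1000, 2000):
    # Expect the time to grow roughly 8x per doubling of n
    print(f"n={n}: {gp_solve_time(n):.4f}s")
```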

@nrontsis (Owner)

I think I agree with @NicolayP. You might want to use sparse Gaussian processes, which are already implemented in PILCO.

Let me know if this helps with your problem.
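A sketch of what switching to sparse GPs might look like with this repository. The `num_induced_points` argument and the `optimize_models`/`optimize_policy` methods are assumed from the repository's examples and may differ in your version; the `.npy` paths are placeholders for data you collect yourself.

```python
import numpy as np
from pilco.models import PILCO

# Hypothetical pre-collected rollout data: X holds (state, control)
# pairs, Y holds the corresponding state differences.
X = np.load("X.npy")
Y = np.load("Y.npy")

# Passing num_induced_points (assumed constructor argument; check your
# version) switches the dynamics models from full GPs to sparse GPs,
# reducing the cost from O(n^3) to roughly O(n m^2) for m inducing points.
pilco = PILCO(X, Y, num_induced_points=50)
pilco.optimize_models()   # fit the (sparse) GP dynamics models
pilco.optimize_policy()   # optimize the controller against the models
```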

@dqxajzh (Author) commented Feb 18, 2020

Thank you for your help @nrontsis @NicolayP. I will try the sparse Gaussian processes in PILCO.
