
Computation time for policy optimization #39

Open
dqxajzh opened this issue Jan 3, 2020 · 3 comments

@dqxajzh commented Jan 3, 2020

I find that the computation time for policy optimization gradually increases, and the run is eventually terminated by a TensorFlow ResourceExhaustedError.

@NicolayP

My guess is that this is inherent to the nature of Gaussian processes. A GP keeps all of its training data in memory and uses it for every prediction. The longer you run PILCO, the more samples are collected, and prediction time grows accordingly. In big-O terms, exact GP inference is O(n^3) in the number of samples n, since it requires factorizing the n × n kernel matrix. There is ongoing research on reducing this cost (sparse Gaussian processes, etc.), but overall, the more samples you have, the longer the policy optimization will take.
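Not from the original thread, but a standalone numpy sketch of where that cubic cost comes from: the n × n kernel solve at the heart of exact GP inference. All names here are illustrative, and timings will vary by machine.

```python
import time
import numpy as np

def gp_solve_time(n, d=2):
    """Time the linear solve against an n x n kernel matrix,
    the O(n^3) bottleneck of exact GP inference."""
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n, d))
    # Squared-exponential kernel matrix plus a small noise jitter
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * sq_dists) + 1e-6 * np.eye(n)
    y = rng.standard_normal((n, 1))
    t0 = time.perf_counter()
    np.linalg.solve(K, y)  # the cubic-cost step
    return time.perf_counter() - t0

for n in (250, 500, 1000, 2000):
    # Expect the time to grow roughly 8x per doubling of n
    print(f"n={n}: {gp_solve_time(n):.4f}s")
```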

@nrontsis (Owner)

I think I agree with @NicolayP. You might want to use sparse Gaussian processes, which are already implemented in PILCO.

Let me know if this helps with your problem.
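A sketch of what switching to sparse GPs might look like with this repository. The `num_induced_points` argument and the `optimize_models`/`optimize_policy` methods are assumed from the repository's examples and may differ in your version; the `.npy` paths are placeholders for data you collect yourself.

```python
import numpy as np
from pilco.models import PILCO

# Hypothetical pre-collected rollout data: X holds (state, control)
# pairs, Y holds the corresponding state differences.
X = np.load("X.npy")
Y = np.load("Y.npy")

# Passing num_induced_points (assumed constructor argument; check your
# version) switches the dynamics models from full GPs to sparse GPs,
# reducing the cost from O(n^3) to roughly O(n m^2) for m inducing points.
pilco = PILCO(X, Y, num_induced_points=50)
pilco.optimize_models()   # fit the (sparse) GP dynamics models
pilco.optimize_policy()   # optimize the controller against the models
```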

@dqxajzh (Author) commented Feb 18, 2020

Thank you for your help @nrontsis @NicolayP. I will try the sparse Gaussian processes in PILCO.
