Cost for trajectory following #44

Open
Pengxiao-Gao opened this issue Apr 27, 2020 · 3 comments

Comments

@Pengxiao-Gao

Hi, I'm trying to use PILCO for path tracking for my graduation thesis, but so far the control results are not ideal.
I think they could be improved with a reward designed for trajectory following.
Do you know an easy way to do this?
Thanks a lot for the help.

Stefan

@maxvanmeer

I'm wondering the same thing. Did you ever figure this out?

@nrontsis
Owner

nrontsis commented Jun 4, 2020

The reward is calculated here, so perhaps you could modify it to add what you need?

@maxvanmeer

I think I managed to implement it in the original Matlab version. What you can do is:

  • Change the linear policy from M = Wm + b to M = Wm + b * r(t), where r(t) is the reference at the current timestep t (make sure t is passed to the function). Change the policy gradient dMdp as well: its gradient w.r.t. b used to be 1, but is now r(t). I do not believe the gradient w.r.t. the variance changes.
    Alternatively, use another parametrization, as long as it uses r(t).
  • Pass the current time t to the cost function as well, and use r(t) for the immediate reward instead of a fixed x_target.
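
The two steps above can be sketched roughly as follows. This is a minimal NumPy illustration, not the Matlab code and not this repository's API: the names `reference`, `linear_policy_mean`, `dM_db` and `tracking_cost` are hypothetical, r(t) is assumed to be a scalar sine wave, and the cost is a saturating cost evaluated at the state mean only (the expectation over the state distribution that PILCO computes is left out).

```python
import numpy as np


def reference(t):
    """Hypothetical scalar reference trajectory r(t); replace with your own path."""
    return np.sin(0.1 * t)


def linear_policy_mean(W, b, m, t):
    """Control mean M = W m + b * r(t), with m the state mean at timestep t."""
    return W @ m + b * reference(t)


def dM_db(b, t):
    """Jacobian of M w.r.t. b: r(t) * I (it was I for the policy M = W m + b)."""
    return reference(t) * np.eye(b.shape[0])


def tracking_cost(x, t, target_dim=0, width=1.0):
    """Saturating cost 1 - exp(-0.5 * e^2 / width^2) with e = x[target_dim] - r(t).

    Assumption: evaluated at a point state x, ignoring state uncertainty.
    """
    err = x[target_dim] - reference(t)
    return 1.0 - np.exp(-0.5 * err**2 / width**2)


if __name__ == "__main__":
    W = np.array([[1.0, 2.0]])   # 1 control, 2 state dimensions
    b = np.array([0.5])
    m = np.array([1.0, 1.0])
    for t in range(3):
        print(t, linear_policy_mean(W, b, m, t), tracking_cost(m, t))
```

The only difference from a fixed-target setup is that both the policy and the cost receive t and look up r(t), so the same reference drives the control offset and the immediate reward.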
