Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Adds in analytic objectives #31

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Conversation

moorepants
Copy link
Member

Fixes issue #30.

@moorepants
Copy link
Member Author

@tvdbogert I've goofed this up but don't understand the math that I'm doing. Maybe you can tell me what I'm doing wrong.

Ok, we have a system with two states: x_1(t), x_1(t) and one input: r_1(t). We want to find the input r_1(t) that satisfies and objective. Let's say our objective, written in the continuous domain is:

This can be approximated in the discrete sense with:

If the free variables, theta_d, are:

Then the partials of J_d with respect to theta_d:

That's what I've been doing for numerical gradients so far. But I was hoping to doing the partial derivative in the continuous domain and then convert but I'm not getting the same answer. So starting again with:

And saying that theta is:

Then taking the partial of J wrt to theta:

So then I make the jump from continuous to discrete and I get:

Which is not the same!

Where am I goofing up my reasoning in the purely continuous approach? Ideally a user writes an analytic continuous objective function and then I parse it creating the numerical function and the gradient needed for the NLP problem. I'm making some dumb assumption.

There are facilities for working with "Indexed" types in SymPy so maybe I should use that instead of trying to jump from continuous to discrete domain.

@tvdbogert
Copy link
Member

Your last one is wrong. You can't take the partial derivative of J with respect to a function r(t).

The first one is correct. You first discretize the problem, so the unknowns are a finite set of variables, not continuous functions of time. Then you take partial derivatives with respect to those variables.

Maybe you can let the user define a continuous function to be integrated, and then you have to discretize it before doing the differentiation.

The way GPOPS does it, they only asks the user to write code that computes the integrand at collocation points. The integration is done behind the scenes. Derivatives of the optimization objective are either done numerically, or by applying an automatic differentiation tool to the user's code. (http://www.gpops2.com/resources/gpops2UsersGuide.pdf). In your case, you ask the user to define a function symbolically (rather than with code). To the user that looks exactly the same especially if the code is vectorized so there is no for-loop over the collocation points.

@moorepants
Copy link
Member Author

Thanks. I see my mistake now. I did all this correctly in the constraint code but blanked out here. I'm not sure I'll be able to get it all working like I want before the conferences but you'll at least be able to pass in a custom functions like the current functionality.

I think I can do this correctly by converting the continuous expression into an expression of indexed typed variables, which is supported in SymPy, so the above will turn into:

J_d = Sum(h * r[i]**2, 1, N)

and the then the derivative can be found by:

J_d.diff(r[i])

which will give the correct solution:

2 * h * r[i]

@tvdbogert
Copy link
Member

Some additional comments which may be useful.

In the 2-link swingup, the objective is specified as:

def obj(free):
F = free[num_states * num_nodes:]
return interval_value * np.sum(F**2)

It seems that the author of this code needs to know that the discrete time controls F are stored at that particular place in the "free" array, and that they are sampled on a uniform grid so integration is equivalent to h*sum when doing Euler/Trapezoidal. (actually shouldn't the first and last sample be divided by two?)

It may be more elegant if the user does not have to think about the time nodes and can specify an objective function in continuous time form. Analogous to how they already specify the equations of motion. The objective is generally an integral of a given function of x(t) and u(t), plus a terminal cost which is a function of x(tend) and u(tend). https://en.wikipedia.org/wiki/Optimal_control also has the terminal cost depend on x(0) and u(0).

So you could allow an "cost rate" and a "terminal cost" to be defined symbolically. Then opty can internally convert the integral into its discrete-time equivalent, based on the time grid that was chosen. This makes the API more general and extendable towards more sophisticated discretization methods, such as non-uniform time steps, and automatic mesh refinement. And even pseudo-spectral collocation as used in GPOPS.

Another way to accomplish this is to define an extra state variable (cost) and an extra differential equation with the integrand (cost rate) on the right hand side. Then you only need a final cost, which is a function of the final state only. Then the integration of the cost rate is done with the same (Euler, Trapezoidal, etc.) approximation as the equations of motion. This can be done internally (probably better), or you could instruct the user to add that differential equation to the dynamics, and then they can also provide the expression for cost as a function of final state.

For instance, the metabolic cost rate of muscles is a certain complicated function of muscle state variables, and it must be integrated.

What you have now works fine, this is more for long term.

@moorepants
Copy link
Member Author

It seems that the author of this code needs to know that the discrete time controls F are stored at that particular place in the "free" array

Yes, this is buried in the documentation of the functions, but would be explained in the docs.

actually shouldn't the first and last sample be divided by two?

Not sure. I may have it wrong. Please advise.

It may be more elegant if the user does not have to think about the time nodes and can specify an objective function in continuous time form.

Yes, that is the goal.

Another way to accomplish this is to define an extra state variable (cost) and an extra differential equation with the integrand (cost rate) on the right hand side.

Cool, I'll think about that. I'd love to improve this more...maybe you have a student that would want to work on it 😄

@moorepants
Copy link
Member Author

Or maybe I'll have some students soon that could work on it...but probably only undergrad.

@tvdbogert
Copy link
Member

Or a community might develop. I will certainly plant that seed when I present this, and you never know who might join the effort.

@tvdbogert
Copy link
Member

actually shouldn't the first and last sample be divided by two?

Not sure. I may have it wrong. Please advise.

An Euler integration would be exactly this summation but without either the first or the last node (depending on forward or backward Euler). But then you have one control parameter that is not counted in the cost and it may get large. Not a good idea.

A trapezoidal integration would include both the first and the last but multiplied by h/2 instead of h.

The straight summation you have is fine, because the error goes to zero as h goes to zero, so it converges to the same solution as you increase the number of nodes N.

I have always done the straight summation (like you have), but thinking about the cost as an extra state made me think, why not integrate it the same way as the equations of motion? But it's a minor issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants