Single pendulum on a cart - Identification results #4

Open · BystrickyK opened this issue Mar 9, 2021 · 2 comments
Labels: documentation (Improvements or additions to documentation)

BystrickyK (Owner) commented Mar 9, 2021

Analytical model with all physical parameters set to 1:
[image: analytical model]

State data X was generated by a numerical simulation of an analytically derived model. State derivatives dX (dot X) were computed from the state data X using spectral differentiation. The other nonlinear functions in the library were constructed from these signals.
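For reference, a minimal sketch of FFT-based spectral differentiation (the function name and the assumption of a uniformly sampled, roughly periodic signal are mine; the implementation in the repo may differ):

```python
import numpy as np

def spectral_derivative(x, dt):
    """Differentiate a uniformly sampled 1-D signal via the FFT."""
    n = len(x)
    # Angular frequencies of the FFT bins
    omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
    # Differentiation in the frequency domain: multiply by i*omega
    dx_hat = 1j * omega * np.fft.fft(x)
    return np.real(np.fft.ifft(dx_hat))

# Sanity check: the derivative of sin(t) should be close to cos(t)
t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
dx = spectral_derivative(np.sin(t), t[1] - t[0])
```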

The columns A_col (regressors) of the function library A (the regression matrix) look as follows:
[image: sim_data]

Correlation matrix of the function library:
[image: theta_corr]

The goal of the regression is to find a sparse right-hand-side solution (x) for a given left-hand-side guess term (b). There is only one regression hyperparameter, the threshold. For each LHS guess term, the regression is repeated many times with different threshold values. Nonzero, unique solutions are then saved for later comparison and modelling.
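As an illustration, a minimal sequentially-thresholded least-squares sketch of the kind of thresholded regression described above (the function name `stlsq` and the fixed iteration count are my own; the actual solver in the repo may differ):

```python
import numpy as np

def stlsq(A, b, threshold, n_iter=10):
    """Solve A @ x ~= b, repeatedly zeroing coefficients below `threshold`."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(x) < threshold
        x[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit only the surviving (large) coefficients
        x[big] = np.linalg.lstsq(A[:, big], b, rcond=None)[0]
    return x
```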

After trying all the possible LHS guess terms and solving the regression problems, we're left with a set of implicit equations that are candidates for the equations describing the system dynamics. If two particular terms appear in the real dynamics, each of them should generate a solution when used as an LHS guess. We can use this fact to look at all the models and pick the ones that contain the same active terms.
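A sketch of that search loop, reusing the `stlsq` sketch above and assuming the function library is stored column-wise in a matrix `Theta` (names and the threshold sweep are illustrative):

```python
import numpy as np

thresholds = np.logspace(-3, 0, 20)    # swept regression hyperparameter
candidates = []                        # (lhs_index, list of solutions) per LHS guess

for j in range(Theta.shape[1]):
    b = Theta[:, j]                    # current LHS guess term
    A = np.delete(Theta, j, axis=1)    # regress onto the remaining library terms
    solutions = []
    for thr in thresholds:
        xi = stlsq(A, b, thr)
        # Keep only nonzero solutions that are new for this LHS guess
        if np.any(xi) and not any(np.allclose(xi, s) for s in solutions):
            solutions.append(xi)
    candidates.append((j, solutions))
```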

Each model equation is described by a vector of parameters and the corresponding terms. The identified implicit equations are below; each row is a separate model. The labels on the left are in the form "{LHS term guess | model index}":
[image: implicit_eqns_raw]

For each model, I calculate an activation vector that describes which parameters are non-zero, i.e. which terms were identified as active in the model. Non-zero elements of the original solution vector become 1, the others stay 0:
[image: active_terms]
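A one-line sketch of that step (the tolerance is an assumption):

```python
import numpy as np

def activation_vector(xi, tol=1e-12):
    """1 where a model coefficient is (numerically) non-zero, 0 elsewhere."""
    return (np.abs(xi) > tol).astype(int)
```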

We can then compare different models by calculating the L1 distance between their activation vectors. An activation distance of 0 means that the models contain exactly the same terms, although their actual parameters will almost always differ, because the equations are implicit. An activation distance of 1 means the models differ by one term, and so on.
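The distance matrix itself can be computed, for example, with SciPy's cityblock metric (the toy activation vectors below are made up purely for illustration):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Rows are models, columns are library terms (toy data)
activations = np.array([[1, 0, 1, 1],
                        [1, 0, 1, 1],
                        [1, 1, 0, 1]])

# Pairwise L1 ("cityblock") distances between activation vectors
dist_matrix = squareform(pdist(activations, metric="cityblock"))
print(dist_matrix)   # zeros on the diagonal, symmetric
```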

Matrix of activation distances between the identified implicit equations:
[image: activation_dist]

The labels for each distance matrix element contain two parts: the left one is the LHS guess and the right one is the model index. Activation distances on the diagonal are all 0, because each model is identical to itself. The matrix is symmetric, so we can focus on the lower triangular part only. For example, we can see that model 8 (with an LHS guess of dX4) contains the same active terms as model 5. These models are therefore good candidates for equations describing the implicit dynamics.

Implicit models, normalized so that the smallest parameter is 1 or -1; this lets the model parameters be compared directly:
[image: implicit_eqns]
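One way to do that normalization (a sketch; the helper name is mine):

```python
import numpy as np

def normalize_implicit(xi):
    """Rescale an implicit model so its smallest non-zero coefficient is +/-1."""
    nonzero = np.abs(xi[xi != 0])
    return xi / nonzero.min() if nonzero.size else xi
```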

We can see that the implicit equations 5, 8 and 17 are nearly identical. These equations accurately describe the implicit form of the state equation for dX4. Note that the equation could also be used for calculating dX3; however, the simulation model was defined so that dX3 = u, and the input signal u was removed from the function library because it was perfectly correlated with dX3 and would therefore make the regression ill-conditioned.

Here are the results if the input signal u is included in the function library:
Correlation matrix:
[image: theta_corr_with_u]
Activation distance matrix:
[image: activation_dist_poorcond]
Implicit solutions:
[image: implicit_eqns_raw_poorcond]
It seems that the correct implicit equation was also identified (models 20, 10, 7).
The positions of the x-axis labels for the larger trig functions might be confusing; I'll rotate them so they're vertical next time.

BystrickyK changed the title from "Single pendulum on a cart - Identification" to "Single pendulum on a cart - Identification results" on Mar 9, 2021
BystrickyK added the documentation (Improvements or additions to documentation) label on Mar 11, 2021
BystrickyK (Owner, Author) commented Mar 11, 2021

The regression part of the method spits out a lot of implicit equations. Each of these equations has the potential to be part of the actual, correct system of equations that describes the underlying system.
We can narrow down the equations that have a high likelihood of being correct by looking at the implicit equations that have the same structure for different LHS guesses. For example, if the actual equation describing a part of the dynamics was
A + 2B + 3C = 0
, then this equation should be identified whenever the LHS guess is A, B or C. For example, the regression result for the LHS guess A should be A = -2B - 3C. I then subtract the LHS guess from both sides to get 0 = -A - 2B - 3C, so that the different equations can be compared directly. If the LHS guess was B, the result would be B = -0.5A - 1.5C, which reorders into 0 = -0.5A - 1B - 1.5C; the active terms are exactly the same, although the parameters differ by a scalar multiplier.
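In code terms, that reordering just re-inserts a -1 coefficient at the position of the LHS guess (a sketch; the names are illustrative):

```python
import numpy as np

def to_implicit(xi, lhs_index):
    """Turn a solution of `b = A @ xi` (where column `lhs_index` was removed from
    the library to form A) into an implicit equation `0 = Theta @ xi_full`."""
    return np.insert(xi, lhs_index, -1.0)   # coefficient -1 for the guessed term
```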

Similar implicit equations can be found by clustering them in the "activation space", where each identified equation is represented by its activation vector, which tells which parameters in the equation are non-zero. The activation space is a space of natural numbers with dimensionality equal to the number of functions in the function library.
I use hierarchical clustering to find the clusters; equations in each cluster may differ by no more than 1 term. Clusters with more than 2 elements then become serious candidates for describing the system.
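A minimal sketch of that clustering step with SciPy (complete linkage guarantees that every pair of equations within a cluster differs by at most one term; the toy data and the 3-member cutoff mirror the description above, but the repo's implementation may differ):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Rows are identified implicit equations, columns are library terms (toy data)
activations = np.array([[1, 0, 1, 1],
                        [1, 0, 1, 1],
                        [1, 1, 0, 1],
                        [0, 1, 1, 0]])

# Complete-linkage hierarchical clustering on L1 distances, cut at distance 1
Z = linkage(pdist(activations, metric="cityblock"), method="complete")
labels = fcluster(Z, t=1, criterion="distance")

# Keep only clusters with at least 3 members as serious candidates
candidate_clusters = [c for c in np.unique(labels) if np.sum(labels == c) >= 3]
```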

The identified equations from the regression phase:
[image: 1615436344_076242_implicit_sols]

After clustering and dropping all equations that ended up in a cluster with fewer than 3 points:
[image: 1615436345_799124_implicit_sols]
Only one cluster had 3 or more elements, so all the models that survive have the same (correct) structure.

If we lower the cutoff and also keep clusters with only 2 points, one more cluster (equation) makes it through:
[image: 1615438096_493119_implicit_sols]
The equations in the additional cluster aren't perfectly consistent with each other, as the one on the right has one more term (the "activation distance" between them is 1).

BystrickyK (Owner, Author) commented Mar 11, 2021

The results are good even after adding strong white noise to the state measurements, although a kernel-filtering pre-processing step is necessary.
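The exact kernel isn't specified here; as an example, a simple Gaussian-kernel smoothing step could look like this (window width and sigma are illustrative, not the repo's settings):

```python
import numpy as np

def gaussian_kernel_filter(x, sigma):
    """Smooth a 1-D signal by convolution with a normalized Gaussian kernel."""
    half = int(4 * sigma)                    # ~4 sigma on each side of the window
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    # mode="same" keeps the output aligned with the input samples
    return np.convolve(x, kernel, mode="same")
```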
The state data plots contain 3 lines for each state variable: one for the clean simulation data, one for the noisy data and one for the filtered data, although they're hard to tell apart because they overlap.
[image: heavy_noise_x]
Numerical state derivatives are especially sensitive to noise. The derivatives here are computed using spectral differentiation on the filtered state data. As before, 3 lines are drawn for each signal.
[image: heavy_noise_dx]
Zooming in on a part of the state derivative signal, there are noticeable deviations from the derivative computed from the clean data:
[image: heavy_noise_x_zoom]

Results:
[image: 1615449338_9155798_implicit_sols]
