Fitting time domain data of electrical circuit - Best Algorithm? #2

DerAlbiG opened this issue Nov 5, 2020 · 8 comments
DerAlbiG commented Nov 5, 2020

Hi Rookfighter,
I am trying to use your least-squares lib to fit measured time-domain signals of an electrical circuit in order to extract the circuit's component values.
I have a reference implementation running in Python, and I have had some success using your lib with the GaussNewton method and the ArmijoBacktracking step size algorithm (basically your provided sample, but with my own error function object).
However, I fail to reach the fitting quality that Python provides, and the same error function is much slower in C++ than in Python.
Python may well use different algorithms, and I wonder if you know which algorithm would fit my needs.
I fully realize that this is a very specific question, and I don't know how to give more detail about it through this communication method.
[Image: Misfit]

The funny thing is that this is supposed to run on a microcontroller and ultimately has to be a factor of 10,000 faster. I am a bit lost.

Rookfighter (Owner) commented Nov 7, 2020

Hello DerAlbiG,

thanks for using least-squares-cpp. I hope I can help you with your issue.

Generally, the performance of your optimization highly depends on your algorithm configuration.
Gradient descent guarantees convergence towards a local optimum if the step length is short enough, but converges very slowly near the end.
Gauss-Newton converges quickly, but only reaches the local minimum if the initial guess is close enough.
The configuration of your step size functor also influences convergence guarantees and speed of convergence significantly. E.g. if you set the backtracking decrease of your ArmijoBacktracking functor to a very low value (a very fine-grained line search), it will take a significant amount of time to find a suitable step size for just a single optimization iteration. You can (and should) also set a maximum number of iterations for your line search.
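To make the trade-off concrete, here is a tiny self-contained 1-D backtracking line search (an illustration only, not lsq's actual implementation; the parameter names simply mirror this discussion):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Minimal 1-D Armijo backtracking line search (illustration only).
// Starting from maxStep, the candidate step is shrunk geometrically by
// 'decrease' until the Armijo sufficient-decrease condition holds or
// maxIt candidate steps have been tried.
double armijoBacktracking(const std::function<double(double)> &f,
                          double x, double grad, double direction,
                          double decrease, double c1,
                          double minStep, double maxStep, int maxIt)
{
    double step = maxStep;
    const double fx = f(x);
    for (int i = 0; i < maxIt && step > minStep; ++i)
    {
        // Armijo condition: sufficient decrease along the search direction.
        if (f(x + step * direction) <= fx + c1 * step * grad * direction)
            return step;
        step *= decrease; // shrink and try again
    }
    return step;
}
```

Every candidate step costs one evaluation of f, so capping maxIt bounds the number of error function evaluations a single line search can spend.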

Now, a question that bothers me is, which algorithm and step size functor do you use in your python version? Is it equivalent to your C++ implementation (GaussNewton and ArmijoBacktracking)?

As a first measure, maybe you can try the following configuration and see how the performance and results are:

lsq::LevenbergMarquardt<double, MyErrorFunctor> optimizer;
optimizer.setMaxIterations(100);
optimizer.setMaxIterationsLM(100);
optimizer.setLambdaIncrease(2);
optimizer.setLambdaDecrease(0.5);
optimizer.setVerbosity(2);
optimizer.setThreads(0);

Generally, if you are not sure about which algorithm fits your optimization problem, LevenbergMarquardt is a good and robust choice, because it is a combination of GaussNewton and GradientDescent. Also you don't need a line search algorithm here, which saves computation time.
You can also vary the parameters above to influence the behaviour of your optimization process:

  • Higher lambda increase and lambda decrease mean more tendency towards GradientDescent.
  • Lower lambda increase and lambda decrease mean more tendency towards GaussNewton.
  • Reduce MaxIterationsLM to potentially reduce computation time.
  • Increase MaxIterationsLM to increase the quality of your optimization result.
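The lambda adaptation being tuned here can be illustrated with a toy 1-D Levenberg-Marquardt loop (a sketch of the usual LM damping scheme, not lsq's internal code; the function names are made up):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Toy 1-D Levenberg-Marquardt loop (illustration only). Lambda blends
// the two methods: a large lambda gives small, gradient-descent-like
// steps; a small lambda gives full Gauss-Newton steps.
double levenbergMarquardt1D(const std::function<double(double)> &residual,
                            const std::function<double(double)> &jacobian,
                            double x, double lambdaIncrease,
                            double lambdaDecrease, int maxIt)
{
    double lambda = 1.0;
    for (int i = 0; i < maxIt; ++i)
    {
        const double r = residual(x);
        const double j = jacobian(x);
        // Damped normal equation: (J^T J + lambda) * step = -J^T r
        const double step = -j * r / (j * j + lambda);
        if (std::abs(residual(x + step)) < std::abs(r))
        {
            x += step;
            lambda *= lambdaDecrease; // success: shift towards Gauss-Newton
        }
        else
        {
            lambda *= lambdaIncrease; // failure: shift towards gradient descent
        }
    }
    return x;
}
```

No line search appears anywhere in the loop, which is where the computation time is saved.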

@Rookfighter (Owner)
If this does not work, feel free to send me your error functor and some sample data and I would love to play around with it ;)

DerAlbiG (Author) commented Nov 7, 2020

Thank you a lot for your time and answer!
I have tried your suggested LevenbergMarquardt, but I honestly had no luck. The convergence is very slow, and GaussNewton with ArmijoBacktracking still yields the best results. Meanwhile, I have simplified the fitting problem in terms of the number of unknowns and the 'entanglement of the unknowns', and that alone helped a lot (although it may complicate things further down the line).
Anyway, I am down to below 300 calls to my error function, and that would already be in the range the microcontroller could handle. Of course, I have no idea what kind of work is necessary in the background. I learn as I go, I guess.

Could it be a problem that my error function is an integral? Basically i am fitting an inductor current according to the attached formula.
[Image: Formula]

Regarding my Python reference implementation: I used scipy.optimize.curve_fit with the method parameter at its default. Given that I constrain the parameters there, it is either 'trf' or 'dogbox', but I have no idea what this means. The Python implementation is conceptually different from what I can implement in C++, including my error function. Back then, I just needed it right in Python. Now I need it fast ;-) [although I tried my Python error function in C++, and it is simply too complicated for efficient solving, I think.]

As for the offer of help: I would loooooove to accept it, but I simply don't know how to contact you other than through GitHub issues... :-(
...and I also don't know how to share an email address or anything in private here. I am kind of new to GitHub.

Rookfighter (Owner) commented Nov 8, 2020

So, just that I get a clearer picture: Result quality is ok for you for now? So first priority is to improve performance, correct?

In terms of performance there are a few more things you can try:

First off, compile your C++ project in "Release" mode (if you use CMake, pass -DCMAKE_BUILD_TYPE=Release). The Eigen3 library performs extensive runtime checks, which are removed by the compiler in Release mode; especially the NDEBUG symbol should be defined. This usually increases performance significantly.

Make sure you enable the OpenMP option for your compiler, such that multithreading is enabled for the optimization process (see here).

Now, you say you limited the number of evaluations of your error function to 300. How did you get / count that number? The safest way would be to add a counter to your error function, but be aware that it has to be thread-safe.
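One way to do that counting thread-safely is an std::atomic counter inside the functor (a sketch; CountingErrorFunctor and its call signature are placeholders, not lsq's actual functor interface):

```cpp
#include <atomic>
#include <cassert>

// Thread-safe call counter for an error functor (sketch only; the
// operator() signature here is a placeholder, not lsq's interface).
struct CountingErrorFunctor
{
    std::atomic<long> calls{0};

    double operator()(double x)
    {
        // relaxed ordering is enough for a pure counter, and stays
        // correct when the optimizer evaluates from multiple threads
        calls.fetch_add(1, std::memory_order_relaxed);
        return x * x - 2.0; // placeholder residual
    }
};
```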

Keep in mind that if any optimization algorithm of lsq prints 300 iterations, this is not equivalent to 300 evaluations of the error function. A single iteration performs multiple evaluations of the error function, depending on different conditions:

  • 1 evaluation for your current state variable (get current error)
  • if no jacobian is provided by you, lsq uses central differences to estimate the Jacobian numerically. This requires 2 * n evaluations of your error function, where n is the dimension of your state variable / problem space / unknowns
  • line search (armijo backtracking) evaluates the error function for every step size it passes. If you limit the iterations of ArmijoBacktracking to 100, then in the worst case the line search can evaluate your error function 100 times.

Again keep in mind, all this accounts for a single iteration of the optimization algorithm.
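The 2 * n cost of the numerical Jacobian can be seen in a small self-contained central-differences sketch (illustrative only, not lsq's internal code; the output dimension is passed in explicitly so the evaluation count stays exact):

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Central-difference estimate of the m x n Jacobian of a vector-valued
// error function. Each of the n unknowns is perturbed once up and once
// down, so exactly 2 * n evaluations are counted in 'evals'.
std::vector<std::vector<double>> centralDiffJacobian(
    const std::function<std::vector<double>(const std::vector<double> &)> &f,
    std::vector<double> x, std::size_t m, double eps, long &evals)
{
    const std::size_t n = x.size();
    std::vector<std::vector<double>> jac(m, std::vector<double>(n, 0.0));
    for (std::size_t j = 0; j < n; ++j)
    {
        const double orig = x[j];
        x[j] = orig + eps;
        const std::vector<double> fPlus = f(x);
        ++evals;
        x[j] = orig - eps;
        const std::vector<double> fMinus = f(x);
        ++evals;
        x[j] = orig;
        for (std::size_t i = 0; i < m; ++i)
            jac[i][j] = (fPlus[i] - fMinus[i]) / (2.0 * eps);
    }
    return jac;
}
```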

I had a look at the scipy function you mentioned. So as I understand you, you actually have a constrained optimization problem, thus you used the trf method instead of lm (which is Levenberg-Marquardt, by the way). Keep in mind that lsq is a library for unconstrained optimization problems (as constrained problems are harder to solve :D).

Can you tell me how big your problem space is (so the dimension of your xval / unknowns)? And how big is your error space (dimension of fval / number of error values)?
In principle you can only solve the unconstrained problem reliably with lsq if dim(xval) <= dim(fval). The more error values the better is the result quality and convergence behaviour.

Do I also understand correctly, that the unknowns in your system are I_n?

Rookfighter (Owner) commented Nov 8, 2020

For contacting me, simply use the email address on my profile site, so you can send me sample data and code snippets, but I'd like to keep the general discussion public here.

DerAlbiG (Author) commented Nov 8, 2020

> Result quality is ok for you for now? So first priority is to improve performance, correct?

Complicated. I need both :-) I currently have the Python reference error function running, and that is slow but yields the best and most complete results. I have also tried to simplify the error function; that runs very well with nice convergence and a low number of iterations / error function calls, but it does not intrinsically give me everything I need.
I am in a pickle :-)

In my system, I basically know which values to expect. So learning can initially take a long time (one time), but then, with a very good initial guess, I need fast convergence to observe component drift due to temperature change or mechanical stress/damage. One fitting procedure should not take longer than 5..10 ms on an STM32 (ARM Cortex-M4 with hardware floating point support [including multiply-accumulate]) at 168 MHz. Roughly speaking, I have 1 million cycles to spare.

This is why I am currently not really looking at time as a measure of performance, but at the number of iterations and calls to the error function. And yes, I use a global counter in the error function. I haven't figured out / tried whether I can change the default double to float, or even to a CustomFloat that counts every operation on it through operator overloading - that would be the best metric, since it would capture the background operations too.
OpenMP and compiler optimization are not relevant options right now. I am currently looking more broadly at general algorithm complexity, so the number of operations is my primary figure of merit. Of course I will optimize later on :-D

The test sequence consists of roughly 2000 samples. This can be optimized with possible sub-sampling or other methods.
Too coarse a sampling yields integration errors; working on the original data is preferred.
But as far as I understand, it is possible to compress the error vector to reduce background operations.
The "problem space" is "3x fitting of 2...3 unknowns" over different parts of the data in the fast-but-incomplete models,
or "1x fitting of 8 unknowns" in the reference error function using the complete dataset.

The unknowns in my equation are C0, C1, C2 (that is the formula for the simplified model). I_n is the "current" (sorry, electrical stuff) and I_(n+1) is "the next current value". It basically computes an integral iteratively: "I_(n+1) = I_n + Delta_I*dt". This formula is repeated for every sample. Then it is checked how "I" develops compared to the measurement on a sample-by-sample basis. That difference ("Measurement - ModelPrediction"), or the square of it, is the error vector, and currently it is ~1000 entries long.
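A minimal sketch of this kind of forward-Euler model evaluation (the actual Delta_I dependence on C0, C1, C2 is in the attached image, so a placeholder callback stands in for it here):

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Iteratively integrate I_(n+1) = I_n + DeltaI(I_n) * dt and compare
// the model current against the measurement sample by sample.
// 'deltaI' is a placeholder for the real C0/C1/C2 formula.
std::vector<double> residuals(const std::function<double(double)> &deltaI,
                              const std::vector<double> &measured,
                              double i0, double dt)
{
    std::vector<double> res(measured.size());
    double current = i0;
    for (std::size_t n = 0; n < measured.size(); ++n)
    {
        // plain difference (not squared): measurement - model prediction
        res[n] = measured[n] - current;
        current += deltaI(current) * dt; // forward Euler step
    }
    return res;
}
```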

I unfortunately have no deeper understanding of the math beneath the surface. Although "Jacobian" is not an unheard-of word to me, I don't really know what it is or how to compute it. If providing the matrix would speed up the process, I am willing to learn.

But currently I am struggling with quite a few doubts. My measurement data was captured with very inferior hardware, and over the last year I have made some good progress in that regard. As I understand it, fitting to better data with less noise and greater adherence to the actual model (let's call it "fewer second-order effects") should also increase convergence speed. Measuring new data is unfortunately somewhat dangerous, and it requires a lot of work. It will take me 3..4 weeks to literally get a better picture.

I have made the most progress by tuning the ArmijoBacktracking parameters.
I use it like this:

lsq::ArmijoBacktracking(0.8, 0.02, 1e-6, 1.0, 0)
lsq::ArmijoBacktracking(0.5, 0.01, 1e-6, 1.0, 0)
lsq::ArmijoBacktracking(0.65, 0.05, 1e-6, 0.5, 0)

Most important was seemingly the reduction of "c1" for faster convergence (compared to the sample code). But I am really a stupid monkey here, just trial and error. I don't know what I am doing.

I will work on my code to clean it up and will send it over later. Just a warning: I am not good at code distribution, as I usually don't do it. And it is C++20.

@Rookfighter (Owner)
In your case you can try another option:
Generate the first initial guess with Gauss Newton and Armijo Backtracking. As you said this optimization process might take longer.
For subsequent optimizations, use Gradient Descent with a constant step size. As you should have a good initial guess by then, Gradient Descent should be able to converge to the local minimum. By eliminating the line search, we save a lot of computation cycles. Also, Gradient Descent does not need to solve a linear system, and thus does not perform an SVD decomposition (as opposed to Gauss-Newton). Consider setting the constant step size < 1.0; this slows down convergence, but makes it more robust against bad initial guesses. You could also use Gauss-Newton with a constant step size after the first initial guess is obtained, if gradient descent does not work on your objective function.
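The proposed second phase boils down to something like this minimal constant-step gradient descent sketch (illustrative only, not lsq's GradientDescent class):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Gradient descent with a constant step size: no line search and no
// linear solve, so each iteration costs only one gradient evaluation
// plus a vector update. Shown in 1-D for simplicity.
double gradientDescentConstantStep(const std::function<double(double)> &grad,
                                   double x, double stepSize, int maxIt)
{
    for (int i = 0; i < maxIt; ++i)
        x -= stepSize * grad(x); // stepSize < 1.0 trades speed for robustness
    return x;
}
```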

One note: for your error function you should use the plain difference (measurement - model_prediction), not the square of it. Least-Squares optimization algorithms take care of calculating the correct squared error and assume a non-squared error vector as input.

The Jacobian is, simplified, the gradient for multi-dimensional functions. When you have a problem space of 8 unknowns and an error vector of 1000 results, then the Jacobian is a 1000x8 matrix, where each row is the gradient of the corresponding entry of the error vector.
Providing the jacobian can increase the performance of the optimization process (it is the third parameter for your error functor). However at this stage I would discourage doing it:

  • manually computing the jacobian is highly error prone and might lead to hard-to-debug problems and side-effects
  • numerical estimation of the jacobian is linear in the size of your problem space and your problem space is tiny (8 unknowns), so performance gain might not be that great anyway

If you have access to a good scientific library you could also check if the following book is available: https://www.springer.com/de/book/9780387303031
My base knowledge about numerical optimization is based on it :D. And it's a good read anyway.

For your armijo backtracking configuration try the following:

lsq::ArmijoBacktracking(0.8, 1e-4, 1e-6, 1.0, 100);

Basically, the decrease should not be chosen too low, to avoid divergence of the step size calculation. c1 should be chosen quite small, to relax the Armijo condition and improve convergence (as you also figured out). The minimum step size should not be chosen too big, because otherwise Gauss-Newton might not converge (the minimum step size might be too big to achieve the convergence guarantee), but it should also not be too small, because then we might spend too much time searching for a suitable step size. The maximum step size is usually chosen as 1.0, because we want to accept a full step if it satisfies the Armijo condition. And use a maximum number of iterations to avoid too many, or near-infinite, computations during the line search: just to be safe, you never know what numeric computations might do in real-life applications if you have not fully analyzed your optimization problem (convexity of the objective, definiteness of the Jacobian, and so on) ;).

DerAlbiG (Author) commented Nov 8, 2020

Thank you again for your input.

> for your error function you should use the plain difference (measurement - model_prediction), not the square of it

I agree; however, I need to fit 2 waveforms at once - they are measured at different excitation voltages. Only when fitting 2 waveforms does the problem have a unique solution.
And I can't just add the plain differences, or equally bad matching could cancel itself out - I get an estimate that is too high for one curve and too low for the other, and the algorithm is as happy as it is wrong.
So I went for (error1^2 + error2^2) with some weight, to equalize the importance of both matches.
In hindsight, I may just alternate between error1 and error2 in the error vector, as I have enough data anyway, right?
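The stacking idea can be sketched as follows (a hypothetical helper, not part of lsq; the weights w1/w2 correspond to the equalization mentioned above):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Instead of summing squared errors of the two waveforms into a single
// value, stack the plain weighted differences into one residual vector,
// so opposite misfits of the two curves cannot cancel each other.
std::vector<double> stackedResiduals(const std::vector<double> &meas1,
                                     const std::vector<double> &model1,
                                     const std::vector<double> &meas2,
                                     const std::vector<double> &model2,
                                     double w1, double w2)
{
    std::vector<double> res;
    res.reserve(meas1.size() + meas2.size());
    for (std::size_t i = 0; i < meas1.size(); ++i)
        res.push_back(w1 * (meas1[i] - model1[i]));
    for (std::size_t i = 0; i < meas2.size(); ++i)
        res.push_back(w2 * (meas2[i] - model2[i]));
    return res;
}
```

Since the optimizer minimizes the sum of squares over all entries, the ordering of the entries (appended or alternated) does not change the objective.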

...I haven't done that, because I simply don't know whether the error vector needs some sort of systematic behaviour to it.
I mean, suppose I scramble the order of the errors (the same way for every computation) - will that bother the algorithm?
How far do you think I can compress the error vector? The next logical step is to add up errors around the same region to decrease the length of the error vector and reduce the computational load.
I am basically fitting C + A*(1-exp(t/T)), so every section of the measurement data is characterized by 3 variables. This means taking the average error around 3 points on the curve sections (near begin, center, near end) should be enough. Do you see any mistake in my argument?
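The 3-point averaging idea might look like this (a sketch only; it assumes the section length is divisible by 3 and that the three regions stay fixed across iterations):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Compress a long residual vector of one curve section into three
// averaged errors (near begin, center, near end). Three averaged
// residuals can still constrain three unknowns such as C, A, T.
std::vector<double> compressResiduals(const std::vector<double> &res)
{
    const std::size_t third = res.size() / 3; // assumes size % 3 == 0
    std::vector<double> out(3, 0.0);
    for (std::size_t i = 0; i < res.size(); ++i)
        out[i / third] += res[i] / static_cast<double>(third);
    return out;
}
```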

What bothers me about the algorithm choice is that I have had basically no good result with anything other than GaussNewton+Armijo. This could either be the ground truth, or I simply did not configure the other algorithms correctly. But without having had a real competition among the algorithms, how can I be sure I used the right one? Strange.

Your idea to apply different algorithms is viable for sure. Sounds great, to be honest.
