symfit stdev are NaN #340

Open
anna-zk opened this issue Oct 20, 2021 · 7 comments

anna-zk commented Oct 20, 2021

Hi,
I have a problem with a global fit of two data sets (with four shared fitting parameters) using the symfit Fit function: the fit returns parameter values, but all of the standard deviations are NaN.
Here is my code:


from symfit import parameters, variables, Fit, Model, exp
import numpy as np

xdata = np.array([0.25, 0.5, 0.8, 1, 1.5, 2.5, 4, 6])
ydata = np.array([0.013086325798209482, 0.010144832077553744, 0.00807480228676198, 0.007560555533483845, 0.0061854497435148105, 0.005099733409299026, 0.0038812123438669768, 0.0031145317486463066])
ydata1 = np.array([0.012624136137849127, 0.00884329432480036, 0.006747803674437807, 0.006389813737979039, 0.004918439951078863, 0.0035903310403010594, 0.002554518199002905, 0.0020277651510502066])

Ra, Rb, Ka, Kb = parameters('Ra, Rb, Ka, Kb')
L0, Rof, Rof1 = variables('L0, Rof, Rof1')

s = 1.3
P0 = 0.01

## Alpha: ca is taken as the smaller root of the quadratic with coefficients a, b, c
a = s*Kb + 2*Ka + (Ka**2/s*Kb)
b = -(L0*s*Kb + P0*s*Kb + Ka*P0 + Ka*L0 + Ka*Kb + Ka*Kb*s)
c = P0*L0*s*Kb
delta = b**2 - 4*a*c        # discriminant
sdelta = delta ** (1/2)
ca = (-b - sdelta)/(2*a)

## Beta: cb is the corresponding root of the quadratic with coefficients a1, b1, c1
a1 = Ka + 2*Kb*s + (Kb**2*s**2)/Ka
b1 = -(L0*Ka + P0*Ka + Kb*s*P0 + Kb*s*L0 + Ka*Kb*s*(1+(1/s)))
c1 = P0*L0*Ka
delta1 = b1**2 - 4*a1*c1    # discriminant
sdelta1 = delta1 ** (1/2)
cb = (-b1 - sdelta1)/(2*a1)

model1 = Model({
        Rof: Ra * ca / L0,
        Rof1: Rb * cb / L0,
})

fit = Fit(model1, L0=xdata, Rof=ydata, Rof1=ydata1) 
fit_result = fit.execute()
print(fit_result)


(I know that my model is a bit complicated...)

And here are the results that I get:

RuntimeWarning: invalid value encountered in sqrt
return np.sqrt(self.variance(param))

Parameter Value        Standard Deviation
Ka        8.842217e-01 nan
Kb        7.191712e-01 nan
Ra        2.699504e+00 nan
Rb        2.496971e+00 nan
Status message         Optimization terminated successfully.
Number of iterations   41
Objective              <symfit.core.objectives.LeastSquares object at 0x7f926fc8aa90>
Minimizer              <symfit.core.minimizers.BFGS object at 0x7f926fc3c8d0>

Goodness of fit qualifiers:
chi_squared            6.016628549760758e-06
objective_value        3.008314274880379e-06
r_squared              0.9634526939817522

I will be grateful for any help!


pckroon commented Oct 21, 2021

There could be many causes. Did you do a visual inspection of the resulting fit?
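A minimal way to do that inspection, as a sketch that reuses the names from the script above and assumes matplotlib is available (the unpacking order is assumed to follow the model definition, Rof then Rof1):

import matplotlib.pyplot as plt

# evaluate the fitted model at the measured L0 values
Rof_fit, Rof1_fit = model1(L0=xdata, **fit_result.params)

plt.plot(xdata, ydata, 'o', label='Rof data')
plt.plot(xdata, Rof_fit, '-', label='Rof fit')
plt.plot(xdata, ydata1, 's', label='Rof1 data')
plt.plot(xdata, Rof1_fit, '--', label='Rof1 fit')
plt.xlabel('L0')
plt.legend()
plt.show()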

Does the resulting fit have a covariance matrix (fit_result.covariance_matrix)?
Can you get a Hessian from the objective? (hess = fit_result.objective.eval_hessian(**fit_result.params))
Can you invert the Hessian matrix? (np.linalg.inv(hess))
This is probably where things start going wrong :) If your Hessian is singular/non-invertible, there is probably something wrong with your model. See also, for example, https://stats.stackexchange.com/questions/181515/what-does-hessian-is-singular-mean-in-sas-proc-nlin/181528
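Those checks collected into one snippet, as a sketch that assumes the fit from the first post has already been run (the condition-number check is an extra illustration, not something symfit reports itself):

import numpy as np

print(fit_result.covariance_matrix)    # NaN or negative diagonal entries are the first red flag

hess = fit_result.objective.eval_hessian(**fit_result.params)
print(hess)
print(np.linalg.cond(hess))            # a very large condition number means the Hessian is (nearly) singular

try:
    print(np.linalg.inv(hess))
except np.linalg.LinAlgError:
    print('Hessian is singular; the covariance matrix cannot be computed')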


anna-zk commented Oct 22, 2021

Indeed, there was a problem with my model; I found a mistake there, but after correcting it the error remains. Here is the current code:

from symfit import parameters, variables, Fit, Model, exp
import numpy as np

xdata = np.array([0.25, 0.5, 0.8, 1, 1.5, 2.5, 4, 6])
ydata = np.array([0.01389280763166483, 0.010951313911009092, 0.008881284120217327, 0.008367037366939192, 0.006991931576970159, 0.005906215242754372, 0.004687694177322324, 0.003921013582101654])
ydata1 = np.array([0.013222900783408978, 0.009442058970360209, 0.007346568319997657, 0.006988578383538889, 0.005517204596638713, 0.004189095685860909, 0.003153282844562755, 0.002626529796610057])

Rbounda, Rboundb, Ka, Kb = parameters('Rbounda, Rboundb, Ka, Kb')
L0, Robsa, Robsb = variables('L0, Robsa, Robsb')

s = 1.3
P0 = 0.01
Rfreea = 0.0008064818334553477
Rfreeb = 0.0005987646455598504


a = s*Kb + 2*Ka + (Ka**2/(s*Kb))
b = -(L0*s*Kb + P0*s*Kb + Ka*P0 + Ka*L0 + Ka*Kb + Ka*Kb*s)
c = P0*L0*s*Kb
delta = b**2 - 4*a*c
sdelta = delta ** (1/2)
ca = (-b - sdelta)/(2*a)
La = s*((L0-ca*(1+(Ka/(s*Kb))))/(1+s))
Lb = La/s
cb = (Ka*ca)/(Kb*s)

model1 = Model({
        Robsa: Rfreea * La/(La+ca) + Rbounda * ca/(La+ca),
        Robsb: Rfreeb * Lb/(Lb+cb) + Rboundb * cb/(Lb+cb),
})


fit = Fit(model1, L0=xdata, Robsa=ydata, Robsb=ydata1) 
fit_result = fit.execute()

print(fit_result)

print('covariance_matrix', fit_result.covariance_matrix)
hess = fit_result.objective.eval_hessian(**fit_result.params)
print('hess', hess)
print('invert_hess', np.linalg.inv(hess))

And the current result, together with the covariance matrix, the Hessian, and its inverse, is below:

Parameter Value        Standard Deviation
Ka        4.494177e-01 5.808837e-02
Kb        2.406486e+04 nan
Rbounda   7.840503e-01 7.301278e-02
Rboundb   3.627573e+04 nan
Status message         Optimization terminated successfully.
Number of iterations   143
Objective              <symfit.core.objectives.LeastSquares object at 0x7f504ead96d0>
Minimizer              <symfit.core.minimizers.BFGS object at 0x7f5040539750>

Goodness of fit qualifiers:
chi_squared            5.875779424169528e-06
objective_value        2.937889712084764e-06
r_squared              0.9643082655120222
covariance_matrix [[ 3.37425824e-03  7.56349353e+04  3.41761415e-03  1.14015517e+05]
 [ 7.56349353e+04 -4.95053731e+11  7.70199754e+04 -7.49602721e+11]
 [ 3.41761415e-03  7.70199754e+04  5.33086667e-03  1.16104018e+05]
 [ 1.14015517e+05 -7.49602721e+11  1.16104018e+05 -1.13501425e+12]]
hess [[ 1.23903953e-03 -1.44730853e-08 -7.99195853e-04  9.60123981e-09]
 [-1.44730853e-08  6.40321239e-13  1.48229755e-13 -4.24344173e-13]
 [-7.99195853e-04  1.48229755e-13  7.85776782e-04  0.00000000e+00]
 [ 9.60123981e-09 -4.24344173e-13  0.00000000e+00  2.81214768e-13]]
invert_hess [[ 2.29706257e+03  5.14892952e+10  2.32657756e+03  7.76172884e+10]
 [ 5.14892952e+10 -3.37013149e+17  5.24321761e+10 -5.10300109e+17]
 [ 2.32657756e+03  5.24321761e+10  3.62904478e+03  7.90390580e+10]
 [ 7.76172884e+10 -5.10300109e+17  7.90390580e+10 -7.72673148e+17]]

I can see that some diagonal elements of the covariance matrix are negative, which is strange, right?
The result itself is also strange: Ka and Kb differ by 5 orders of magnitude, while I would expect them to be similar (within a factor of 10, maybe).
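One way to check this suspicion numerically, as an illustrative sketch using NumPy on the hess printed above:

import numpy as np

eigvals = np.linalg.eigvalsh(hess)   # the printed Hessian is symmetric, so its eigenvalues are real
print(eigvals)                       # eigenvalues near zero make the inverse numerically unreliable
print(np.linalg.cond(hess))          # a huge condition number means the covariance from inv(hess) is meaningless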


anna-zk commented Oct 22, 2021

In my previous post there was some problem with the code formatting, so I tried posting the same code again.


anna-zk commented Oct 22, 2021

Well, I don't know why inserting the code does not work for me now; I hope you will understand it anyway.


pckroon commented Oct 22, 2021

Did you do a visual inspection of this fit? :) ALWAYS look at fits; don't be complacent and settle for, e.g., R^2 [1].
It's most likely a numerical or convergence issue. The variance (the diagonal of the covariance matrix) being negative is indeed a very big red flag.

Given that you are surprised by Ka and Kb having such different values, do you have a ballpark idea for the values they should have? If so, set those as initial values (Ka.value=1e-2), and do the same for the other two parameters. That should help your fit converge. Symfit comes with an interactive guess contrib module which might help.
Also consider adjusting the tolerance of the fit (execute(tol=1e-16)) if that doesn't help.
If all that doesn't help, consider multiplying your model and y data by a factor to get the values between 0.1 and 1. That will affect your covariances though...

See https://stackoverflow.com/questions/28702631/scipy-curve-fit-returns-negative-variance for more details.

[1] https://en.wikipedia.org/wiki/Anscombe%27s_quartet

PS. code should be in so-called backticks (`), next to the number 1 on your keyboard.
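A minimal sketch of those suggestions applied to the script above (the starting values are placeholders only, not recommendations; replace them with your own ballpark guesses):

# rough initial guesses; placeholder numbers only
Ka.value = 0.5
Kb.value = 0.5
Rbounda.value = 1.0
Rboundb.value = 1.0

fit = Fit(model1, L0=xdata, Robsa=ydata, Robsb=ydata1)
fit_result = fit.execute(tol=1e-16)   # tighter tolerance, as suggested above
print(fit_result)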


anna-zk commented Oct 25, 2021

Here are the plots of the experimental points together with the fits:
file:///nhome/anna/Figure_1.png

The orange fit looks pretty fine, but the blue one deviates from the data at larger x.
Setting both Ka and Kb initially equal to 0.5 barely changes the results:

Parameter Value        Standard Deviation
Ka        4.571113e-01 6.023765e-02
Kb        1.294094e+04 nan
Rbounda   7.918653e-01 7.541881e-02
Rboundb   1.932618e+04 nan

Kb is still 10^5 times larger than Ka.
I get similar results when setting tol=1e-16 (then I also get the message "Desired error not necessarily achieved due to precision loss.").
Multiplying the data (y) and the model by a factor of 50 (so that my y values are now between 0.1 and 0.7) also does not change the results.


pckroon commented Oct 27, 2021

Your initial guesses are not good (enough):

Initial values:
Ka: 0.75
Kb: 0.45
Rbounda: 2.0
Rboundb: 0.86

Parameter Value        Standard Deviation
Ka        4.528586e-01 1.697666e-01
Kb        7.983618e+03 3.629505e+06
Rbounda   7.875582e-01 1.820861e-01
Rboundb   1.198447e+04 5.449942e+06
Status message         Optimization terminated successfully.
Number of iterations   150
Objective              <symfit.core.objectives.LeastSquares object at 0x7fd83b92bca0>
Minimizer              <symfit.core.minimizers.BFGS object at 0x7fd83b92bf40>

Goodness of fit qualifiers:
chi_squared            5.874081979620764e-06
objective_value        2.937040989810382e-06
r_squared              0.9643185764402871
covariance_matrix [[ 2.88207112e-02 -4.25726822e+05  3.00104367e-02 -6.39425043e+05]
 [-4.25726822e+05  1.31733071e+13 -4.55215425e+05  1.97805923e+13]
 [ 3.00104367e-02 -4.55215425e+05  3.31553461e-02 -6.83706772e+05]
 [-6.39425043e+05  1.97805923e+13 -6.83706772e+05  2.97018707e+13]]
hess [[ 1.22108942e-03 -4.30622959e-08 -7.91110193e-04  2.86863288e-08]
 [-4.30622959e-08  5.79446561e-12  1.34790073e-12 -3.85987459e-12]
 [-7.91110193e-04  1.34790073e-12  7.78868903e-04  0.00000000e+00]
 [ 2.86863288e-08 -3.85987459e-12  0.00000000e+00  2.57118317e-12]]
invert_hess [[ 1.96256786e+04 -2.89901860e+11  2.04358310e+04 -4.35421259e+11]
 [-2.89901860e+11  8.97046189e+18 -3.09982344e+11  1.34697421e+19]
 [ 2.04358310e+04 -3.09982344e+11  2.25773806e+04 -4.65575234e+11]
 [-4.35421259e+11  1.34697421e+19 -4.65575234e+11  2.02257107e+19]]

Note that the errors I get for Kb and Rboundb are both very large, which should prompt you to take a critical look at your model.
If you want to limit parameters to specific value ranges (e.g. Rboundb <= 10), set minimum and maximum values (e.g. Rboundb.max = 10).
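A sketch of how that plugs into the earlier script, using the initial values listed in this comment (the bound of 10 is just the example value mentioned above, not a recommended limit):

# initial values used for the fit shown above
Ka.value = 0.75
Kb.value = 0.45
Rbounda.value = 2.0
Rboundb.value = 0.86

# optional bound, e.g. to enforce Rboundb <= 10 (a minimum can be set via .min analogously)
Rboundb.max = 10

fit = Fit(model1, L0=xdata, Robsa=ydata, Robsb=ydata1)
fit_result = fit.execute()
print(fit_result)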
