Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support lsqfit.Multifitter objects #10

Open
millernb opened this issue Oct 8, 2021 · 10 comments · Fixed by #32
Open

Support lsqfit.Multifitter objects #10

millernb opened this issue Oct 8, 2021 · 10 comments · Fixed by #32
Labels
Low priority An issue that can be tackled eventually, maybe question Further information is requested

Comments

@millernb
Copy link
Collaborator

millernb commented Oct 8, 2021

Background

See here for documentation on the Multifitter class, though more insight can be gleaned from the git source on lsqfit._extras.unchained_nonlinear_fit.

Multifitter support is desirable for correlator as well as chiral fits. (However, it's worth noting that we technically could avoid using Multifitter by manually chaining fits.)

Proposed example

As a test case, we should try to support the example from the lsqfit documentation.

import numpy as np
import gvar as gv
import lsqfit

from lsqfitgui import run_server


class Linear(lsqfit.MultiFitterModel):
    def __init__(self, datatag, x, intercept, slope):
        super(Linear, self).__init__(datatag)
        # the independent variable
        self.x = np.array(x)
        # keys used to find the intercept and slope in a parameter dictionary
        self.intercept = intercept
        self.slope = slope

    def fitfcn(self, p):
        try:
            return p[self.intercept] + p[self.slope] * self.x
        except KeyError:
            # slope parameter marginalized/omitted
            return len(self.x) * [p[self.intercept]]

    def buildprior(self, prior, mopt=None):
        " Extract the model's parameters from prior. "
        newprior = {}
        newprior[self.intercept] = prior[self.intercept]
        if mopt is None:
            # slope parameter marginalized/omitted if mopt is not None
            newprior[self.slope] = prior[self.slope]
        return newprior

    def builddata(self, data):
        " Extract the model's fit data from data. "
        return data[self.datatag]


def main():

    data = gv.gvar(dict(
        d1=['1.154(10)', '2.107(16)', '3.042(22)', '3.978(29)'],
        d2=['0.692(10)', '1.196(16)', '1.657(22)', '2.189(29)'],
        d3=['0.107(10)', '0.030(16)', '-0.027(22)', '-0.149(29)'],
        d4=['0.002(10)', '-0.197(16)', '-0.382(22)', '-0.627(29)'],
        ))

    models = [
    Linear('d1', x=[1,2,3,4], intercept='a', slope='s1'),
    Linear('d2', x=[1,2,3,4], intercept='a', slope='s2'),
    Linear('d3', x=[1,2,3,4], intercept='a', slope='s3'),
    Linear('d4', x=[1,2,3,4], intercept='a', slope='s4'),
    ]

    prior = gv.gvar(dict(a='0(1)', s1='0(1)', s2='0(1)', s3='0(1)', s4='0(1)'))

    fitter = lsqfit.MultiFitter(models=models)
    fit = fitter.lsqfit(data=data, prior=prior)

    run_server(fit)


if __name__ == "__main__":
    main()

Difficulties

Difficulty 1: fit.fcn

Unfortunately, Multifitter fits lack many of the attributes we currently assume a fit has. To delve into this a bit more, consider the get_fit_bands method,

# from /lsqfitgui/plot/until.py

def get_fit_bands(fit):
    """Get x, y_min, y_mean, y_max values for fit."""
    try:
        if isinstance(fit.x, dict):
            x = {
                key: np.linspace(val.min(), val.max(), 100)
                for key, val in fit.x.items()
            }
        else:
            x = np.linspace(fit.x.min(), fit.x.max(), 100)
    except Exception:
        x = fit.x
    y = fit.fcn(x, fit.p)
    m = gv.mean(y)
    s = gv.sdev(y)
    return x, m - s, m, m + s

and more specifically, consider fit.fcn. A Multifitter fit also contains a fit.fcn, but it is called slightly differently (fit.fcn(fit.p) -- no x) and has a slightly different return (a dict instead of an array) . For example. from the Linear model example above, fit.fcn(fit.p) returns

{'d1': array([1.1497(58), 2.0982(80), 3.047(12), 3.995(17)], dtype=object),
 'd2': array([0.6938(58), 1.1865(80), 1.679(12), 2.172(17)], dtype=object),
 'd3': array([0.1164(58), 0.0317(80), -0.053(12), -0.138(17)], dtype=object),
 'd4': array([0.0011(58), -0.1990(80), -0.399(12), -0.599(17)], dtype=object)}

This is probably the simplest problem to correct, as it only requires a switch depending on whether the fit is derived from the Multifitter class or not; but digging a little deeper, we see that this leads to a more onerous difficulty.

Difficulty 2: fit.x

The more pressing problem is fit.x, which is currently used to construct the domain of our fit plots. A Multifitter fit does not have a standardized fit.x (technically it does have a fit.x, but it is initialized to False). Rather, the analogous x depends on your model. A couple examples:

  1. In the Linear example above, fit.x would most similar to x inside a Linear object. We actually can get the values of x from the fit object, but the call is rather obtuse: fit.fitter_args_kargs[1]['models'][k].x for model k. Moreover, each linear model needn't share the same values x (although they do in this example), so we would need to get the max/min values of x for each model, then get the max/min values of x from that list.

  2. Correlator models are basically the same as Linear models as far as x analogues are concerned. However, there are a couple ambiguities here: (1) we don't need to call it x (in my BaryonCorrelatorModel, for example, I call it t). (2) As an example of a more troublesome edge case, note that since we fit a range of t, one could alternatively formulate a BaryonCorrelatorModel as having a t_start and t_end rather than an array of times; that is, our x variable needn't even directly describe an array.

  3. By far the most challenging type of fit to implement will be chiral fits. Unlike the previous two examples, chiral fits have multiple independent variables (lattice spacing, light quark mass, etc) and non-standardized names (a, eps_a, eps2_a, etc). Further, these aren't proper independent variables (in the typical least-squares fit use case) since they have uncertainty. Of course, in practice we account for this fact by moving the "x-data" into the prior also, but this means we will have to specify which parts of the priors should be varied when constructing the plots; thus chiral fits require tracking correlated priors in lsqfit-gui.

Difficulty 3: observables with different scales/dimensions

Fitted observables might or might not be comparable on the same plot. For example, effective masses usually can be compared, but wave function overlaps often cannot; it may occasionally make sense to compare M_proton and M_Delta on a plot, but it would never make sense to plot M_proton and F_K/F_pi together.

Therefore must be able to specify which observables should be included in the same plot.

Proposal

I propose that meta_config should contain information about the x-variable(s) and plot(s). Let me list a few hang-ups that should be addressed eventually but which I'll ignore for now:

  1. different observables having different, incompatible x-variables;
  2. an x-variable not directly describing an array; and
  3. the chiral fits entirely (these are just too complicated for a first-pass -- besides everything mentioned above, the chiral fits also require shifting the x-data depending on the fit);

As an example for how we might implement the x and plotting limitations for a Multifitter model, let's consider the Linear model above and a baryon correlator model.

1. Linear model

For the linear model, the only thing that's from our fit object is the x-variable for our plots. We might write

meta_config:
   plots:
      fcn:
         xdata: x # x-variables of the MultifitterModel-derived model

We could imagine repeating this adding a fifth line to our simultaneous fits except with data out near x=1000. In this case, we probably would want to plot lines 1-4 together but line 5 separately. We might write

meta_config:
   plots:
      fcn:
         xdata: x
         separate: [['d1', 'd2', 'd3', 'd4'], ['d5']]

Here dN corresponds to the datatag for each MultifitterModel.

As an additional suggestion, it might be convenient to be able to manually group/ungroup observables in the web UI.

2. Correlator model

For reference, consider the example BaryonCorrelatorModel below.

# This class is needed to instantiate an object for lsqfit.MultiFitter
# Used for particles that obey fermi-dirac statistics
class BaryonCorrelatorModel(lsqfit.MultiFitterModel):
    def __init__(self, datatag, t, param_keys, n_states):
        super(BaryonCorrelatorModel, self).__init__(datatag)
        # variables for fit
        self.t = np.array(t)
        self.n_states = n_states
        self.sink = datatag.split('_')[-1]

        # keys (strings) used to find the wf_overlap and energy in a parameter dictionary
        self.param_keys = param_keys

    def fitfcn(self, p, t=None):
        if t is None:
            t = self.t
        t = np.array(t)
        
        wf = p[self.param_keys['wf']]
        E0 = p[self.param_keys['E0']]
        if self.n_states > 1:
            dE = np.exp(p[self.param_keys['log(dE)']])

        output = wf[0] * np.exp(-E0 * t)
        for j in range(1, self.n_states):
            E_j = E0 + np.sum([dE[k] for k in range(j)], axis=0)
            output = output + wf[j] * np.exp(-E_j * t)

        return output

    def buildprior(self, prior, mopt=None, extend=False):
        # Extract the model's prior from prior.
        return prior

    def builddata(self, data):
        # Extract the model's fit data from data.
        # Key of data must match model's datatag!
        return data['two_pt'][self.sink][self.t]

    def fcn_effective_mass(self, p, t=None):
        if t is None:
            t=self.t
        t = np.array(t)
        
        return np.log(self.fitfcn(p, t) / self.fitfcn(p, t+1))

    def fcn_effective_wf(self, p, t=None):
        if t is None:
            t=self.t
        t = np.array(t)
        
        return np.exp(self.fcn_effective_mass(p, t)*t) * self.fitfcn(p, t)

Assume I have two models named PS and SS corresponding to the two differently smeared sets of correlator data. Plotting fitfcn is unlikely to offer insight into the goodness-of-fit. Instead we'd like to consider the effective mass/wf plots.

Since fnc_effective_mass and fcn_effective_wf are defined in BaryonCorrelatorModel, we can access these functions through fit with fit.fitter_args_kargs[1]['models'][0].fnc_effective_mass etc. We'd like to plot the effective masses together but not the effective wf overlaps.

In this case, we might write

meta_config:
   plots:
      fcn:
         xdata: t
         separate: ['PS', 'SS']
      fcn_effective_mass:
         xdata: t
      fcn_effective_wf:
         xdata: t
         separate: ['PS', 'SS']
@ckoerber
Copy link
Owner

ckoerber commented Oct 8, 2021

Thanks, @millernb, for the suggestions and proposed solutions.

That is indeed a relevant but also complicated feature which would be useful to have. I feel that this requires changing significant part of the API, but it's also at an early enough point to start doing so. We should, at some point chat about the details and also chat with @walkloud about when things for the core fitter should be usable.

May I ask: what is the difference between using MultiFitterModel and returning multiple data values.

def fcn(x, p):
    ...
    return {"key1": ..., "key2": ...} 

Is it just "a cleaner API" or is there a more fundamental difference.

Before I going into some thoughts I have, I'd like to mention two guiding principles I'd like to follow:

  1. Keep things easy to use for users: i.e., if no fancy models are used, just run one function and it works
  2. If wrapping around external API (lsqfit), make sure to utilize "stable" API (i.e., only public methods and long time supported objects) to ensure low maintenance and longtime support.

Details thought

this is also to remind myself at a later point

This actually brings me to a more fundamental discussion I had with @walkloud (and someone else a year ago). In principle the fit.x variable can be completely unrelated to the fit.y variable. So using fit.x to construct fit.y makes strong assumptions and I am happy to move away from this, if there is a more consistent approach.

There is also some discrepancy when trying to conceptually discriminate dependent variable and meta arguments like, the number of exponents as an input for the fit function which I kind of hacked into the fits using the new meta attribute.

In other words, I really like the idea of tagging data, initializing fit functions, and just using fit.fcn(prior).
This goes along the notions of the MultiFitModel class--it seems to me that something along those lines has a nice API.

This obviously means that we have to modify the way the fit GUI is set up and configured (which you address in your proposal). I would certainly like to avoid to have too many if clauses and different APIs depending on specific scenarios. I.e., differences between

fit1 = nonlinear_fit(...)
fit2 = MultiFitterModel(...).lsqfit(...)

So I might actually favor, after some additional thoughts, following the MultiFitterModel and providing an API to cast a nonlinear_fit into a MultiFitterModel.

This may simplify the meta_config set up you propose (as in, the user has it in their python script if they care to set it up and if not, we automate it).

Regarding the customizations of plots: In the long run, I envision some custom "grab all the elements available in the content and drag and drop windows as you like them in the GUI." This certainly only works for distinguished figures so one needs some way to let the API know how to look them up. Your suggestions here looks reasonable.

@millernb
Copy link
Collaborator Author

millernb commented Oct 8, 2021

Okay, I was able to recreate the Linear example without using lsqfit.MultiFitter.

import lsqfit
import gvar as gv
import numpy as np

x = np.array([1,2,3,4])

y = dict(
    d1=['1.154(10)', '2.107(16)', '3.042(22)', '3.978(29)'],
    d2=['0.692(10)', '1.196(16)', '1.657(22)', '2.189(29)'],
    d3=['0.107(10)', '0.030(16)', '-0.027(22)', '-0.149(29)'],
    d4=['0.002(10)', '-0.197(16)', '-0.382(22)', '-0.627(29)'],

)
prior = gv.gvar(dict(a='0(1)', s1='0(1)', s2='0(1)', s3='0(1)', s4='0(1)'))

def fitfcn(x, p):
    output = {}
    for key in ['d1', 'd2', 'd3', 'd4']:
        slope = p['s%s'%(key[1:])]
        output[key] = p['a'] + slope *x
    return output

fit = lsqfit.nonlinear_fit(data=(x, y), fcn=fitfcn, prior=prior)
print(fit)

This output is identical to that of the MultiFitter example above.

@millernb
Copy link
Collaborator Author

millernb commented Oct 8, 2021

Here's a more practical extension of the Linear example where the x data differ.

import lsqfit
import gvar as gv
import numpy as np

x = dict(
    d1=np.array([1, 2, 3, 4]),
    d2=np.array([1, 2, 3, 4]),
    d3=np.array([1, 2, 3, 4]),
    d4=np.array([1, 2, 3, 4]),
    d5=np.array([5, 6, 7, 8])
)

y = dict(
    d1=['1.154(10)', '2.107(16)', '3.042(22)', '3.978(29)'],
    d2=['0.692(10)', '1.196(16)', '1.657(22)', '2.189(29)'],
    d3=['0.107(10)', '0.030(16)', '-0.027(22)', '-0.149(29)'],
    d4=['0.002(10)', '-0.197(16)', '-0.382(22)', '-0.627(29)'],
    d5=['1.869(10)', '2.198(16)', '2.502(22)', '2.791(29)']
)
prior = gv.gvar(dict(a='0(1)', s1='0(1)', s2='0(1)', s3='0(1)', s4='0(1)'), s5='0(1)')

def fitfcn(x, p):
    output = {}
    for k in x:
        slope = p['s%s'%(k[1:])]
        output[k] = p['a'] + slope *x[k]

    return output

fit = lsqfit.nonlinear_fit(data=(x, y), fcn=fitfcn, prior=prior)
print(fit)

@millernb
Copy link
Collaborator Author

millernb commented Oct 8, 2021

Interestingly, this actually will run if you wrap a run_server around the fit, but the fit plot looks like nonsense.

@ckoerber
Copy link
Owner

Just copied your script. Indeed the regular plots look weird, but the residuals look fine. I'll check what's going wrong.

ckoerber added a commit that referenced this issue Oct 11, 2021
@ckoerber
Copy link
Owner

Ok, this should be fixed now.

@ckoerber
Copy link
Owner

Is this infrastructure sufficient for now to be used for the fits you are doing (with some wrapping around)?

If so, I would probably like to shift the conversation about chained fits and MultiFitterModels to a later point.

@ckoerber ckoerber added the question Further information is requested label Oct 11, 2021
@millernb millernb added the Low priority An issue that can be tackled eventually, maybe label Oct 11, 2021
@millernb
Copy link
Collaborator Author

Yes, I think this is fine -- I'm convinced that MultiFitter isn't actually necessary; I'll just need to rewrite some of my code later. I've added a "Low priority" label to reflect the diminished importance, but feel free to close the issue instead.

A couple closing thoughts:

(1) Regarding chained fits, it's probably relevant to point out that the MultiFitter class also supports chained fits, eg

fitter = lsqfit.MultiFitter(models=models)
fit = fitter.chained_lsqfit(data=data, prior=prior)

But of course this can obviously be implemented without appealing to MultiFitter.

(2) I think, if we someday do want to support MultiFitter model, it would probably be best to just explicitly pass the models (and maybe also the xdata), eg run_server(fit=fit, models=models, xdata=xdata). It shouldn't be too tricky to infer whether xdata is an array or a dict of arrays. Likewise it shouldn't be too much of a challenge to determine whether those are arrays of gvars (and therefore belong in the prior) or arrays of floats.

@ckoerber
Copy link
Owner

Thanks, @millernb.

I agree with your second point and like to keep this issue open as a reminder (if more time or resources in form of volunteers are available :))

@ckoerber ckoerber linked a pull request Oct 28, 2021 that will close this issue
@ckoerber
Copy link
Owner

ckoerber commented Nov 4, 2021

With #32, I have now added a feature that the FitGUI can now be initialized with unchained_nonlinear_fit objects. Technically, it should also be possible with chained objects but I have not considered other possibilities and likely, there is not yet a full support for all features.

The key idea is that the unchained_nonlinear_fit is converted to a nonlinear_fit object at the initialization of the fit or any time generate_fit produces a new fit. This is done by lsqfit_from_multi_model_fit in lsqfitgui.util.models.

If we intend to extend this capability, this would be the right place.

@ckoerber ckoerber mentioned this issue Nov 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Low priority An issue that can be tackled eventually, maybe question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants