KDE shapes over MultiSpaces are crashing #430

Open
schmitse opened this issue Nov 22, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@schmitse
Contributor

Creating a kernel density estimate (KDE) from a dataset over a MultiSpace currently fails.

Current Behaviour

When creating a KDE over a MultiSpace, the constructor raises an error because a MultiSpace does not implement the limits attribute, which is required for padding the data:

        data, size, weights = self._convert_init_data_weights_size(
            data, weights, padding=padding, limits=obs.limits
        )

I attached a minimal (non-)working example of the issue:

import zfit
import numpy as np
obs1 = zfit.Space('obs', limits=(0., 0.25))
obs2 = zfit.Space('obs', limits=(0.75, 1.))
obs = obs1 + obs2

gen = np.random.default_rng(seed=1337)
data = gen.exponential(1, size=(1000,))
data_zfit = zfit.Data.from_numpy(obs=obs, array=data)

kde = zfit.pdf.KDE1DimExact(data=data_zfit, obs=obs, name='test')

When running this you get the following error:

Traceback (most recent call last):
  File "/afs/cern.ch/work/s/schmitse/ewp-rkstz/analysis/python/schmitse/mva_plots/python/MWE.py", line 14, in <module>
    kde = zfit.pdf.KDE1DimExact(data=data_zfit, obs=obs, name='test')
  File "/afs/cern.ch/work/s/schmitse/ewp-rkstz/analysis/python/schmitse/zfit/zfit/lib/python3.9/site-packages/zfit/models/kde.py", line 911, in __init__
    data, weights, padding=padding, limits=obs.limits
  File "/afs/cern.ch/work/s/schmitse/ewp-rkstz/analysis/python/schmitse/zfit/zfit/lib/python3.9/site-packages/zfit/core/space.py", line 133, in wrapped_func
    return func(*args, **kwargs)
  File "/afs/cern.ch/work/s/schmitse/ewp-rkstz/analysis/python/schmitse/zfit/zfit/lib/python3.9/site-packages/zfit/core/space.py", line 2727, in limits
    self._raise_limits_not_implemented()
  File "/afs/cern.ch/work/s/schmitse/ewp-rkstz/analysis/python/schmitse/zfit/zfit/lib/python3.9/site-packages/zfit/core/space.py", line 3069, in _raise_limits_not_implemented
    raise MultipleLimitsNotImplemented(
zfit.util.exception.MultipleLimitsNotImplemented: Limits/lower/upper not implemented for MultiSpace. This error is either caught automatically as part of the codes logic or the MultiLimit case should be considered. To do that, simply iterate through the MultiSpace, which returns a simple space. Iterating through a Spaces also worksfor simple spaces.

Context (Environment)

  • zfit version: 0.10.1
  • Python version: 3.9.13
  • Are you using conda, pipenv, etc? : conda
  • Operating System: centos7
  • Tensorflow version: 2.9.1

Possible Solution/Implementation

KDEs over MultiSpaces should either raise a NotImplementedError or the padding of the input dataset should be reflected around the edges of the subspaces of the MultiSpace.
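The second option, reflecting the padding around the edges of each subspace, could look roughly like the sketch below. It is plain NumPy and not zfit's implementation: the function name `mirror_pad_per_subspace` is hypothetical, `padding` is assumed to be a fraction of each subspace's width, and weights are ignored for brevity.

```python
import numpy as np


def mirror_pad_per_subspace(data, sub_limits, padding=0.1):
    """Mirror the points close to each subspace edge to the outside of that edge.

    Hypothetical helper, not part of zfit. `padding` is taken to be the
    fraction of each subspace's width that gets mirrored.
    """
    data = np.asarray(data, dtype=float)
    pieces = []
    for lower, upper in sub_limits:
        # Keep only the points that fall into this subspace.
        inside = data[(data >= lower) & (data <= upper)]
        width = padding * (upper - lower)
        # Reflect the points within `width` of an edge about that edge.
        left = 2 * lower - inside[inside < lower + width]
        right = 2 * upper - inside[inside > upper - width]
        pieces.extend([left, inside, right])
    return np.concatenate(pieces)


sub_limits = [(0.0, 0.25), (0.75, 1.0)]
rng = np.random.default_rng(seed=1337)
data = rng.exponential(1, size=1000)
padded = mirror_pad_per_subspace(data, sub_limits, padding=0.1)
```

Each subspace is padded independently, so the gap between 0.25 and 0.75 only receives mirrored points within the padding width of each inner edge.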

A quick fix to make the KDEs work again would be to pass the observable space limits only if the Space has the limits attribute:

        # A MultiSpace has the attribute `spaces`; in that case, don't pass limits.
        _limits = None if hasattr(obs, 'spaces') else obs.limits
        data, size, weights = self._convert_init_data_weights_size(
            data, weights, padding=padding, limits=_limits
        )
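As an alternative to the `hasattr` check, the error message itself suggests iterating over the space, which is said to work for simple spaces as well. The sketch below shows that pattern with two hypothetical stand-in classes (`SimpleSpace` and `MultiSpaceLike`); they only mimic the behaviour described in the traceback and are not zfit classes, so the real attribute types may differ.

```python
class SimpleSpace:
    """Hypothetical stand-in for a space with a single pair of limits."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper

    def __iter__(self):
        # Iterating over a simple space yields the space itself.
        yield self


class MultiSpaceLike:
    """Hypothetical stand-in for a space composed of several simple spaces."""

    def __init__(self, spaces):
        self.spaces = tuple(spaces)

    def __iter__(self):
        # Iterating yields the simple subspaces one by one.
        return iter(self.spaces)

    @property
    def limits(self):
        # Mimics the behaviour reported in the traceback.
        raise NotImplementedError("limits not implemented for multiple spaces")


def limits_per_subspace(obs):
    """Collect (lower, upper) for each subspace without touching obs.limits."""
    return [(space.lower, space.upper) for space in obs]


multi = MultiSpaceLike([SimpleSpace(0.0, 0.25), SimpleSpace(0.75, 1.0)])
print(limits_per_subspace(multi))
print(limits_per_subspace(SimpleSpace(0.0, 1.0)))
```

The same function handles both cases, so the calling code needs no special branch for the multi-limit case.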
schmitse added the bug label on Nov 22, 2022
@jonas-eschle
Contributor

jonas-eschle commented Apr 5, 2024

Hey @schmitse, coming back to this: the newest developments try to get rid of the (overcomplicated) MultiSpace in zfit; a "TruncatedPDF" can be used instead. I think the above case is best solved this way:

import zfit
import numpy as np


obs1 = zfit.Space('obs', limits=(0., 0.25))
obs2 = zfit.Space('obs', limits=(0.75, 1.))
obs = obs1 + obs2
obsall = zfit.Space('obs', 0, 1)  # full range covering both sub-ranges
gen = np.random.default_rng(seed=1337)
data = gen.exponential(1, size=(1000,))
data_zfit = zfit.Data.from_numpy(obs=obsall, array=data)

kde = zfit.pdf.KDE1DimExact(data=data_zfit, obs=obsall, name='test')
kdetrunc = kde.to_truncated([obs1, obs2])

# plot
import matplotlib.pyplot as plt
x = np.linspace(0, 1, num=273)
plt.plot(x, kdetrunc.pdf(x),  label='trunc')
plt.plot(x, kde.pdf(x), "--", label='kde',)
plt.legend()
plt.show()

What do you think?
(Yes, the normalization changes, but this should not matter for the pdf, as we don't care about the absolute norm; for extended pdfs, this will be adjusted accordingly by default.)
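To illustrate the remark about the normalization, here is a small NumPy-only sketch. It uses an exponential density normalized on [0, 1] as a stand-in for the KDE and is not how zfit implements truncation; `pdf_full`, `pdf_trunc` and `trapezoid` are names made up for this example. Inside the kept regions, the truncated pdf is the full pdf divided by the integral of the full pdf over those regions.

```python
import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def pdf_full(x):
    # Stand-in for the KDE: exp(-x), normalized to 1 on [0, 1].
    return np.exp(-x) / (1.0 - np.exp(-1.0))


regions = [(0.0, 0.25), (0.75, 1.0)]

# Integral of the full pdf over the kept regions only.
kept = 0.0
for lo, hi in regions:
    grid = np.linspace(lo, hi, 2001)
    kept += trapezoid(pdf_full(grid), grid)


def pdf_trunc(x):
    """Full pdf rescaled by 1 / kept inside the regions, zero outside."""
    x = np.asarray(x, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    for lo, hi in regions:
        inside = inside | ((x >= lo) & (x <= hi))
    return np.where(inside, pdf_full(x) / kept, 0.0)


# Check that the truncated pdf integrates to 1 over the kept regions.
total = 0.0
for lo, hi in regions:
    grid = np.linspace(lo, hi, 2001)
    total += trapezoid(pdf_trunc(grid), grid)
print(kept, total)
```

The shape inside each region is unchanged; only the overall scale differs by the constant factor `1 / kept`.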
