
Remove or Improve background subtraction: Currently introducing a bias? #119

Open
fzeiser opened this issue Mar 30, 2020 · 3 comments
Labels
Suggestion: Suggestion for new feature/changes
Comments

@fzeiser
Collaborator

fzeiser commented Mar 30, 2020

This is a suggestion by Anders, as an alternative to the "remove negatives" step (see #116) that we currently perform on the background-subtracted spectra:

Statistics, last paragraph in Section 4: I think we're introducing a slight bias by leaving out the negative-count bins in the background-subtracted spectra. That is, in our simulated spectra we accept statistical fluctuations in one direction (surprisingly low background count and/or surprisingly large total count), but we exclude fluctuations in the opposite direction (high background count and/or low total count).

Would anything break in the math/code if we actually just included the negative-count bins in the fit? To be clear, I don't expect the impact to be large (perhaps not even noticeable), so if it's technically challenging we may want to leave it as is.

Alternatively, I guess we could sample the total count (tot_i) first, and then sample the background count (bkg_i) repeatedly until we get a sample that satisfies bkg_i < tot_i -- so effectively sample the background count from a conditional distribution p(bkg_i | lambda_bkg, bkg_i < tot_i).
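A minimal sketch of that conditional sampling via rejection sampling with NumPy (the function name `sample_bkg_conditional` and the rates are purely illustrative, not part of the codebase):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_bkg_conditional(lambda_bkg, tot, rng, max_tries=10_000):
    # Draw bkg ~ Poisson(lambda_bkg) and keep only draws with bkg < tot,
    # i.e. rejection sampling from p(bkg | lambda_bkg, bkg < tot).
    for _ in range(max_tries):
        bkg = rng.poisson(lambda_bkg)
        if bkg < tot:
            return int(bkg)
    raise RuntimeError("no accepted sample; acceptance region too unlikely")

tot = int(rng.poisson(12.0))          # sample the total count first
bkg = sample_bkg_conditional(8.0, tot, rng)
```

Rejection sampling is cheap here as long as P(bkg < tot) is not tiny; for bins where the background rate approaches the total rate, the acceptance probability drops and the `max_tries` guard matters.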
[I think we're encountering a classic statistics issue here: if the true value of some quantity X is close to zero, X < 0 is unphysical, and your individual estimates of X have a significant statistical uncertainty, you should expect some of your X estimates to get a central value in the X < 0 region. If you force each individual estimate to be X >= 0 (e.g. by leaving out the X < 0 estimates) and later combine your X estimates, your combined estimator will be biased towards high X values.]
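The truncation bias described in the bracketed note can be seen in a quick simulation (rates and sample size chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
lam_sig, lam_bkg, n = 1.0, 5.0, 100_000

tot = rng.poisson(lam_sig + lam_bkg, n)   # total counts
bkg = rng.poisson(lam_bkg, n)             # background counts
net = tot - bkg                           # background-subtracted; can be < 0

mean_all = net.mean()                     # keeps negatives: unbiased for lam_sig
mean_truncated = net[net >= 0].mean()     # drops negatives: biased high
```

Averaging over all bins recovers the true signal rate, while discarding the negative estimates before averaging shifts the combined estimate upward, exactly as argued above.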

@fzeiser added the Suggestion label Mar 30, 2020
@fzeiser added this to the Version 2.0 milestone Mar 30, 2020
@fzeiser
Collaborator Author

fzeiser commented Mar 30, 2020

Somewhat along the same lines are #28 and the following comment:

  1. Question about the chi^2 in Section 5: We say that "[...] most bins of the first-generation matrices follow a normal distribution". I assume it's the low-count bins that deviate most strongly from a normal distribution? I wonder if this might improve a bit if we include the negative-count bins in the fit (point 5 above)?
    [For the future: it could be interesting to try to replace the chi^2 with a log-likelihood function that also tries to account for the deviations from normal distributions.]

@fzeiser
Collaborator Author

fzeiser commented Sep 8, 2020

In line with the comments by the referee, we might just as well not cut away the negative counts by default. I'm now working on a branch to implement this.

If one still wishes to run a background subtraction in the Ensemble class, one could for example use the action_raw, action_unfolded and action_firstgen attributes to apply it to the corresponding matrices.
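For illustration only, the cut such an action would apply could look like the following standalone helper (`remove_negatives` is a hypothetical name, and no claim is made about the Ensemble/Action API itself):

```python
import numpy as np

def remove_negatives(matrix):
    # Zero out negative entries left over after background subtraction.
    # This is the cut that is no longer applied by default.
    return np.where(matrix < 0, 0.0, matrix)

m = np.array([[1.0, -2.0],
              [-0.5, 3.0]])
cleaned = remove_negatives(m)
```

Applied per matrix (raw, unfolded, first-generation), this reproduces the old default behaviour on demand rather than silently.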

fzeiser pushed a commit that referenced this issue Sep 8, 2020
Keep negative entries in Ensemble, Unfolder and Fristgen by default.
@fzeiser changed the title from "Improved background subtraction" to "Remove or Improve background subtraction: Currently introducing a bias?" Sep 9, 2020
@fzeiser
Collaborator Author

fzeiser commented Sep 9, 2020

See also #148 (comment) on another idea of how to avoid the bias.
