Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor claims_hosp to use smoothing util #433

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

chinandrew
Copy link
Contributor

Closes #432

Summary of Changes:

  • Update claims_hosp to use new smoothing utils
  • Update test to smooth on test data

@chinandrew
Copy link
Contributor Author

blocked by #437

@krivard krivard added the blocked This task is waiting for completion of another task label Nov 5, 2020
@krivard krivard added this to In progress in Engineering via automation Nov 5, 2020
@krivard krivard removed the blocked This task is waiting for completion of another task label Nov 10, 2020
@krivard
Copy link
Contributor

krivard commented Nov 11, 2020

@chinandrew psst -- conflicts

@chinandrew
Copy link
Contributor Author

resolved

Copy link
Member

@mariajahja mariajahja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, the savgol smoother is not meant to be equivalent to the left gauss filter previously used. The results don't line up, but do match up to 0.1 tolerance. I ran using Smoother.left_gauss_linear_smoother from the smoother utility, and those results line up, so we're changing the smoothing method (and not just moving the function out).

Sorry for the delay, the PR looks good. @krivard are we going to announce this change? The documentation as well?

Engineering automation moved this from In progress to Reviewer approved Nov 30, 2020
@chinandrew
Copy link
Contributor Author

Just to make sure I understand, the savgol smoother is not meant to be equivalent to the left gauss filter previously used. The results don't line up, but do match up to 0.1 tolerance. I ran using Smoother.left_gauss_linear_smoother from the smoother utility, and those results line up, so we're changing the smoothing method (and not just moving the function out).

Sorry for the delay, the PR looks good. @krivard are we going to announce this change? The documentation as well?

My understanding is that they should be very close with some slight differences in parameters, but I haven't spent a whole lot of time digging into details. See this slack post.

@krivard
Copy link
Contributor

krivard commented Nov 30, 2020

This set of refactors is meant only to switch indicators to the smoother utility, and must not make any substantive, announcement-worthy changes to any signals.

@chinandrew please switch to the utility version of left gauss linear. We'll handle the conversion to savgol in a separate, statistically-rigorous effort.

@chinandrew
Copy link
Contributor Author

This set of refactors is meant only to switch indicators to the smoother utility, and must not make any substantive, announcement-worthy changes to any signals.

@chinandrew please switch to the utility version of left gauss linear. We'll handle the conversion to savgol in a separate, statistically-rigorous effort.

@dshemetov would know for sure, but I think savgol is a generalization of the gaussian linear regression (i.e. savgol with polyfit=1 == left gauss linear). It might be the window length that leads to the differences, which we could adjust if needed.

@dshemetov
Copy link
Contributor

For clarification, we have savgol polyfit=1 ~= left_gauss_linear. The small differences arise because savgol truncates the fitting window, while left_gauss_linear uses the entire past, but adds negligible weights to anything beyond the latest 2 weeks. Technically speaking, @mariajahja is right that the underlying signal does change by using savgol with polyfit=1. The difference is on the order of 0.1 for signals with a magnitude in the hundreds, however. That did not seem like an announcement-worthy change to the signal to me. I may be wrong, of course.

@mariajahja
Copy link
Member

I'm OK either way, but we'd need to change the documentation at the least if we go with savgol. The difference of 0.1 is on the percentage scale, so it isn't really a huge change, but because we do a second thresholding step after smoothing, there were a handful of counties and msas that were actually removed/added on some dates due to the change. I don't see this as an issue, but would be good to document.

@dshemetov
Copy link
Contributor

Good to know about the thresholding. Definitely something worth documenting. @krivard I'm good with whatever you think is best here.

@krivard
Copy link
Contributor

krivard commented Dec 2, 2020

Removes won't actually propagate into the API unless we manually trash all issues of that region-date-value, because we still don't have a missingness/deletion encoding system (it's on deck for December...). I'm not too concerned with the ethics of that (since they're all borderline cases anyhow) but if something comes up that requires code forensics we'll need to remember that the API output and the indicator output won't necessarily match for issues before the release of this change.

Probably safest to announce it, especially since it will include an update to the docs. That makes this process a bit more fiddly, especially since claims_hosp isn't hooked up to Automation yet. Here are two options:

  • Merge now (after bringing this branch up to date with main), but have @mariajahja hold off on deploying to her production environment until we can draft and review the changes to the docs and the announcement
  • Wait to merge until all the comms are ready to go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Engineering
Reviewer approved
Development

Successfully merging this pull request may close these issues.

claims_hosp - smoothing refactor
4 participants