Give user more flexibility in outlier detection by changing width of IQR window independently of median width? #420

Open
nmdefries opened this issue Feb 26, 2024 · 2 comments

Comments

@nmdefries
Contributor

epiprocess::roll_iqr performs a rolling IQR calculation with the same window width as the rolling median. In some cases, this fails to filter out many, or even any, of the values that obviously look like outliers.

Imagine a user is trying to remove outliers from a 7-day rolling average signal where the raw data is not available. Because of the rolling average smoothing, single outliers have been turned into large peaks/troughs that are fairly wide (~7 days), which makes them harder to remove. Making the window width n larger (e.g. 50) would allow those wider peaks to be removed. However, it also makes the rolling median change very slowly over time, which may be unsatisfying when the time series is more dynamic.
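To illustrate the widening effect, here is a minimal Python sketch (not epiprocess code). The helper `trailing_mean` and the toy series are assumptions made up for this example. It shows a single one-day spike turning into a 7-day plateau after 7-day trailing averaging.

```python
def trailing_mean(values, width=7):
    """Trailing rolling mean; early points average over the values available so far."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


# Toy data (hypothetical): a flat signal with one single-day outlier.
raw = [10.0] * 20
raw[10] = 80.0

smoothed = trailing_mean(raw, width=7)

# Days on which the smoothed signal sits above the flat baseline.
elevated_days = [i for i, v in enumerate(smoothed) if v > 10.0]
print(elevated_days)
```

The spike touches every trailing window that contains it, so the smoothed signal is raised for as many days as the averaging window is wide.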

Consider adding an option to set the window size used to calculate the IQR independently of the window size used to calculate the median.
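A rough sketch of what decoupled windows could look like, in plain Python rather than R. It is not the epiprocess implementation. The function `flag_outliers`, its parameters `median_width`, `iqr_width` and `multiplier`, the centered windows truncated at the series ends, and taking the IQR over residuals from the rolling median are all assumptions made for illustration.

```python
from statistics import median


def _quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def _centered_window(values, i, width):
    """Up to `width` values centered on index i (odd width assumed), truncated at the ends."""
    half = width // 2
    return values[max(0, i - half): i + half + 1]


def flag_outliers(values, median_width, iqr_width, multiplier=2.0):
    """Flag points far from a rolling median, using a separately sized IQR window."""
    n = len(values)
    fitted = [median(_centered_window(values, i, median_width)) for i in range(n)]
    resid = [v - f for v, f in zip(values, fitted)]
    flags = []
    for i in range(n):
        window = sorted(_centered_window(resid, i, iqr_width))
        iqr = _quantile(window, 0.75) - _quantile(window, 0.25)
        flags.append(abs(resid[i]) > multiplier * iqr)
    return flags


# Toy data (hypothetical): narrow median window, wider IQR window.
values = [10, 11, 10, 11, 10, 50, 10, 11, 10, 11, 10]
flags = flag_outliers(values, median_width=5, iqr_width=11)
print([i for i, f in enumerate(flags) if f])
```

With this split, the median window can stay short enough to follow a dynamic series while the IQR window is widened on its own.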

@nmdefries
Contributor Author

In some cases, a workaround could be to do the outlier removal before doing any smoothing. The outliers should stick out more in the raw data so they'd be easier to detect.

In Jeremy's use case, that didn't work for all states: some had an IQR of 0 because of low counts and/or unusual reporting, which made outlier removal discard almost everything. Is this an uncommon use case, or is outlier detection inappropriate here?
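The zero-IQR failure mode can be reproduced with a small Python sketch. It uses a whole-series median and IQR for brevity, not a rolling window, and it is not epiprocess code. The `floor` argument is a hypothetical guard added only to show one way the threshold could be kept from collapsing to zero.

```python
from statistics import median


def iqr(values):
    """Interquartile range using linearly interpolated quartiles."""
    s = sorted(values)

    def q(p):
        pos = p * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])

    return q(0.75) - q(0.25)


def flagged_indices(values, multiplier=2.0, floor=0.0):
    """Indices whose distance from the median exceeds max(multiplier * IQR, floor)."""
    center = median(values)
    threshold = max(multiplier * iqr(values), floor)  # `floor` is a hypothetical guard
    return [i for i, v in enumerate(values) if abs(v - center) > threshold]


# Toy low-count series (hypothetical): mostly zeros, so the IQR is 0.
counts = [0, 0, 0, 1, 0, 0, 0, 0, 0, 2, 0, 0]

no_floor = flagged_indices(counts)
with_floor = flagged_indices(counts, floor=3.0)
print(iqr(counts), no_floor, with_floor)
```

When the IQR is 0 the threshold is 0, so every value that differs from the median at all gets flagged. A minimum threshold avoids that, though whether it is appropriate depends on the data.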
