epiprocess::roll_iqr performs a rolling IQR calculation with the same window width as the median. In some cases, this fails to filter out many, or any, of the values that obviously look like outliers.
Imagine a user is trying to remove outliers from a 7-day rolling average signal where the raw data is not available. Because of the rolling average smoothing, single outliers have been turned into large peaks/troughs that are fairly wide (~7 days), which makes them harder to remove. Making n larger (e.g. 50) would allow those larger peaks to be removed. However, that also makes the calculated rolling median change very slowly over time, which may be unsatisfying in cases where the time series is more dynamic.
Consider including an option for changing the window size used to calculate the IQR independently of the window size used to calculate the median.
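To make the request concrete, here is a minimal sketch of what decoupled windows could look like. It is written in plain Python for illustration only and is not the epiprocess implementation: the function names, the `median_n` and `iqr_n` parameters, the centered windows, and the `threshold` multiplier are all assumptions.

```python
from statistics import median


def iqr(values):
    """Interquartile range with linear interpolation between order statistics."""
    s = sorted(values)

    def q(p):
        pos = p * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)

    return q(0.75) - q(0.25)


def centered_window(x, i, n):
    """Values within n // 2 positions of index i, truncated at the series ends."""
    half = n // 2
    return x[max(0, i - half): i + half + 1]


def flag_outliers(x, median_n, iqr_n, threshold=3.0):
    """Flag x[i] when it is more than threshold * IQR away from the rolling median.

    median_n and iqr_n are hypothetical, independent window widths: the median
    can use a short window to follow a dynamic series, while the IQR uses a
    longer one so that a wide bump does not inflate its own spread estimate.
    """
    flags = []
    for i, value in enumerate(x):
        center = median(centered_window(x, i, median_n))
        spread = iqr(centered_window(x, i, iqr_n))
        flags.append(abs(value - center) > threshold * spread)
    return flags
```

With this shape, a call such as `flag_outliers(signal, median_n=7, iqr_n=50)` would keep the median responsive while estimating spread over a longer stretch of the series.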
In some cases, a workaround could be to do the outlier removal before doing any smoothing. The outliers should stick out more in the raw data so they'd be easier to detect.
In Jeremy's use case, that didn't work for all states: some had an IQR of 0 because of low counts and/or unusual reporting, so outlier removal discarded almost everything. Is this an uncommon use case, or is outlier detection inappropriate here?
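A small illustration of the zero-IQR problem, using made-up counts rather than Jeremy's data and an assumed rule of "more than 3 IQRs from the median". When most reports are zero, both quartiles are zero, so any threshold based on the IQR collapses to zero and every value that differs from the median gets flagged.

```python
from statistics import median, quantiles

# Hypothetical low-count series: mostly zeros with a few small reports.
counts = [0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 3, 0, 0, 0]

q1, _, q3 = quantiles(counts, n=4, method="inclusive")
iqr = q3 - q1            # both quartiles are 0, so the IQR is 0
center = median(counts)  # also 0

# Assumed detection rule; with iqr == 0 the cutoff is 0.
flagged = [c for c in counts if abs(c - center) > 3 * iqr]
```

Here `flagged` ends up containing every nonzero report, which is the failure mode described above.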