Kevin Rose edited this page Apr 6, 2016 · 15 revisions
| Variable name | Description | Justification | How to fill missing data |
| --- | --- | --- | --- |
| var_name | DESCRIPTION | THOUGHTS ON WHY IT MIGHT BE USEFUL | HOW SHOULD MISSING DATA BE FILLED IN? |
| beach_direction | North beach or south beach? Manual clustering of the beach variable. | Navy Pier appears to be a useful geographic split for clustering beaches with similar rates of high E. coli levels. | NA |
| beach_cluster | Like beach_direction, but with more levels. Can be determined by inspection or through a clustering technique. | Allows a model to fine-tune estimates within clusters. | NA |
| change_in_previous_reading_Dn | The prior day's reading minus the reading n days before it (for n = 1, 2, 3). | If bird poop is accumulating, a large increase in one reading may anti-correlate with future readings. | TODO |
| days_since_last_substantial_rain | Number of days since the last time the beach was cleaned (i.e., since the last substantial rain). | Again on the topic of accumulating bird poop. | TODO |
| inches_rainfall_past_x | Number of inches of rainfall in the past x days (for x = .5, 1, 2, 3, 4, 7). | Same as above. | TODO |
| historical daily weather summaries | ... | ... | ... |
| hourly weather from recent history (5am day before through 4am day of?) | ... | ... | ... |
| Cross-beach info (e.g. the E. coli readings of each of the groups, perhaps?) | ... | ... | ... |
| sunrise time | ... | ... | ... |
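
As a rough illustration of how the three derived features above (change_in_previous_reading_Dn, days_since_last_substantial_rain, inches_rainfall_past_x) might be computed, here is a minimal Python sketch. The function names mirror the variable names in the table, but everything else is an assumption rather than something decided on this page: the input layout (plain dicts keyed by date), the 0.5 inch default for what counts as "substantial" rain, and returning None whenever a needed value is missing (which leaves the "how to fill missing data" question open). It handles whole-day windows only; x = .5 would need hourly data.

```python
from datetime import date, timedelta


def change_in_previous_reading(readings, day, n):
    """Prior day's reading minus the reading n days before the prior day.

    `readings` is a hypothetical dict mapping date -> E. coli reading
    for one beach. Returns None if either reading is missing.
    """
    prior = day - timedelta(days=1)
    earlier = prior - timedelta(days=n)
    if prior not in readings or earlier not in readings:
        return None
    return readings[prior] - readings[earlier]


def days_since_last_substantial_rain(rain_by_day, day, threshold_inches=0.5):
    """Days between `day` and the most recent earlier day with heavy rain.

    `threshold_inches` is an assumed cutoff for "substantial".
    Returns None if no earlier day on record meets the threshold.
    """
    rainy_days = [d for d, inches in rain_by_day.items()
                  if d < day and inches >= threshold_inches]
    if not rainy_days:
        return None
    return (day - max(rainy_days)).days


def inches_rainfall_past_x(rain_by_day, day, x):
    """Total rainfall over the x whole days before `day` (x is an int).

    Returns None if any of those days has no rainfall record.
    """
    total = 0.0
    for k in range(1, x + 1):
        d = day - timedelta(days=k)
        if d not in rain_by_day:
            return None
        total += rain_by_day[d]
    return total


# Made-up example values, for illustration only.
readings = {date(2015, 7, 1): 100, date(2015, 7, 2): 150, date(2015, 7, 3): 400}
rain = {date(2015, 7, 1): 0.8, date(2015, 7, 2): 0.0, date(2015, 7, 3): 0.1}
today = date(2015, 7, 4)

d1 = change_in_previous_reading(readings, today, 1)
since_rain = days_since_last_substantial_rain(rain, today)
rain_2d = inches_rainfall_past_x(rain, today, 2)
```

Returning None instead of a default keeps the missing-data decision separate from the feature definition, so whichever fill strategy replaces the TODO entries can be applied afterwards.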

2015 and Onwards

Looking ahead to more variables that are available only from 2015 onwards:

| Variable name | Description | Justification | How to fill missing data |
| --- | --- | --- | --- |
| water sensor data (turbidity, etc.) | ... | ... | ... |
| USGS model prediction | ... | ... | ... |