Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be more defensive in earliest_issue and latest_issue, export them #282

Closed
capnrefsmmat opened this issue Nov 14, 2020 · 0 comments
Closed

Comments

@capnrefsmmat
Copy link
Contributor

These are not exported, but could be useful to users, particularly those working with multiple issue dates.

Before we do so, however, we should address Ryan's comments:

  • This will return a covidcast_signal data frame with scrambled rows. Would it be worth reordering? This doesn't affect any of our downstream uses like plotting or correlations, but may confuse a user if they call latest_issue() directly.

  • As far as I understand, latest_issue() doesn't check the class? So it could in principle be applied to a covidcast_signal_long data frame? And in this case it wouldn't return the right result, since you're masking by geo_value and time_value in the call to dplyr::distinct(). Thus I would either check the class in latest_issue() ... or include data_source and signal as masking variables.

For our current uses, neither of these is a problem, but we should fix them before exporting.

I'm tempted to say that reordering the rows is fine -- at least if we document the return order, so users aren't too surprised -- but the second point is definitely a footgun we should fix.

capnrefsmmat added a commit that referenced this issue Nov 15, 2020
Fixes #282 by ensuring the class is checked and by including data_source
and signal in dplyr::distinct, and by documenting that the output is
reordered.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant