Faster R package vignettes #266
Commits on Nov 11, 2020
Convert API calls to request CSV format for data, instead of JSON
The CSV format is much more compact (does not repeat field names for every row), and more naturally fits with R anyway. Alter the relevant tests to serve CSVs. I've verified all vignettes build with these changes.
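To illustrate the size argument, here is a minimal base-R sketch that serializes the same rows both ways. The data frame and its column names are made up for illustration and are not taken from the API; the JSON is assembled by hand as an array of records, which is an assumption about the response shape.

```r
# Hypothetical rows; the column names are illustrative only.
rows <- data.frame(
  geo_value  = c("pa", "tx", "ca"),
  time_value = c(20201101L, 20201101L, 20201101L),
  value      = c(1.5, 2.0, 3.25),
  stringsAsFactors = FALSE
)

# CSV: field names appear once, in the header line.
csv_text <- paste(
  utils::capture.output(
    utils::write.csv(rows, row.names = FALSE, quote = FALSE)
  ),
  collapse = "\n"
)

# JSON as an array of records: field names repeat in every row.
json_records <- sprintf(
  '{"geo_value":"%s","time_value":%d,"value":%s}',
  rows$geo_value, rows$time_value, as.character(rows$value)
)
json_text <- paste0("[", paste(json_records, collapse = ","), "]")

csv_bytes  <- nchar(csv_text, type = "bytes")
json_bytes <- nchar(json_text, type = "bytes")

# Reading the CSV back gives a data frame directly, with no reshaping step.
parsed <- utils::read.csv(text = csv_text, stringsAsFactors = FALSE)
```

The gap between `csv_bytes` and `json_bytes` grows with the number of rows, since the JSON form pays for the field names on every record.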
Commit 91137da

Commit bcf0191

Correct error in metadata test
It should not be possible to have two signals with the same source, signal, time_type, and geo_type. This will cause a query for that signal to have two metadata rows attached to the covidcast_signal data frame, which will confuse everything.
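The uniqueness constraint described here can be checked with a few lines of base R. This is only a sketch: the metadata frame is invented, and the column names (in particular `data_source`) and the helper `has_duplicate_keys` are assumptions for illustration, not the package's test code.

```r
# Hypothetical metadata; rows 1 and 2 share the same key, which is the error
# condition described in the commit message.
meta <- data.frame(
  data_source = c("foo", "foo", "foo"),
  signal      = c("bar", "bar", "baz"),
  time_type   = c("day", "day", "day"),
  geo_type    = c("county", "county", "county"),
  stringsAsFactors = FALSE
)

# TRUE if any (source, signal, time_type, geo_type) combination repeats.
has_duplicate_keys <- function(
    meta,
    key_cols = c("data_source", "signal", "time_type", "geo_type")) {
  any(duplicated(meta[key_cols]))
}

bad  <- has_duplicate_keys(meta)        # duplicated key present
good <- has_duplicate_keys(meta[-2, ])  # duplicate row dropped
```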
Commit 60f6cc7

Add an additional test of covidcast_signal
Fetching multiple days is important.
Commit b10569f

Commits on Nov 12, 2020
Commit 677f4cd

Commits on Nov 13, 2020
Commit 37e8a07

Use dplyr::distinct in {latest,earliest}_issue
Profiling revealed that latest_issue was responsible for a large portion of the time taken in building correlation-utils.Rmd (apart from downloading the data). Much of this time was spent in dplyr::filter.

Rather than grouping by geography and time, we can use dplyr::distinct, knowing that each geo_value and time_value should appear only once per issue date. By taking the first or last (after sorting by issue date), we get the desired result.

dplyr does not document algorithmic details, so I can't easily give O(n) notation here. Algorithmic details notwithstanding, the results are extraordinary:

    > nrow(d)
    [1] 203360
    > system.time(latest_issue_old(d))
       user  system elapsed
      6.395   0.037   6.465
    > system.time(latest_issue(d))
       user  system elapsed
      0.025   0.003   0.027
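The two approaches can be sketched side by side on toy data. This is a reconstruction from the description above, not the code in the commit: the function names `latest_issue_grouped` and `latest_issue_distinct` are hypothetical, and the grouped version is only a guess at what the slower pattern looked like. It relies on the documented behavior of `dplyr::distinct()` with `.keep_all = TRUE`, which keeps the first row for each combination of the named columns.

```r
# Toy data: two issues for the same location and day, one issue for another.
d <- data.frame(
  geo_value  = c("pa", "pa", "tx"),
  time_value = as.Date(c("2020-11-01", "2020-11-01", "2020-11-01")),
  issue      = as.Date(c("2020-11-02", "2020-11-05", "2020-11-03")),
  value      = c(1.0, 1.5, 2.0),
  stringsAsFactors = FALSE
)

# Grouped pattern (assumed shape of the slow version): filter within each
# (geo_value, time_value) group for the maximum issue date.
latest_issue_grouped <- function(df) {
  df <- dplyr::group_by(df, geo_value, time_value)
  df <- dplyr::filter(df, issue == max(issue))
  dplyr::ungroup(df)
}

# distinct-based pattern: sort newest issue first, then keep the first row
# seen for each (geo_value, time_value) pair.
latest_issue_distinct <- function(df) {
  df <- dplyr::arrange(df, dplyr::desc(issue))
  dplyr::distinct(df, geo_value, time_value, .keep_all = TRUE)
}

res_grouped  <- latest_issue_grouped(d)
res_distinct <- latest_issue_distinct(d)
```

Sorting ascending instead of descending before calling `dplyr::distinct()` would give the earliest issue by the same logic. Row order may differ between the two results, so compare them after sorting.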
Commit 0c93efa

Do our correlation analysis at the state, not county, level
Fetching the county data took a large portion of the time required to build the vignette, particularly after the fixes to latest_issue in b0f7e7b.
Commit 7f7fd89

Commit bf86746

Fix source links in pkgdown documentation
By providing the `repo` block with a link pointing to R-packages/covidcast/, pkgdown can build the correct URLs.
Commit 4e8b0ef

Switch all vignettes to use httptest to record API requests
Also modify several vignettes to download only the necessary amount of data, thus reducing the file size of these CSV files.
Commit 0ebe6ed

Commit 30a5c2f