This repository has been archived by the owner on Mar 27, 2023. It is now read-only.

write up design principles for hoad #249

Open
maxheld83 opened this issue Jul 15, 2020 · 1 comment

Comments


maxheld83 commented Jul 15, 2020

It just occurred to me during the call with @kjgarza that it might be a good idea to write down the draft design principles for hoad that we've been talking about.

There are three levels of user/target segmentation, which correspond to three levels of our code.

  1. Distributed in-memory database.
    This database should be as generic as possible, in the extreme case just duplicating the Crossref coverage, but with much better performance and support for arbitrary SQL/dplyr queries.
    • Target: Analysts (us).
    • Code:
      • setup of the database (currently Google BigQuery, maybe Azure Synapse)
      • batch jobs to seed the db with dumps and incremental updates
      • example queries
  2. Domain-specific APIs
    Opinionated queries against level 1 that yield domain-specific data objects small enough to fit into a laptop's memory.
    The result is a set of (multiple!) tidy data frames that make sense for hybrid open access uptake analysis, i.e. that make it possible to run the plots/analyses in level 3.
    • Target: R users interested in hybrid OA.
    • Code:
      • dplyr/SQL queries against level 1
      • additional on-client data wrangling
      • assertions and tests
  3. Dashboard
    Views on the data from level 2 that answer our business questions.
    • Target: HOAD project stakeholders
    • Code:
      • plots (those are also part of the package proper)
      • dashboard (maybe modules are also part of the package)
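
The three levels above could be sketched roughly as follows. This is only an illustration under loud assumptions: it is written in Python rather than R/dplyr, an in-memory `sqlite3` database stands in for the real warehouse (BigQuery in this issue), and the `works` table, its columns, the toy rows, and all function names are hypothetical, not hoad's actual schema or API.

```python
import sqlite3

# Level 1: generic store. sqlite3 in memory is a stand-in for the real
# warehouse; the "works" table and its columns are made up for this sketch.
def seed_database(rows):
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE works ("
        "doi TEXT PRIMARY KEY, journal TEXT, year INTEGER, license TEXT)"
    )
    con.executemany("INSERT INTO works VALUES (?, ?, ?, ?)", rows)
    con.commit()
    return con

# Level 2: an opinionated query against level 1 that returns a small,
# tidy table (one row per journal and year), plus a basic assertion.
def oa_share_by_journal_year(con):
    cur = con.execute(
        """
        SELECT journal, year,
               COUNT(*) AS n_articles,
               SUM(CASE WHEN license LIKE '%creativecommons.org%'
                        THEN 1 ELSE 0 END) AS n_oa
        FROM works
        GROUP BY journal, year
        ORDER BY journal, year
        """
    )
    columns = ("journal", "year", "n_articles", "n_oa")
    tidy = [dict(zip(columns, record)) for record in cur]
    for row in tidy:
        # Open access articles can never outnumber all articles.
        assert 0 <= row["n_oa"] <= row["n_articles"]
    return tidy

# Level 3: a view on the level 2 data (plain text instead of a dashboard).
def render_view(tidy):
    return [
        f'{r["journal"]} {r["year"]}: {r["n_oa"]}/{r["n_articles"]} open access'
        for r in tidy
    ]

# Invented toy records, not real Crossref data.
TOY_ROWS = [
    ("10.0000/a1", "Journal A", 2019, "https://creativecommons.org/licenses/by/4.0/"),
    ("10.0000/a2", "Journal A", 2019, None),
    ("10.0000/b1", "Journal B", 2020, "https://creativecommons.org/licenses/by/4.0/"),
]

con = seed_database(TOY_ROWS)
tidy = oa_share_by_journal_year(con)
view = render_view(tidy)
```

The point of the split is that each level only depends on the one below it: the view never touches SQL, and the level 2 query could be pointed at a different backend without changing the view.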

maxheld83 commented

This is just quickly jotted down; it should live in the repo somewhere.
