Class representing observed data #40

Open
ghost opened this issue Mar 21, 2024 · 3 comments
ghost commented Mar 21, 2024

Goal

The module lacks a standardized way to process/represent observed data. We should have a class (implemented as an independent module) for processing epi data to facilitate the implementation of the package.

Context

This becomes clear as soon as we start working with time series data.

Required features

  • A class that reads in a dataset.
  • Possibly store the data using either polars.DataFrame or a named tuple of jax.numpy.Array.
  • The getter functions should raise exceptions when the user tries to pull epi data that is not available.
  • The class should have a function (possibly a static method) to process the desired data.
  • Possibly implement a metaclass so each model can have its own way to address its data needs.
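
As a rough illustration of the first four features, here is a minimal sketch using only the standard library. All names (`EpiDataSketch`, `ObservedSeries`, `from_csv`, `process`, `get`) are hypothetical and not part of the package. Plain tuples stand in for the jax.numpy.Array fields mentioned above, and the getter is assumed to raise when the requested column is missing.

```python
import csv
import io
from typing import NamedTuple


class ObservedSeries(NamedTuple):
    """Hypothetical container; tuples stand in for jax.numpy arrays."""

    dates: tuple[str, ...]
    values: tuple[float, ...]


class EpiDataSketch:
    """Hypothetical reader holding the rows of one dataset."""

    def __init__(self, rows: list[dict[str, str]]):
        self._rows = rows

    @classmethod
    def from_csv(cls, text: str) -> "EpiDataSketch":
        # Read CSV text into a list of {column: raw string} rows.
        return cls(list(csv.DictReader(io.StringIO(text))))

    @staticmethod
    def process(raw: list[str]) -> tuple[float, ...]:
        # Convert raw cells to floats; empty or missing cells become NaN.
        return tuple(float(x) if x and x.strip() else float("nan") for x in raw)

    def get(self, column: str, date_column: str = "date") -> ObservedSeries:
        # Assumption: pulling data that is not in the dataset is an error.
        if not self._rows:
            raise KeyError("dataset is empty")
        for name in (date_column, column):
            if name not in self._rows[0]:
                raise KeyError(f"column {name!r} not found in dataset")
        dates = tuple(row[date_column] for row in self._rows)
        values = self.process([row[column] for row in self._rows])
        return ObservedSeries(dates=dates, values=values)
```

With this sketch, `EpiDataSketch.from_csv(text).get("admissions")` returns an `ObservedSeries`, and asking for a column the file does not contain raises `KeyError`.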

Specifications

TBD

Out of scope

  • None noted

Related documents

  • ADR on CDCent side (link)

ghost commented Mar 21, 2024

Cross-posted from Teams; I see three possible routes:

  • Have no class at all; leave data processing to the user.
  • Have a single generic class flexible enough to handle all cases.
  • Have a metaclass called EpiData that can be used as a baseline for each model (so models can include their own data needs), e.g., HospitalizationsData(EpiData), WastewaterData(EpiData), etc.

I like the third option, as we can always have a default implementation of the metaclass that models can use if there's no need for a specialized implementation.
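
To make the third option concrete, here is a minimal sketch. It reads "metaclass" as an abstract base class built on Python's `abc.ABC`, which matches the `EpiData(ABC)` spelling used later in this thread. The class names `EpiData`, `HospitalizationsData`, and `WastewaterData` come from the list above; `GenericEpiData`, `required_fields`, `validate`, `series`, and all field names are hypothetical.

```python
from abc import ABC, abstractmethod


class EpiData(ABC):
    """Baseline class; each model subclasses it to declare its data needs."""

    def __init__(self, records: list[dict]):
        self.records = records
        self.validate()

    @property
    @abstractmethod
    def required_fields(self) -> tuple[str, ...]:
        """Field names every record must contain (hypothetical contract)."""

    def validate(self) -> None:
        # Default check shared by all subclasses: required fields are present.
        for i, rec in enumerate(self.records):
            missing = [f for f in self.required_fields if f not in rec]
            if missing:
                raise ValueError(f"record {i} is missing fields: {missing}")

    def series(self, field: str) -> list:
        # Getter that raises if the field is not part of this data type.
        if field not in self.required_fields:
            raise KeyError(f"{type(self).__name__} has no field {field!r}")
        return [rec[field] for rec in self.records]


class GenericEpiData(EpiData):
    """Default implementation for models with no specialized needs."""

    required_fields = ("date", "value")


class HospitalizationsData(EpiData):
    required_fields = ("date", "admissions")


class WastewaterData(EpiData):
    required_fields = ("date", "site", "concentration")

    def validate(self) -> None:
        # Extend the default check with a model-specific rule.
        super().validate()
        if any(rec["concentration"] < 0 for rec in self.records):
            raise ValueError("concentration must be non-negative")
```

In this sketch, `EpiData` itself cannot be instantiated, while a model without special requirements can fall back on `GenericEpiData`.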

@dylanhmorris
Collaborator

I also like the third option. It's the approach I am taking in Pyter and so far it's working well there.

@AFg6K7h4fhy2
Collaborator

Asked Dylan this but thought to include here as well:

I've been operating on the notion that EpiData(ABC) should be able to support sample objects (e.g., HospModelSample). I would stop pursuing that path, though, if EpiData(ABC) is to be designed exclusively for artificial / observed data, such as simulated latent infections / admissions, WW data, or variants of NHSN/NSSP-like dataframes, which is what I've worked towards. Confirmation here would be helpful.
