Define API for Release 1.0 (and introduction to issues) #18

Open
jia200x opened this issue Apr 4, 2020 · 2 comments

@jia200x
Contributor

jia200x commented Apr 4, 2020

This issue serves as an introduction to GH issues and an entry point to discuss the API of the model.

The model class

Since #17 was merged, the model class (and its children) now has the following public API:

Model.set_params(self, p, initial_conds):
        """ Set model parameters.
        input:
        p: parameters of the model. The parameters units are 1/day.
        initial_conds: Initial conditions, in total number of individuals.
        For instance, S0 = n_S0/population, where n_S0 is the number of subjects
        who are susceptible to the disease.
        """

Model.export(self, f, delimiter=","):
        """ Export the output of the model in CSV format
        Calling this before solve() raises an exception.
        input:
        f: file name or descriptor
        delimiter: delimiter of the CSV file
        """

Model.fetch(self):
        """ Fetch the data from the model.
        The first row is the time in days
        """

Model.solve(self, tf_days=DAYS, numpoints=NUMPOINTS):
        """ Solve using the child class model.
        input:
        tf_days: number of days to simulate
        numpoints: number of points for the simulation.
        output:
        Reference to self
        """

Model.r0
        """ Returns reproduction number
        r0 = alpha/beta"""

Model.fit(self, t_obs, n_i_obs, population, fit_index=None):
        """ Use the Levenberg-Marquardt algorithm to fit
        the parameter alpha, as beta is assumed constant
        inputs:
        t_obs: Vector of days corresponding to the observations of number of infected people
        n_i_obs: Vector of number of infected people
        population: Size of the objective population
        Return
        """

Note that solve, set_params and fit return the model object, so calls can be chained, e.g. `Model.fit().solve().export()`. All the other functions return values.
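
As a rough illustration of the chaining described above, here is a minimal sketch. `ToyModel` is a hypothetical stand-in, not the project's `Model` class: its `solve()` only produces a placeholder time axis and a constant state row, and its `export()` accepts an open file-like object only.

```python
import csv
import io


class ToyModel:
    """Hypothetical stand-in for the Model class; not the project's code."""

    def __init__(self):
        self._params = None
        self._rows = None  # filled in by solve()

    def set_params(self, p, initial_conds):
        self._params = (list(p), list(initial_conds))
        return self  # returning self is what makes chaining possible

    def solve(self, tf_days=3, numpoints=4):
        if self._params is None:
            raise RuntimeError("call set_params() before solve()")
        step = tf_days / (numpoints - 1)
        # Placeholder "solution": a time axis plus one constant state row.
        time = [i * step for i in range(numpoints)]
        state = [self._params[1][0]] * numpoints
        self._rows = [time, state]
        return self

    def fetch(self):
        if self._rows is None:
            raise RuntimeError("call solve() before fetch()")
        return self._rows  # the first row is the time in days

    def export(self, f, delimiter=","):
        # Writes one CSV line per time point; raises if solve() was not called.
        rows = self.fetch()
        csv.writer(f, delimiter=delimiter).writerows(zip(*rows))


buffer = io.StringIO()
ToyModel().set_params([0.2, 0.1], [0.99, 0.01]).solve().export(buffer)
```

Only the methods that return `self` can appear in the middle of a chain; `export()` and `fetch()` have to come last.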

As mentioned in #17, there are some new post-regression functions that could be included in the model class (analogous to export or fetch).

Also, as proposed by @felipehuerta17, we could add a plot function.

So, concrete questions:

  1. Is the current API ok?
  2. How do we integrate the post-regression functions here? Same API?
  3. What would this plot function look like? What kind of arguments would it take?
@jia200x jia200x added this to the Release 1.0 milestone Apr 4, 2020
@felipehuerta17
Contributor

  1. I like the general structure of the API, but the functions need to be improved to allow more flexible use.

  2. I think the same API works, as we are mimicking well-known libraries such as scikit-learn. Whenever we make confidence intervals a member function, we should include a boolean attribute like "is_fitted", as the confidence intervals can only be calculated after the model has been fitted.

In this sense, storing the training data on the model object would be very useful: calls to confidence intervals and other statistical methods would then need only the parameters of the statistical test, rather than the [t, I] training data passed to model.fit.
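
A minimal sketch of that idea, assuming a hypothetical class: only `fit` and `is_fitted` come from this thread, while `FittedStateSketch`, the stored attribute names and `n_observations` are made up for illustration.

```python
class FittedStateSketch:
    """Hypothetical illustration of an is_fitted flag plus stored data."""

    def __init__(self):
        self.is_fitted = False
        self.t_obs = None
        self.n_i_obs = None
        self.population = None

    def fit(self, t_obs, n_i_obs, population):
        # A real fit would estimate parameters here; this only keeps the data.
        self.t_obs = list(t_obs)
        self.n_i_obs = list(n_i_obs)
        self.population = population
        self.is_fitted = True
        return self

    def n_observations(self):
        """Stand-in for a post-fit statistic that needs the training data."""
        if not self.is_fitted:
            raise RuntimeError("fit() must be called first")
        return len(self.t_obs)


model = FittedStateSketch().fit([0, 1, 2], [5, 9, 14], population=1000)
count = model.n_observations()
```

Because the observations live on the object, the post-fit method takes no data arguments of its own.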

2.1 It's also possible to fit with the recovered population (R). I will think about a more abstract yet intuitive way for the user to treat R as a separate fit variable. I will open another issue to discuss fitting over two different data sets, as that will require a more complicated cost function but will be novel.

2.2 Post-regression routines are likely to be one of the most complicated parts of the code, and I think it would be good to maintain them in a different file, but they can set attributes of the model. For example, after a future call to model.ci_block_cv(), it would be great to store the confidence intervals in the model object.

  3. Let's think about the typical plots:
  • plot(type = "fit") is essential so the user can visualize the fit after performing it.
  • plot(type = "predict", t = "n_days", ci_type) may be useful for predicting 5 days ahead and plotting confidence intervals in a nice, automatic way.

Ideally the user would fetch the model results and plot them, but for exploration and debugging I think these two functions would be incredibly useful.
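
One way the two plot types could be dispatched is sketched below. This is an assumption-laden illustration, not an agreed design: it is written as a free function rather than a method so it stays self-contained, every argument other than `type` is hypothetical, and `ax` can be any object exposing matplotlib-style `plot` and `fill_between` methods.

```python
def plot(ax, t_obs, n_i_obs, t_model, n_i_model, type="fit", ci=None):
    """Draw onto `ax`, an object with matplotlib-style plot/fill_between.

    `type` shadows the builtin; it is kept only to match the proposal.
    ci: optional (lower, upper) sequences aligned with t_model.
    """
    if type not in ("fit", "predict"):
        raise ValueError(f"unknown plot type: {type!r}")
    if type == "fit":
        # Show the observations the model was fitted against.
        ax.plot(t_obs, n_i_obs, "o", label="observed")
    ax.plot(t_model, n_i_model, "-", label="model")
    if type == "predict" and ci is not None:
        lower, upper = ci
        # Shade the confidence band around the prediction.
        ax.fill_between(t_model, lower, upper, alpha=0.3)
    return ax
```

With matplotlib installed, `ax` would typically come from `fig, ax = plt.subplots()`.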

@felipehuerta17 felipehuerta17 self-assigned this Apr 7, 2020
@jia200x
Contributor Author

jia200x commented Apr 7, 2020

> I think the same API works, as we are mimicking well-known libraries such as scikit-learn. Whenever we make confidence intervals a member function, we should include a boolean attribute like "is_fitted", as the confidence intervals can only be calculated after the model has been fitted.

The export and fetch functions already do that; we can mimic this behavior.

> In this sense, storing the training data on the model object would be very useful: calls to confidence intervals and other statistical methods would then need only the parameters of the statistical test, rather than the [t, I] training data passed to model.fit.

I'm in for this!

> 2.2 Post-regression routines are likely to be one of the most complicated parts of the code, and I think it would be good to maintain them in a different file, but they can set attributes of the model. For example, after a future call to model.ci_block_cv(), it would be great to store the confidence intervals in the model object.

I agree they should be in a separate file. We can expose them via "mixins".
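
A minimal sketch of the mixin approach, under stated assumptions: `PostRegressionMixin`, `ChildModel`, the `level` argument and the `ci_` attribute are hypothetical names, the `Model` here is a stub, and the stored bounds are placeholders rather than a real block cross-validation estimate.

```python
class PostRegressionMixin:
    """Would live in its own file, e.g. a hypothetical post_regression.py."""

    def ci_block_cv(self, level=0.95):
        if not getattr(self, "is_fitted", False):
            raise RuntimeError("fit() must be called first")
        # Placeholder bounds: NOT a real block cross-validation estimate.
        self.ci_ = {"level": level, "lower": None, "upper": None}
        return self


class Model:
    """Stub base class; the actual fitting is omitted."""

    def __init__(self):
        self.is_fitted = False

    def fit(self, t_obs, n_i_obs, population):
        self.is_fitted = True
        return self


class ChildModel(PostRegressionMixin, Model):
    """Hypothetical child model that gains the post-regression methods."""


fitted = ChildModel().fit([0, 1], [5, 9], population=1000).ci_block_cv()
```

The routine is defined outside the model class but still stores its result on the model object, which is the behavior asked for in 2.2.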

> Let's think about the typical plots:
> • plot(type = "fit") is essential so the user can visualize the fit after performing it.
> • plot(type = "predict", t = "n_days", ci_type) may be useful for predicting 5 days ahead and plotting confidence intervals in a nice, automatic way.

> Ideally the user would fetch the model results and plot them, but for exploration and debugging I think these two functions would be incredibly useful.

This also sounds reasonable to me.
