
Wragged output methods #1405

Open · wants to merge 18 commits into main

Conversation

ben18785 (Collaborator):

Implements the SubProblem and ProblemCollection classes, which together allow (a usage sketch follows this list):

  • problems to be constructed where different outputs are measured at different times
  • separate subproblems to be created for different subsets of outputs, so that inference can use a different score function or log-likelihood for each subset
  • caching of simulation outputs between evaluations for the same parameter set
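
For illustration, here is a rough usage sketch of the proposed API. The class and method names are taken from the discussion in this PR; the ProblemCollection constructor signature is an assumption and may not match the implementation:

# Illustrative sketch only; the ProblemCollection constructor shown
# here is assumed, not taken from the PR's code.
import pints
import pints.toy

# Three-output toy model with synthetic data
model = pints.toy.GoodwinOscillatorModel()
real_parameters = model.suggested_parameters()
times = model.suggested_times()
values = model.simulate(real_parameters, times)

# Outputs 0 and 1 are measured on the first half of the time grid,
# output 2 only on the second half (a "wragged" measurement pattern)
half = len(times) // 2
times_01, values_01 = times[:half], values[:half, :2]
times_2, values_2 = times[half:], values[half:, 2]

# Assumed constructor: the model plus alternating (times, values) pairs
collection = pints.ProblemCollection(
    model, times_01, values_01, times_2, values_2)

# Sub-problems for the two groups of outputs; each can be wrapped in
# its own error measure or log-likelihood for inference
problem_0 = collection.subproblem(0)
problem_1 = collection.subproblem(1)

# Evaluations at the same parameter vector reuse one cached simulation
sol_01 = problem_0.evaluate(real_parameters)
sol_2 = problem_1.evaluate(real_parameters)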

codecov bot commented Oct 20, 2021:

Codecov Report

Merging #1405 (d68edbb) into master (d64eb05) will not change coverage.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##            master     #1405    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           87        87            
  Lines         8889      9018   +129     
==========================================
+ Hits          8889      9018   +129     
Impacted Files                        Coverage   Δ
pints/__init__.py                     100.00%    <100.00%> (ø)
pints/_core.py                        100.00%    <100.00%> (ø)
pints/_optimisers/_pso.py             100.00%    <0.00%> (ø)
pints/_optimisers/_snes.py            100.00%    <0.00%> (ø)
pints/_optimisers/_xnes.py            100.00%    <0.00%> (ø)
pints/_optimisers/_cmaes.py           100.00%    <0.00%> (ø)
pints/_optimisers/__init__.py         100.00%    <0.00%> (ø)
pints/_optimisers/_cmaes_bare.py      100.00%    <0.00%> (ø)
pints/_optimisers/_nelder_mead.py     100.00%    <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d64eb05...d68edbb.

ben18785 (Collaborator, Author):

@MichaelClerx would you be able to take a look here if you get a moment?

MichaelClerx (Member):

I'll try to look before too long. Are there no PKPD modellers who could have a look in the meantime?

ben18785 (Collaborator, Author):

Thanks! @martinjrobins -- if you had a spare few mins, it would be good to see what you thought about this.

MichaelClerx (Member) left a comment:

Hi Ben,

It looks quite good, but I'm a bit confused that the culmination of all of this is a custom-made likelihood class. I've thought about it a bit, and can't see any easy way out other than adding Composed/Combined/Composite Likelihood, Prior, and Error classes in the same PR?

We've got a ComposedLogPrior already: https://github.com/pints-team/pints/blob/master/pints/_log_priors.py#L170
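
For context, the existing ComposedLogPrior referenced above is used roughly like this (illustrative parameter values only):

import pints

# Combine independent priors over a two-parameter vector
log_prior = pints.ComposedLogPrior(
    pints.GaussianLogPrior(0, 1),   # prior on the first parameter
    pints.UniformLogPrior(0, 10),   # prior on the second parameter
)
print(log_prior([0.5, 2.0]))        # sum of the component log-densities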

-Pints provides the :class:`SingleOutputProblem` and
-:class:`MultiOutputProblem` classes to formulate
+Pints provides the :class:`SingleOutputProblem`,
+:class:`MultiOutputProblem`, :class:`ProblemCollection`
MichaelClerx (Member):

Not a blocker, but can we think of some names that make it clear that:

  1. A ProblemCollection can't just contain any old problem. We can't add SingleOutputProblems, for example.
  2. The ProblemCollection and SubProblem go together.

I'm not sure what these names would be :D
MultipartProblem and ProblemPart?

MichaelClerx (Member):

Maybe get the word "output" into the name for ProblemCollection, to show that all 3 problem types are actually classified by their output. So SingleOutput, MultiOutput, and... SplitOutputProblem? ComposedOutputProblem?

MichaelClerx (Member):

Since the user never creates a SubProblem (and since it is a very different thing from the other 3 classes called XProblem), is this a rare case for having a class inside a class?
E.g.

class ProblemCollection(object):
    ...

    class SubProblem(object):
        ...

    def subproblem(self, i):
        return ProblemCollection.SubProblem(...)

@@ -99,6 +99,7 @@ relevant code.
 - [Multiplicative Gaussian noise](./stats/multiplicative-gaussian-errors.ipynb)
 - [Pooling parameters](./stats/pooling.ipynb)
 - [Student-t noise model](./stats/student-t-sampling-error.ipynb)
+- [Wragged output model](./stats/multiple_wragged_outputs.ipynb)
MichaelClerx (Member):

I've never come across the word "wragged" before. Is it a technical term?

"source": [
"# Multi-output model with wragged time measurements\n",
"\n",
"This notebook describes how to fit a model in PINTS which has multiple outputs with different measurement times for each. Additionally, we illustrate how this same framework allows different measurement models to be used for different outputs within the same forward model when doing inference.\n",
MichaelClerx (Member):

This is too quick for me.
Can we change it to say "This notebook describes... for each. This situation arises when...", adding 1-3 lines on why this would occur, e.g. when you measure different system outputs individually?
Then leave the "additionally" for later: just add a header and give a second intro?

"cell_type": "markdown",
"metadata": {},
"source": [
"Now we assume that only the first half of time measurements are taken for the first two outputs; and only the second half for the third."
MichaelClerx (Member):

Can we make this a bit less artificial? I.e. start with a wordy version, saying e.g. we measure X every 6 hours but Y only once a day, and then show how that leads to the kind of code given below?
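
As an editor's illustration of the concrete set-up suggested above (the values below are made up, not taken from the notebook):

import numpy as np

# Output X is measured every 6 hours, output Y only once a day,
# over a 72-hour experiment
times_x = np.arange(0, 72, 6)
times_y = np.arange(0, 72, 24)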

"# solve system\n",
"sol_12 = problem_0.evaluate(real_parameters)\n",
"\n",
"## this uses cached result from first solve under the hood\n",
MichaelClerx (Member):

Suggested change:
-"## this uses cached result from first solve under the hood\n",
+"# this uses cached result from first solve under the hood\n",

},
"outputs": [],
"source": [
"class CombinedLogLikelihood(pints.LogPDF):\n",
MichaelClerx (Member):

This is exactly the kind of thing I'd want PINTS to handle for me, after I've gone to the trouble of setting up the special problem class!
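
For reference, the kind of reusable helper being asked for could look roughly like the minimal sketch below. It assumes all sub-log-likelihoods share one full parameter vector; the CombinedLogLikelihood class in the notebook may be structured differently:

import pints

class SummedLogLikelihood(pints.LogPDF):
    """Hypothetical sketch: sums sub-log-likelihoods that share a
    single parameter vector (not the class defined in this PR)."""

    def __init__(self, *log_likelihoods):
        if not log_likelihoods:
            raise ValueError('At least one log-likelihood is required.')
        self._log_likelihoods = log_likelihoods
        self._n_parameters = log_likelihoods[0].n_parameters()

    def n_parameters(self):
        return self._n_parameters

    def __call__(self, x):
        # Independent data sets: the joint log-likelihood is the sum
        return sum(f(x) for f in self._log_likelihoods)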
