add 'modify' module type #392
Comments
It's late, so I'll just garble some words and hope they make vague sense...
3a) Stacked SDMs.
Type thing. I believe this keeps the whole analysis reproducible in a way that anything that takes workflow objects as arguments wouldn't.
3b) Ensemble.
model = ensemble(models = list(MaxEnt,
                               LogisticRegression,
                               MachineLearn("something"),
                               MachineLearn("something_else")),
                 weighting = "AUC")
(where
One downside of this is that it would be hard to investigate each of the component models by passing them to output. I don't think zoon would record the module versions either. Though neither of these are critical.
P.S. no one I know is modelling the new malaria. Could re-use Freya and Catherine's knowlesi reservoir maps though?
There's interest in being able to do ensemble SDMs and stacked SDMs in zoon. We've also run into some awkwardness with thresholding and MESS masks, which need to be applied to rasters either before or after modelling.
In the past we've briefly thought about changing the core setup to enable things like ensemble models, but haven't settled on a way of integrating it into zoon's interoperable module types.
We've just had a little brainstorming session here, and come up with something that might work well within what zoon already does. I'd be keen to hear your thoughts.
We could add an additional module type, 'modify' (name up for discussion), between the 'model' and 'output' steps. 'modify' would take as input a list of ZoonModel objects (returned by one or more 'model' modules) and return a list of ZoonModel objects, of the same or different length. The ZoonModel objects would then be pulled out of the list and passed to the 'output' modules.
In the default case (i.e. a 'noModify' 'modify' module could be used by default, for backwards compatibility), the input and output lists would be the same, so the workflow would run as it currently does. For example, a workflow with three models would give three outputs, one per model; 'modify' has 'noModify' as a default argument, so it need not be specified.
If the user provided a 'modify' module like 'threshold', that module would return a list of ZoonModel objects, with prediction methods modified to predict 1 above the threshold or 0 below. This could be handled by nesting one ZoonModel inside another, or by adding a new decorator function. These could be chained to do multiple things. For example, chaining a thresholding step with a clamping step would give three outputs, one per model, with predictions set to 0 or 1 and clamped to the extreme values of the observed data.
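To make the list-in, list-out idea concrete, here is a rough sketch in plain R. It does not use zoon itself: `toy_model`, `no_modify` and `threshold_modify` are hypothetical names, and the toy model is just a list holding a prediction function, standing in for a real ZoonModel object. The 'threshold' step is done by nesting the original prediction function inside a new one, which is one of the two options mentioned above.

```r
# Toy stand-in for a ZoonModel: a list holding a prediction function.
# (Hypothetical structure for illustration, not zoon's actual ZoonModel class.)
toy_model <- function(name, predict_fun) {
  list(name = name, predict = predict_fun)
}

# Default 'noModify': return the list of models unchanged.
no_modify <- function(models) models

# 'threshold': wrap each model so it predicts 1 above the cutoff, 0 otherwise.
threshold_modify <- function(models, cutoff = 0.5) {
  lapply(models, function(m) {
    inner_predict <- m$predict  # keep hold of the original prediction function
    toy_model(
      name = paste0(m$name, "_thresholded"),
      predict_fun = function(newdata) as.integer(inner_predict(newdata) > cutoff)
    )
  })
}

# Two made-up models that return probabilities from a single covariate x.
models <- list(
  toy_model("a", function(newdata) plogis(newdata$x)),
  toy_model("b", function(newdata) plogis(2 * newdata$x - 1))
)
newdata <- data.frame(x = c(-2, 0, 2))

unchanged <- no_modify(models)                      # same list back
binary <- threshold_modify(no_modify(models), 0.5)  # modifiers can be chained
binary_preds <- lapply(binary, function(m) m$predict(newdata))
```

Because every modifier takes a list and returns a list, chaining is just function composition, and the 'output' step never needs to know which modifiers were applied.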
If the user provided a 'modify' module like 'ensemble', that module would return a list of only one ZoonModel object, making predictions from the ensemble. For example, this would give one output, for an ensemble model making averaged predictions.
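A rough sketch of how an 'ensemble' modifier might collapse several models into one, again in plain R rather than zoon. `toy_model` and `ensemble_modify` are hypothetical names, the toy model is a stand-in for a ZoonModel, and the unweighted mean is only an assumption about what "averaged predictions" could mean here.

```r
# Toy stand-in for a ZoonModel (hypothetical, not zoon's actual class).
toy_model <- function(name, predict_fun) {
  list(name = name, predict = predict_fun)
}

# 'ensemble': take a list of models, return a list holding ONE model
# whose predictions are the unweighted mean of the component predictions.
ensemble_modify <- function(models) {
  ensemble_predict <- function(newdata) {
    # One column of predictions per component model
    preds <- do.call(cbind, lapply(models, function(m) m$predict(newdata)))
    rowMeans(preds)
  }
  list(toy_model("ensemble", ensemble_predict))
}

# Two made-up component models returning constant probabilities.
models <- list(
  toy_model("a", function(newdata) rep(0.2, nrow(newdata))),
  toy_model("b", function(newdata) rep(0.6, nrow(newdata)))
)
newdata <- data.frame(x = 1:3)

ensembled <- ensemble_modify(models)
ensemble_preds <- ensembled[[1]]$predict(newdata)
```

The component models stay reachable inside the closure, so a fuller version could expose them for inspection if that turned out to be useful.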
Similarly a 'stack' 'modify' module would return a list with a single ZoonModel object to predict the number of species (like an abundance model). Users could list 'modify' modules if they wanted, to return both the original models and the ensemble models.
This would take a little work, but not too much. It would also make zoon a more attractive prospect to the ensemblers and richness modellers.
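Listing several modifiers could work roughly like the sketch below: each modifier is applied to the same input list and the results are concatenated. This is plain R, not zoon; `toy_model`, `stack_modify` and `apply_modifiers` are hypothetical names, and summing per-species occurrence probabilities to get expected richness is just one assumed way a 'stack' module might predict the number of species.

```r
# Toy stand-in for a ZoonModel (hypothetical, not zoon's actual class).
toy_model <- function(name, predict_fun) {
  list(name = name, predict = predict_fun)
}

# Default modifier: pass the models through unchanged.
no_modify <- function(models) models

# 'stack': one model whose prediction is the sum of the per-species
# predicted probabilities (an assumed definition of expected richness).
stack_modify <- function(models) {
  stack_predict <- function(newdata) {
    Reduce(`+`, lapply(models, function(m) m$predict(newdata)))
  }
  list(toy_model("stack", stack_predict))
}

# Apply each listed modifier to the same input and concatenate the results.
apply_modifiers <- function(models, modifiers) {
  do.call(c, lapply(modifiers, function(f) f(models)))
}

# Two made-up single-species models.
species_models <- list(
  toy_model("species_1", function(newdata) rep(0.5, nrow(newdata))),
  toy_model("species_2", function(newdata) rep(0.25, nrow(newdata)))
)
newdata <- data.frame(x = 1:2)

all_models <- apply_modifiers(species_models, list(no_modify, stack_modify))
richness <- all_models[[3]]$predict(newdata)
```

Here `all_models` holds the two original models followed by the stacked one, so the 'output' step would produce three outputs.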
What do you reckon?