MultiConfigTask #391

Open
mafrahm opened this issue Feb 2, 2024 · 0 comments
Labels
enhancement New feature or request

mafrahm commented Feb 2, 2024

Many tasks may need to access columns or histograms from multiple campaigns simultaneously:

  • Machine Learning
  • creating datacards
  • rebinning
  • plotting (e.g. combining 2022preEE and 2022postEE)

The MLModelTrainingMixin [1] already allows us to train DNNs on multiple campaigns. However, this mixin cannot be reused for other types of MultiConfigTasks.
Instead, we should build one central MultiConfigTask base task.

The MLModelTrainingMixin currently resolves Calibrators, Selectors, and Producers (CSPs) for each year individually. Therefore, we cannot rely on the existing CSP mixins and would have to reimplement the resolution of each of them.
This could be simplified by dropping the per-year resolution, since users can simply write their CSPs to be year-independent. Such a Producer might look like this:

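from columnflow.production import Producer, producer
from columnflow.util import maybe_import

ak = maybe_import("awkward")

# producer_2016, producer_2017 and producer_2018 are assumed to be defined elsewhere


@producer
def producer_run2(self: Producer, events: ak.Array, **kwargs) -> ak.Array:
    # dispatch to the producer that matches the year of the current campaign
    year = self.config_inst.campaign.x.year

    if year == 2016:
        events = self[producer_2016](events, **kwargs)
    elif year == 2017:
        events = self[producer_2017](events, **kwargs)
    elif year == 2018:
        events = self[producer_2018](events, **kwargs)

    return events


@producer_run2.init
def producer_run2_init(self: Producer) -> None:
    # register only the year-specific producer as a dependency and output
    year = self.config_inst.campaign.x.year

    if year == 2016:
        self.uses.add(producer_2016)
        self.produces.add(producer_2016)
    elif year == 2017:
        self.uses.add(producer_2017)
        self.produces.add(producer_2017)
    elif year == 2018:
        self.uses.add(producer_2018)
        self.produces.add(producer_2018)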
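
The if/elif chain has to be kept in sync between the call function and the init hook. One possible way to avoid that is a single year-to-producer lookup table. The following is only a plain-Python sketch of that idea and does not use columnflow: `PRODUCERS_BY_YEAR`, `resolve_producer`, the `Events` alias, and the stand-in producer functions are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

# stand-in for ak.Array: a list of per-event dicts (assumption for this sketch)
Events = List[dict]


def producer_2016(events: Events) -> Events:
    # hypothetical year-specific producer, tags each event with its year
    return [dict(event, tag="2016") for event in events]


def producer_2017(events: Events) -> Events:
    return [dict(event, tag="2017") for event in events]


def producer_2018(events: Events) -> Events:
    return [dict(event, tag="2018") for event in events]


# single place that maps a campaign year to its producer
PRODUCERS_BY_YEAR: Dict[int, Callable[[Events], Events]] = {
    2016: producer_2016,
    2017: producer_2017,
    2018: producer_2018,
}


def resolve_producer(year: int) -> Callable[[Events], Events]:
    # fail loudly for years without a registered producer
    try:
        return PRODUCERS_BY_YEAR[year]
    except KeyError:
        raise ValueError(f"no producer registered for year {year}") from None


def producer_run2(events: Events, year: int) -> Events:
    # year-independent entry point: look up the producer, then apply it
    return resolve_producer(year)(events)
```

In a real Producer, both the call function and the init hook could use the same lookup, so a new year only needs one new table entry.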

The main changes required in our current configs and tasks are that our mixins for CSPs and ML need to inherit from the AnalysisTask instead of the ConfigTask, and that the default CSPs need to be defined in the analysis_inst instead of the config_inst.
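
To illustrate why the defaults would move, here is a toy sketch in plain Python, not the columnflow or order API. The `Analysis` and `Config` classes, their `defaults` dicts, and both helper functions are hypothetical. It only shows that a task spanning several configs gets one unambiguous answer from an analysis-level default, whereas per-config defaults may disagree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Config:
    # hypothetical stand-in for a per-campaign config
    name: str
    defaults: Dict[str, str] = field(default_factory=dict)


@dataclass
class Analysis:
    # hypothetical stand-in for the analysis object that owns all configs
    defaults: Dict[str, str] = field(default_factory=dict)
    configs: List[Config] = field(default_factory=list)


def resolve_default(analysis: Analysis, key: str) -> str:
    # one answer for the whole task, independent of how many configs it spans
    return analysis.defaults[key]


def per_config_defaults(analysis: Analysis, key: str) -> Dict[str, Optional[str]]:
    # per-config lookup; the values can differ between configs
    return {config.name: config.defaults.get(key) for config in analysis.configs}
```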

@riga

[1]

class MLModelTrainingMixin(MLModelMixinBase):

@mafrahm mafrahm added the enhancement New feature or request label Feb 2, 2024