Restrict parameter list in kubeflow experiment run to one pipeline only #92

Open
adrian-dembek opened this issue Jan 14, 2022 · 2 comments

Comments

@adrian-dembek

When I run an experiment with a pipeline, the Kubeflow/Runs/[Graph] section shows only the selected pipeline name (in this case "scoring"). However, under [Config], the Run parameters section lists parameters from all pipelines, which is misleading.

@DmitriyLamzin
Contributor

Agreed. The kubeflow plugin attaches every parameter available in the config dir to a single pipeline, even if that particular pipeline doesn't use them.

This has practical consequences. For example, I hit the parameter size limit because of it:
The pipeline spec is invalid.: Invalid input error: The input parameter length exceed maximum size of 10000.
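
To see how close a project is to that limit, the size of the parameters can be estimated before compiling. This is only a rough sketch: the figure 10000 comes from the error message above, but measuring it as the length of the JSON-serialized parameters is an assumption, and `serialized_param_length` is a hypothetical helper, not part of the plugin.

```python
import json
from typing import Any, Dict

# Taken from the error message; how the backend actually measures it is an assumption.
ASSUMED_PARAM_LIMIT = 10000


def serialized_param_length(params: Dict[str, Any]) -> int:
    """Approximate the parameter payload size as the length of its JSON form."""
    return len(json.dumps(params))


# Illustrative parameters for two pipelines, grouped by pipeline name.
all_params = {
    "scoring": {"threshold": 0.5},
    "training": {"epochs": 10, "features": ["f%d" % i for i in range(2000)]},
}

total = serialized_param_length(all_params)
scoring_only = serialized_param_length(all_params["scoring"])

print("all pipelines:", total, "over limit:", total > ASSUMED_PARAM_LIMIT)
print("scoring only:", scoring_only, "over limit:", scoring_only > ASSUMED_PARAM_LIMIT)
```

In this made-up example the combined parameters exceed the assumed limit, while the parameters of "scoring" alone stay far below it.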

@edhenry

edhenry commented Oct 23, 2023

I ran into this issue and modified my pipelines and the plugin to accommodate. Below is a summary of what I've done.

  1. Ported my pipelines to use modular_pipelines.
  2. Modified the corresponding parameter configurations.
  3. Added a conditional to the pipeline generation process here to account for a pipeline name passed on the command line: kedro kubeflow compile --pipeline <pipeline_name> -o <pipeline_name>.yml.

I made the following modifications to the per_node_pipeline_generator.py module:

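    def generate_pipeline(self, pipeline, image, image_pull_policy):
        if pipeline == "__default__":
            # No specific pipeline requested: use every parameter, as before.
            merged_params = merge_namespaced_params_to_dict(self.context.params)
        else:
            # Use only the parameters stored under the key matching the pipeline name.
            merged_params = merge_namespaced_params_to_dict(self.context.params[pipeline])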

This will check for and use parameters specific to the pipeline passed through the command line argument and generate a pipeline manifest accordingly.
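
The selection logic can be tried in isolation on plain dictionaries. The sketch below is not the plugin's code: `select_params` is a hypothetical helper, the parameter names are made up, and it assumes the parameters are grouped under a top-level key equal to the pipeline name. Unlike the snippet above, which would raise a KeyError for an unknown name, this version falls back to all parameters, which is my own addition.

```python
from typing import Any, Dict

DEFAULT_PIPELINE = "__default__"


def select_params(all_params: Dict[str, Any], pipeline_name: str) -> Dict[str, Any]:
    """Return only the parameters grouped under ``pipeline_name``.

    Falls back to the full parameter set for the default pipeline or when
    no matching group exists (the fallback is an assumption of this sketch).
    """
    if pipeline_name == DEFAULT_PIPELINE:
        return dict(all_params)
    scoped = all_params.get(pipeline_name)
    if not isinstance(scoped, dict):
        # No parameter group named after this pipeline: keep everything.
        return dict(all_params)
    return dict(scoped)


# Illustrative parameters, grouped by pipeline name.
params = {
    "scoring": {"threshold": 0.5},
    "training": {"epochs": 10, "lr": 0.01},
}

scoring_params = select_params(params, "scoring")
print(scoring_params)
```

Whether a silent fallback or a hard failure is better for an unknown pipeline name depends on the project; failing loudly makes a mistyped --pipeline value easier to spot.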

If no pipeline name is specified, all parameters are still dumped naively and you're left with the same Invalid input error: The input parameter length exceed maximum size of 10000. With a pipeline name, however, I can sidestep that problem (for now) and generate pipeline-specific manifests that I can deploy separately.

Full disclosure: I have about a dozen pipelines, modular and non-modular, in the project I'm working with, so I will try to address the "dump all parameters" approach in the coming days/weeks and add my findings back to this thread. 🙂
