unifying pipestat config with pipestat constructor #206

Open
nsheff opened this issue Feb 15, 2024 · 2 comments

nsheff commented Feb 15, 2024

Right now, the docs suggest configuring pipestat via pypiper like this:

pm = pypiper.PipelineManager(
  ...,
  pipestat_schema="custom_results_schema.yaml",
  pipestat_results_file="custom_results_file.yaml",
  pipestat_sample_name="my_record",
  pipestat_project_name="my_namespace",
  pipestat_config="custom_pipestat_config.yaml",
) 

Meanwhile, pipestat itself is configured like this:

psm = pipestat.PipestatManager(
    record_identifier="sample1",
    results_file_path=temp_file,
    schema_path="../tests/data/sample_output_schema.yaml",
)

I would like these to be uniform, so I want to define the options once:

pipestat_config = {
    "record_identifier": sample["sample_name"],
    "schema_path": "pipeline/output_schema.yaml",
    "results_file_path": "results.yaml",
    "pipeline_type": "sample",
}

And use this for either, like:

psm = pipestat.PipestatManager(**pipestat_config)

or:

pm = pypiper.PipelineManager(
    ...,  # pypiper options
    pipestat_config=pipestat_config,
)

This way, PipelineManager takes a single pipestat argument: a dict of pipestat config options, which it passes through to pipestat as **kwargs. This seems cleaner than one separate pypiper argument per pipestat config option. It would also keep the options in sync. Right now they're out of sync (pypiper wants pipestat_sample_name, which it then passes to record_identifier). And pypiper would no longer need to maintain its own set of argument names that mirror pipestat's.
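To make the proposal concrete, here is a minimal sketch of the pass-through pattern. It does not use the real pypiper or pipestat classes: `StubPipestatManager` and `SketchPipelineManager` are hypothetical stand-ins, and the stub simply records whatever keyword arguments it receives, so this only illustrates the shape of the interface, not how either library is implemented.

```python
from typing import Any, Dict, Optional


class StubPipestatManager:
    """Hypothetical stand-in for pipestat.PipestatManager; it only records its kwargs."""

    def __init__(self, **kwargs: Any) -> None:
        self.options = dict(kwargs)


class SketchPipelineManager:
    """Hypothetical pypiper-like manager that takes one pipestat argument."""

    def __init__(
        self, name: str, pipestat_config: Optional[Dict[str, Any]] = None
    ) -> None:
        self.name = name
        # Forward the dict unchanged, so option names are never renamed here.
        self.pipestat = StubPipestatManager(**(pipestat_config or {}))


# Illustrative values only.
pipestat_config = {
    "record_identifier": "sample1",
    "schema_path": "pipeline/output_schema.yaml",
    "results_file_path": "results.yaml",
    "pipeline_type": "sample",
}

# The same dict configures either entry point.
psm = StubPipestatManager(**pipestat_config)
pm = SketchPipelineManager("my_pipeline", pipestat_config=pipestat_config)
```

Because the dict is forwarded as-is, any option pipestat accepts would be usable through the pipeline manager without a matching pypiper argument.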

nsheff commented Feb 15, 2024

Another issue is that I can't figure out how to map the pypiper options onto the pipestat configuration I want. I don't know what pipestat_project_name maps to, and I don't see how to set pipeline_type through pypiper.

nsheff commented Feb 15, 2024

Here's another example where this bit me.

I wanted to pass multi_pipelines=True to pipestat when constructing my pypiper.PipelineManager, but how to do this is not documented. The way to do it is to pass multi=True to pypiper, which translates it to multi_pipelines=True when calling pipestat. I had to read the code itself to figure this out.

This would be easier and not require additional documentation if instead we used pipestat_config and passed through kwargs.
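As a rough illustration of the translation layer that pass-through would remove, the renames mentioned in this thread amount to a lookup table like the one below. The table contains only the two mappings stated above; `LEGACY_TO_PIPESTAT` and `translate_legacy_kwargs` are hypothetical names for this sketch, not part of pypiper.

```python
from typing import Any, Dict

# Only the two renames described in this thread; other mappings are unknown here.
LEGACY_TO_PIPESTAT = {
    "pipestat_sample_name": "record_identifier",
    "multi": "multi_pipelines",
}


def translate_legacy_kwargs(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Rename pypiper-style keys to pipestat names; unknown keys pass through unchanged."""
    return {LEGACY_TO_PIPESTAT.get(key, key): value for key, value in legacy.items()}


# With pass-through, the user would write the right-hand names directly.
translated = translate_legacy_kwargs({"pipestat_sample_name": "my_record", "multi": True})
```

Every entry in such a table is a name that has to be documented and kept in sync by hand, which is the maintenance cost a single pipestat_config dict avoids.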
