[FR] Get or create run #11783

BalkanFlink · 2024-04-22T16:29:15Z

Willingness to contribute

Yes. I can contribute this feature independently.

Proposal Summary

It would be convenient for our data scientists if there were a single function that searches an experiment for a run by name and returns the run object if it exists, or creates the run if it does not. If more than one run with that name exists in the experiment, it would raise an error.

Motivation

What is the use case for this feature?

Users would like to pick up where they left off with a run. This would be easier and quicker to do with the run name than with the run ID.

Why is this use case valuable to support for MLflow users in general?

It would save them from having to look up the run ID in the MLflow UI before they can obtain the run via the Python API and resume their work.

Why is this use case valuable to support for your project(s) or organization?

It would make the lives of our data scientists easier by removing a step from their workflow.

Why is it currently difficult to achieve this use case?

Resuming a specific run currently requires knowing its run ID (found via the UI), whereas it would be a smoother experience to just search for or create the run by name.

Details

I'm happy to contribute this feature. I would add a method to the MLflow client that first searches for an existing run with the given run name (using mlflow.search_runs(filter_string="run_name='myexistingrun'")) and creates a new run with that name if none exists. If more than one run has this name, it would raise an error.
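The behaviour described above could be sketched roughly as follows. This is only an illustration, not the API proposed in the linked PR: the function name `get_or_create_run` and the exception `MultipleRunsFoundError` are hypothetical, and `client` is assumed to behave like `mlflow.tracking.MlflowClient` (only its `search_runs` and `create_run` methods are used).

```python
from typing import Any


class MultipleRunsFoundError(Exception):
    """Hypothetical error: more than one run shares the requested name."""


def get_or_create_run(client: Any, experiment_id: str, run_name: str) -> Any:
    """Return the single run named `run_name`, creating it if none exists.

    `client` is assumed to expose `search_runs` and `create_run` like
    mlflow.tracking.MlflowClient. Run names containing a single quote are
    not handled by this sketch.
    """
    matches = client.search_runs(
        experiment_ids=[experiment_id],
        filter_string=f"attributes.run_name = '{run_name}'",
        max_results=2,  # two hits are enough to detect a duplicate name
    )
    if len(matches) > 1:
        raise MultipleRunsFoundError(
            f"More than one run named {run_name!r} in experiment {experiment_id}"
        )
    if len(matches) == 1:
        # Exactly one match: hand back the existing run.
        return matches[0]
    # No match: create a fresh run with the requested name.
    return client.create_run(experiment_id, run_name=run_name)
```

With a real tracking server, one would presumably call `get_or_create_run(MlflowClient(), "0", "myexistingrun")` and then resume logging with `mlflow.start_run(run_id=run.info.run_id)`.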

What component(s) does this bug affect?

What interface(s) does this bug affect?

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

What language(s) does this bug affect?

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

What integration(s) does this bug affect?

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

daniellok-db · 2024-04-22T23:13:46Z

Hi @BalkanFlink, I think the request makes sense. With regard to searching runs by name, there is already some support for this via the mlflow.search_runs() API. The syntax is not as convenient as just having run_name, but you can use it like this:

import mlflow

mlflow.search_runs(
  experiment_ids=[0],
  # if you know the exact run name
  filter_string="attributes.run_name='shivering-fox-792'"
)

mlflow.search_runs(
  experiment_ids=[0],
  # if you only know part of the run name
  filter_string="attributes.run_name LIKE '%fox%'"
)

Depending on the result of the search, you can either create a run or retrieve the run ID from the search result. Let me know if this solves your use case!

BalkanFlink · 2024-04-23T12:49:48Z

Hi @daniellok-db, thanks for the info. I know it is already possible, but in my opinion it would be convenient as a small standalone function. I will be developing this function anyway as part of a ticket; the question now is whether I can contribute it to MLflow directly instead of building our own internal wrapper function. Should I fork the repo and raise a PR related to this issue (#11783)?

daniellok-db · 2024-04-24T00:19:38Z

I see! Yes, feel free to file a PR and the MLflow team will review it 😄

github-actions · 2024-04-30T00:12:30Z

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

BalkanFlink · 2024-05-07T15:46:19Z

Hey @daniellok-db, would you be able to take a look at our PR for this, please?

BalkanFlink added the enhancement (New feature or request) label Apr 22, 2024

github-actions bot added the area/tracking (Tracking service, tracking client APIs, autologging) label Apr 22, 2024

m-blasiak linked a pull request May 3, 2024 that will close this issue: Add start run by name #11896 (Open, 39 tasks)

github-actions bot added the has-closing-pr (This issue has a closing PR) label May 3, 2024
