
[FR] Why is there no 'log_model()' function in mlflow.client? #7392

Open
3 of 21 tasks
jaehyeongAN opened this issue Nov 24, 2022 · 14 comments · May be fixed by #11906
Assignees
Labels
area/artifacts Artifact stores and artifact logging area/model-registry Model registry, model registry APIs, and the fluent client calls for model registry area/models MLmodel format, model serialization/deserialization, flavors enhancement New feature or request has-closing-pr This issue has a closing PR

Comments

@jaehyeongAN commented Nov 24, 2022

Willingness to contribute

Yes. I would be willing to contribute this feature with guidance from the MLflow community.

Proposal Summary

I want to use MLflow with parallel runs, but MLflow's fluent API is not thread-safe: when using mlflow.start_run(), the run ID is kept as global state, so concurrent runs crash into each other.

I found the log_artifact() method on mlflow.client. However, this method does not log the model together with its metadata to the MLflow tracking server; it only saves the file to the artifact store.

I would like a method on mlflow.client, like mlflow.sklearn.log_model(), that saves the model to the registry and logs it to the MLflow tracking server.

Is it possible?
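The thread-safety problem can be illustrated without MLflow at all. The sketch below is a toy model (an assumption, not MLflow source code): the fluent style keeps one process-wide "active run" slot that every thread writes to, while the client style passes the run ID explicitly so nothing is shared.

```python
import threading

# Toy model of the issue (an assumption, not MLflow internals): the fluent
# API keeps one process-wide "active run", so concurrent runs step on it.
_active_run = None

def fluent_log(run_id, seen):
    global _active_run
    _active_run = run_id       # shared slot: last writer wins across threads
    seen.append(_active_run)   # may observe another thread's run id

def client_log(run_id, seen):
    seen.append(run_id)        # run id is an explicit argument: no shared state

def run_in_threads(log_fn, n_threads=4, n_logs=500):
    results = {i: [] for i in range(n_threads)}

    def worker(i):
        for _ in range(n_logs):
            log_fn(i, results[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # True only if every thread logged exclusively under its own run id
    return all(all(v == i for v in vals) for i, vals in results.items())

print(run_in_threads(client_log))
```

With client_log, every thread stays isolated because the run ID travels as an argument; with fluent_log, the shared slot can leak another thread's run ID under interleaving.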

Motivation

What is the use case for this feature?

When using mlflow.client, there is no way to log a model to the MLflow tracking server. The log_artifact() method on mlflow.client only saves the file; it does not log the model.

Why is this use case valuable to support for MLflow users in general?

This feature is necessary for users who want to use MLflow in parallel.

Why is this use case valuable to support for your project(s) or organization?

Why is it currently difficult to achieve this use case?

Right now I am using mlflow.onnx.log_model(), so parallel runs crash.

Details

No response

What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations
@jaehyeongAN jaehyeongAN added the enhancement New feature or request label Nov 24, 2022
@github-actions github-actions bot added area/artifacts Artifact stores and artifact logging area/model-registry Model registry, model registry APIs, and the fluent client calls for model registry area/models MLmodel format, model serialization/deserialization, flavors labels Nov 24, 2022
@WeichenXu123 (Collaborator) commented Nov 24, 2022

I think we can add a method MlflowClient.log_model, like:

class MlflowClient:
    def log_model(
        self,
        run_id,
        artifact_path,
        flavor,
        registered_model_name=None,
        await_registration_for=DEFAULT_AWAIT_MAX_SLEEP_SECONDS,
        **kwargs,
    ):
        ...

@dbczumar @harupy @BenWilson2 WDYT ?
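For illustration, here is a rough sketch of how the proposed method could be shaped and called. SketchClient and FakeFlavor are stand-ins invented for this example (this is not merged MLflow API); a flavor is assumed to be a module exposing save_model(), as mlflow.sklearn does.

```python
class FakeFlavor:
    """Stand-in for a flavor module such as mlflow.sklearn (hypothetical)."""
    saved = []

    @staticmethod
    def save_model(path, mlflow_model=None, **kwargs):
        FakeFlavor.saved.append(path)

class SketchClient:
    """Minimal stand-in showing the shape of the proposed MlflowClient.log_model."""

    def log_model(self, run_id, artifact_path, flavor,
                  registered_model_name=None, **kwargs):
        # A real implementation would save to a temp dir, upload the files
        # as run artifacts under `run_id`, record the model metadata, and
        # optionally register the model under `registered_model_name`.
        flavor.save_model(path=artifact_path, **kwargs)
        return "runs:/{}/{}".format(run_id, artifact_path)

client = SketchClient()
uri = client.log_model("abc123", "model", FakeFlavor)
print(uri)  # runs:/abc123/model
```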

@harupy (Member) commented Nov 25, 2022

Can we just add run_id (an optional argument that defaults to None) to mlflow.<flavor>.log_model?

@jaehyeongAN (Author) commented Nov 25, 2022

@WeichenXu123
I tested this by adding a log_model() function to mlflow.client, like this:

# Adapted from mlflow.models.Model.log()
@classmethod
def log(
    cls,
    run_id,
    local_path,
    artifact_path,
    flavor,
    registered_model_name=None,
    await_registration_for=DEFAULT_AWAIT_MAX_SLEEP_SECONDS,
    **kwargs,
):
    with TempDir() as tmp:
        mlflow_model = cls(artifact_path=artifact_path, run_id=run_id)
        flavor.save_model(path=local_path, mlflow_model=mlflow_model, **kwargs)
        mlflow.tracking.fluent.log_artifacts(local_path, mlflow_model.artifact_path)
        try:
            mlflow.tracking.fluent._record_logged_model(mlflow_model)
        except MlflowException:
            # We need to swallow all mlflow exceptions to maintain backwards compatibility with
            # older tracking servers. Only print out a warning for now.
            _logger.warning(_LOG_MODEL_METADATA_WARNING_TEMPLATE, mlflow.get_artifact_uri())
            _logger.debug("", exc_info=True)
        if registered_model_name is not None:
            # run_id = mlflow.tracking.fluent.active_run().info.run_id
            mlflow.register_model(
                "runs:/%s/%s" % (run_id, mlflow_model.artifact_path),
                registered_model_name,
                await_registration_for=await_registration_for,
            )
    return mlflow_model.get_model_info()

I also fixed the type check that raised an error inside mlflow.tracking.fluent._record_logged_model(mlflow_model):

# Adapted from MlflowClient._record_logged_model() in mlflow.tracking.client
def _record_logged_model(self, run_id, mlflow_model):
    from mlflow.models import Model

    # The check must be against mlflow.models.Model (matching the error
    # message), not MlflowClient.
    if not isinstance(mlflow_model, Model):
        raise TypeError(
            "Argument 'mlflow_model' should be of type mlflow.models.Model but was "
            "{}".format(type(mlflow_model))
        )
    self.store.record_logged_model(run_id, mlflow_model)

I tested it, and with the above changes it works.

@WeichenXu123 (Collaborator) commented Nov 25, 2022

@harupy

Can we just add run_id (an optional argument that defaults to None) to mlflow.<flavor>.log_model?

That conflicts a bit with how MlflowClient is used, because an MlflowClient instance might use a tracking_uri and registry_uri that differ from the globally set tracking_uri / registry_uri.

@jaehyeongAN
Your test code uses a classmethod, so it has a similar issue: it cannot use the tracking_uri / registry_uri that are set on a specific MlflowClient instance.

@harupy (Member) commented Nov 28, 2022

@WeichenXu123 Got it.

I think we can add a method MLflowClient.log_model, like:

This approach sounds good to me.

@dbczumar (Collaborator) commented:
Sounds good to me as well!

@WeichenXu123 (Collaborator) commented:
@jaehyeongAN

We have reached consensus to add an API like:

class MlflowClient:
    def log_model(
        self,
        run_id,
        artifact_path,
        flavor,
        registered_model_name=None,
        await_registration_for=DEFAULT_AWAIT_MAX_SLEEP_SECONDS,
        **kwargs,
    ):
        ...

Would you like to contribute this new API?

Thank you!

@jaehyeongAN (Author) commented:
@WeichenXu123 Cool, I want to do that.

@mlflow-automation (Collaborator) commented:
@BenWilson2 @dbczumar @harupy @WeichenXu123 Please assign a maintainer and start triaging this issue.

@WeichenXu123 WeichenXu123 self-assigned this Dec 2, 2022
@rachthree commented:

@jaehyeongAN @WeichenXu123 any updates on this? This would be really nice to have :)

@WeichenXu123 (Collaborator) commented:
@jaehyeongAN I assigned the task to you. :)

@rachthree commented:

@jaehyeongAN @WeichenXu123 I understand when the plate gets full in life and work. I can try to take this on as it's applicable to my work if that's okay with y'all. I just may need some guidance as it would be my first time contributing to MLflow.

On a related note, not only is run_id global but IIRC register_model would use the global tracking and registry URIs. I remember a couple of months ago needing to use mlflow.set_tracking_uri and mlflow.set_registry_uri with MLflow 2.4.0. Is this still the case and something we'd have to take into account when implementing MLflowClient.log_model?
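The global-versus-instance URI distinction can be sketched without MLflow. The code below is a simplified stand-in (an assumption, not MLflow internals): a fluent-style setter mutates process-wide state, while each client instance carries its own URIs, so two clients in one process can target different servers.

```python
# Simplified model (an assumption, not MLflow source) of global vs
# per-instance configuration of tracking/registry URIs.
_global_tracking_uri = "http://default:5000"

def set_tracking_uri(uri):
    # Fluent style: mutates process-wide state, visible to every caller.
    global _global_tracking_uri
    _global_tracking_uri = uri

class Client:
    # Client style: each instance carries its own URIs, so two clients in
    # one process can talk to different tracking/registry servers.
    def __init__(self, tracking_uri=None, registry_uri=None):
        self.tracking_uri = tracking_uri or _global_tracking_uri
        self.registry_uri = registry_uri or self.tracking_uri

set_tracking_uri("http://team-a:5000")
a = Client()                                   # inherits the global URI
b = Client(tracking_uri="http://team-b:5000")  # overrides it per instance
print(a.tracking_uri, b.tracking_uri)
```

This is why a client-side log_model would need to resolve URIs from the instance rather than calling the fluent helpers internally.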

@rachthree rachthree linked a pull request May 4, 2024 that will close this issue
@github-actions github-actions bot added the has-closing-pr This issue has a closing PR label May 4, 2024
@rachthree commented:

I had totally forgotten about this issue until recently. Since there has been no activity, I went ahead and created a WIP PR: #11906

@rachthree commented:

@WeichenXu123 following up on the above WIP PR that may warrant discussion before I proceed further. Thanks!
