
[WIP] Implement MlflowClient.log_model #11906

Open · wants to merge 4 commits into master

Conversation

rachthree commented May 4, 2024

🛠 DevTools 🛠

Open in GitHub Codespaces

Install mlflow from this PR

pip install git+https://github.com/mlflow/mlflow.git@refs/pull/11906/merge

Checkout with GitHub CLI

gh pr checkout 11906

Related Issues/PRs

Resolve #7392

What changes are proposed in this pull request?

This adds the log_model method to MlflowClient as requested in the above issue.

I believe this PR is the path of least resistance for providing this API, but this is my first PR to the mlflow repo, so I may have missed a better approach. Tests will be added once the implementation is agreed upon. I hope a cleaner path exists, because this implementation has caveats I'm not happy with:

  • Due to mlflow's default reliance on global tracking and registry URIs, internal functions are updated to accept an optional MlflowClient and to use its tracking URI when one is provided.
    • mlflow.tracking._tracking_service.utils._resolve_tracking_uri is updated to return the client's tracking URI when a client is provided. The precedence is now: an explicitly specified tracking URI, then the client's tracking URI, then the global tracking URI (see the sketch after this list).
    • mlflow.store.artifact.runs_artifact_repo.RunsArtifactRepository's get_underlying_uri is updated to use the client's tracking URI when a client is provided.
    • Model.log calls in flavors use the global URIs by default. To use the client's URIs, the client must be passed as a kwarg.
      • For now, I've only updated the PyTorch, ONNX, and PyFunc flavors. If this approach is the way forward, I will update all flavors.
      • How do we communicate this to those implementing their own custom flavor?
  • Many of the fluent APIs create their own instance of MlflowClient using the global defaults. This PR updates the ones needed to log a model so they use the provided client when applicable.
    • Updating the other fluent APIs to use a provided client may be out of scope for this PR. I can create a follow-up PR if needed, or include it in this PR if preferred.
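
To make the precedence above concrete, here is a minimal sketch of the resolution logic, assuming the client exposes its tracking URI (the attribute access below is illustrative, not the exact implementation):

import mlflow

def _resolve_tracking_uri(tracking_uri=None, client=None):
    # 1. An explicitly specified tracking URI always wins.
    if tracking_uri is not None:
        return tracking_uri
    # 2. Otherwise fall back to the provided client's tracking URI
    #    (attribute name illustrative; the client stores its own URI).
    if client is not None:
        return client.tracking_uri
    # 3. Finally fall back to the global tracking URI.
    return mlflow.get_tracking_uri()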

Possible follow-ups:

  • For this PR, if the approach has consensus: update all flavors to accept a client for log_model.
  • Update all fluent APIs to use a provided client when possible.
  • The natural next API to implement would be MlflowClient.load_model.

How is this PR tested?

  • Existing unit/integration tests - Will test pending discussion
  • New unit/integration tests - Will create pending discussion
  • Manual tests

With PostgreSQL and MLflow servers deployed as Docker containers, I ran:

import mlflow
from mlflow import MlflowClient
from pathlib import Path

from torchvision.models import resnet50

model = resnet50(pretrained=True)
model.cpu()
model.eval()

server_uri = "http://localhost:5000"
experiment_name = "r50"
local_data_dir = str(Path("./local_runs").expanduser().resolve())

# Default client: uses the global tracking/registry URIs (local file store here)
local_client = MlflowClient()
# Client pointed at the remote MLflow server for both tracking and registry
server_client = MlflowClient(tracking_uri=server_uri, registry_uri=server_uri)

print("Registering using local client...")
local_exp_info = local_client.get_experiment_by_name(experiment_name)
if local_exp_info:
    local_exp_id = local_exp_info.experiment_id
else:
    local_exp_id = local_client.create_experiment(experiment_name)
local_run_info = local_client.create_run(local_exp_id)
local_run_id = local_run_info.info.run_id
# Proposed API: log_model(run_id, model, artifact_path, flavor_module, **kwargs)
local_client.log_model(local_run_id, model, "torch-model", mlflow.pytorch, registered_model_name="r50-torch")

print("Registering using server client...")
server_exp_info = server_client.get_experiment_by_name(experiment_name)
if server_exp_info:
    server_exp_id = server_exp_info.experiment_id
else:
    server_exp_id = server_client.create_experiment(experiment_name)
server_run_info = server_client.create_run(server_exp_id)
server_run_id = server_run_info.info.run_id
server_client.log_model(server_run_id, model, "torch-model", mlflow.pytorch, registered_model_name="r50-torch")

Results:
The local client's behavior is unchanged, showing that usage of the default local file store has been maintained.

MLflow UI shows successful logging of the model (note: I ran the above example twice, showing two versions of the model in the server below):

[Screenshots: MLflow UI showing the logged run and the two registered versions of the r50-torch model]

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions
  • Maybe? This adds a new requirement when using Model.log: the client must be passed so that its tracking / registry URIs are used (see the sketch below).
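
As a rough illustration of that requirement, here is a minimal sketch of how a flavor's log_model might forward the client to Model.log under this PR's proposed change (my_flavor, my_model, and the exact kwarg names are illustrative assumptions, not the final API):

from mlflow.models import Model

import my_flavor  # hypothetical custom flavor module implementing save_model()

def log_model(my_model, artifact_path, client=None, registered_model_name=None):
    # Forward the client so Model.log resolves the tracking/registry URIs from it
    # instead of the global defaults (the optional kwarg proposed by this PR).
    return Model.log(
        artifact_path=artifact_path,
        flavor=my_flavor,
        registered_model_name=registered_model_name,
        my_model=my_model,  # forwarded to my_flavor.save_model()
        client=client,
    )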

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

MlflowClient.log_model is added, allowing users to log models through the client instead of setting the global tracking and registry URIs.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

github-actions bot added labels area/artifacts, area/model-registry, area/models, area/tracking, and rn/feature on May 4, 2024

github-actions bot commented May 4, 2024

Documentation preview for 96ce6cc will be available when this CircleCI job
completes successfully.


rachthree (Author) commented:
@WeichenXu123 @harupy @dbczumar tagging you here since you commented on the linked issue... not sure why I can't add you as reviewers. How does this implementation look? Thank you in advance!

Successfully merging this pull request may close the following issue:

[FR] Why is there no 'log_model()' function in mlflow.client? (#7392)