feat: Dynamic Service creation #4498

Open · wants to merge 29 commits into main
8b167aa
Added add_api functionalty with example | draft
holzweber Jan 30, 2024
a28891d
working but unclean version of dynamic services
holzweber Feb 10, 2024
8299fd5
run pre-commit hook on changed files
holzweber Feb 10, 2024
fda3a5f
Missing checks
holzweber Feb 10, 2024
b346b6c
fix(sdk): current directory for built bentos (#4505)
bojiang Feb 19, 2024
1c52c7b
chore(cloud cli): rename cluster to region (#4508)
bojiang Feb 19, 2024
259383d
doc: Add the lcm lora use case doc (#4510)
Sherlock113 Feb 20, 2024
ad0d485
fix(sdk): clean bentoml version (#4511)
bojiang Feb 20, 2024
6c2ac38
fix: bug: Dataframes not serializing correctly in the new API (#4491)
frostming Feb 20, 2024
89adfb2
docs: Update the get started docs (#4513)
Sherlock113 Feb 20, 2024
da08bfa
fix(sdk): incorrect bento_path if not provided (#4514)
bojiang Feb 20, 2024
71c483f
docs: Add client code examples without context manager (#4512)
Sherlock113 Feb 21, 2024
2ed9108
docs: Update docs (#4515)
Sherlock113 Feb 21, 2024
b1557bf
docs: Add authorization docs (#4517)
Sherlock113 Feb 21, 2024
504ff63
docs: Change sample input to one line (#4518)
Sherlock113 Feb 21, 2024
b64ce64
docs: Update ControlNet use case docs (#4519)
Sherlock113 Feb 22, 2024
ed91f8a
docs: Update the distributed services and get started docs (#4521)
Sherlock113 Feb 22, 2024
7b0b0e6
refactor(cli): make CLI commands available as modules (#4487)
frostming Feb 23, 2024
69b8a29
docs: Refactor BentoCloud docs (#4525)
Sherlock113 Feb 23, 2024
b7169c7
Added add_api functionalty with example | draft
holzweber Jan 30, 2024
cc86bc5
working but unclean version of dynamic services
holzweber Feb 10, 2024
6fc21a1
run pre-commit hook on changed files
holzweber Feb 10, 2024
cc20f20
Missing checks
holzweber Feb 10, 2024
7f8a01f
Merge branch 'dynamic-service-creation' of https://github.com/holzweb…
holzweber Feb 25, 2024
0ed2ab5
added dynamic service
holzweber Mar 17, 2024
05fa5a1
ci: auto fixes from pre-commit.ci
pre-commit-ci[bot] Mar 17, 2024
3bc355d
added services with type()
holzweber Mar 18, 2024
c2fbcfc
fix merge issues
holzweber Mar 18, 2024
9a42ea0
ci: auto fixes from pre-commit.ci
pre-commit-ci[bot] Mar 18, 2024
44 changes: 44 additions & 0 deletions examples/dynamic_service/pipeline/README.md
@@ -0,0 +1,44 @@
# BentoML Sklearn Example: document classification pipeline

0. Install dependencies:

```bash
pip install -r ./requirements.txt
```

1. Train a document classification pipeline model

```bash
python ./train.py
```

2. Run the service:

```bash
bentoml serve service.py:svc
```

3. Send test request

Test the `/predict` endpoint:
```bash
curl -X POST -H "content-type: application/text" --data "hello world" http://127.0.0.1:3000/predict_model_0
```

Test the `/predict_proba` endpoint:
```bash
curl -X POST -H "content-type: application/text" --data "hello world" http://127.0.0.1:3000/predict_proba_model_0
```


4. Build the Bento

```bash
bentoml build
```

5. Build a Docker image

```bash
bentoml containerize doc_classifier:latest
```
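For scripting, the same requests can be made from Python instead of curl. This is a minimal sketch assuming the service is running locally on port 3000; the `classify` helper and the default route name are illustrative and simply mirror the curl calls above:

```python
from urllib.request import Request, urlopen


def classify(text: str, route: str = "predict_model_0") -> bytes:
    """POST raw text to one of the dynamically created endpoints."""
    req = Request(
        f"http://127.0.0.1:3000/{route}",
        data=text.encode(),
        headers={"content-type": "application/text"},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.read()


if __name__ == "__main__":
    print(classify("hello world"))
```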
6 changes: 6 additions & 0 deletions examples/dynamic_service/pipeline/bentofile.yaml
@@ -0,0 +1,6 @@
service: "service.py:svc"
include:
- "service.py"
- "requirements.txt"
python:
requirements_txt: "./requirements.txt"
80 changes: 80 additions & 0 deletions examples/dynamic_service/pipeline/service.py
@@ -0,0 +1,80 @@
from typing import Any

import bentoml
from bentoml import Runner
bojiang (Member):

Hi holzweber. We are deprecating the Runner API in 1.2, but I see the value of the dynamic service example. Would you like to update this once we finish the 1.2 docs?

bojiang (Member):

Thus, may I mark this as merge-hold?

holzweber (Contributor, Author):

@bojiang I just checked the new documentation, but I don't know whether my idea will still work, as I cannot access the service object with the new service-annotation approach. I am afraid we may really need to generate a Python file for dynamic services, which is a rather ugly solution...

Any other idea how to get this working with 1.2?

aarnphm (Member):

Dynamic services in 1.2 will still work, as OpenLLM has to do something similar.

You can probably hook directly into the service object, since bentoml.Service will return the new service object, which treats every runnable as a normal Python class.

I think one huge difference here is that the lifecycle is just a class, so it is probably a lot simpler.

aarnphm (Member):

Will send more examples once I finish the OpenLLM revamp 😄

holzweber (Contributor, Author), Mar 17, 2024:

@aarnphm ... Any more updates on this? I was trying to do something like the following, but it does not work: only the last service's endpoints end up in the service afterwards (based on what I see in the OpenAPI spec)...

def wrap_service_methods(model: bentoml.Model,
                         targets: Any,
                         predict_route: str,
                         predict_name: str,
                         predict_proba_route: str,
                         predict_proba_name: str,
                         ):
    """Wrap models in service methods and annotate as api."""
    @bentoml.api(route=predict_route, name=predict_name)
    async def predict(input_doc: str):
        predictions = await model.predict.async_run([input_doc])
        return {"result": targets[predictions[0]]}

    @bentoml.api(route=predict_proba_route, name=predict_proba_name)
    async def predict_proba(input_doc: str):
        predictions = await model.predict_proba.async_run([input_doc])
        return predictions[0]


    return predict, predict_proba


@bentoml.service(
workers=1, resources={"cpu": "1"}
)
class DynamicService:
    def __init__(self):
        """Nothing to do here."""
        pass
    for idx, available_model in enumerate(bentoml.models.list()):
        if "twenty_news_group" in available_model.tag.name:
            print(f"Creating Endpoint {idx}")
            bento_model = bentoml.sklearn.get(f"{available_model.tag.name}:latest")
            print(bento_model)
            target_names = bento_model.custom_objects["target_names"]
            path_predict = f"predict_model_{idx}"
            path_predict_proba = f"predict_proba_model_{idx}"
            predict, predict_proba = wrap_service_methods(bento_model,
                                                          target_names,
                                                          predict_route=path_predict,
                                                          predict_name=path_predict,
                                                          predict_proba_route=path_predict_proba,
                                                          predict_proba_name=path_predict_proba,
                                                          )

holzweber (Contributor, Author), Mar 17, 2024:

Edit: I added the methods via the locals() function. Can you check and resolve the conversation if this is okay?
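The locals() idea mentioned here can be illustrated with a standalone sketch (illustrative names only, not the PR's actual code). Note this relies on the fact that, in CPython, a class body's locals() is the real class namespace, unlike a function body's:

```python
def make_method(name):
    # Hypothetical factory: each generated method just reports its own name.
    def method(self):
        return name
    return method


class Dynamic:
    # In a CPython class body, writing into locals() adds real class attributes,
    # so each injected entry becomes a bound method of the class.
    for _name in ("predict_model_0", "predict_model_1"):
        locals()[_name] = make_method(_name)
    del _name  # keep the loop variable out of the class namespace


d = Dynamic()
print(d.predict_model_0())  # predict_model_0
```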

aarnphm (Member):

You can probably use types.new_class here, or even type(), to construct the subclass.

holzweber (Contributor, Author):

True, I tried it with type() and it seems to work; updated in my latest push, so it is visible now :)
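The type()-based approach suggested above can be sketched in plain Python. The endpoint bodies, names, and model count here are illustrative assumptions; the actual PR wires the generated methods up through BentoML's decorators:

```python
import asyncio


def make_predict(route: str):
    # Hypothetical endpoint body; each closure captures its own route.
    async def predict(self, input_doc: str) -> dict:
        return {"route": route, "doc": input_doc}
    return predict


# Build the class namespace dynamically: one method per model index.
namespace = {
    f"predict_model_{idx}": make_predict(f"predict_model_{idx}") for idx in range(3)
}
DynamicService = type("DynamicService", (), namespace)

svc = DynamicService()
print(asyncio.run(svc.predict_model_1("hello world")))
# {'route': 'predict_model_1', 'doc': 'hello world'}
```

Unlike a loop inside a class body, each entry in the namespace dict is a distinct attribute, so later definitions cannot overwrite earlier ones.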

from bentoml.io import JSON
from bentoml.io import Text

"""The following example is based on the sklearn/pipeline example.

The concept revolves around dynamically constructing service endpoints:

Imagine you have n models ready for production.
When creating your Bento, you may not know in advance which models will be served.
Therefore, you create an endpoint for every available model that can be deployed.

Scenario: You have trained hundreds of models. While some are still in the training
pipeline, you want to begin serving the first finished models in production.

Building a Bento requires a predefined service.py file, but the number of endpoints is
unknown when that file is written. The goal is to reuse the same file for every new
Bento without repeatedly editing the service definitions; ideally, each model gets its
own route with a unique running index."""


def wrap_service_methods(runner: Runner, targets: Any):
    """Pass the Runner and target names, as they are needed in both methods.

    Note: only the passed arguments are available inside the methods below;
    the enclosing loop's scope is not captured.
    """

    async def predict(input_doc: str):
        predictions = await runner.predict.async_run([input_doc])
        return {"result": targets[predictions[0]]}

    async def predict_proba(input_doc: str):
        predictions = await runner.predict_proba.async_run([input_doc])
        return predictions[0]

    return predict, predict_proba


available_model_set = set()
# Add all unique variations of twenty_news_group to the service
for available_model in bentoml.models.list():
    if "twenty_news_group" in available_model.tag.name:
        available_model_set.add(available_model.tag.name)

model_runner_list: list[Runner] = []
target_names: list = []

for available_model in available_model_set:
    bento_model = bentoml.sklearn.get(f"{available_model}:latest")
    target_names.append(bento_model.custom_objects["target_names"])
    model_runner_list.append(bento_model.to_runner())

svc = bentoml.Service("doc_classifier", runners=model_runner_list)

for idx, (model_runner, target_name) in enumerate(zip(model_runner_list, target_names)):
    path_predict = f"predict_model_{idx}"
    path_predict_proba = f"predict_proba_model_{idx}"
    fn_pred, fn_pred_proba = wrap_service_methods(runner=model_runner, targets=target_name)

    svc.add_api(
        input=Text(),
        output=JSON(),
        user_defined_callback=fn_pred,
        name=path_predict,
        doc=None,
        route=path_predict,
    )
    svc.add_api(
        input=Text(),
        output=JSON(),
        user_defined_callback=fn_pred_proba,
        name=path_predict_proba,
        doc=None,
        route=path_predict_proba,
    )
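The wrap_service_methods helper above exists because closures defined directly in a loop capture the loop variable by reference, so every endpoint would see the last runner. Routing the definitions through a function gives each endpoint its own binding. A general Python illustration of the pitfall (not part of the diff):

```python
# Closures capture variables, not values: all three lambdas share the final i.
naive = [lambda: i for i in range(3)]


# A factory function freezes the current value in its own scope.
def make_fn(i):
    return lambda: i


wrapped = [make_fn(i) for i in range(3)]

print([f() for f in naive])    # [2, 2, 2]
print([f() for f in wrapped])  # [0, 1, 2]
```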
118 changes: 118 additions & 0 deletions examples/dynamic_service/pipeline/train.py
@@ -0,0 +1,118 @@
import logging
from pprint import pprint
from time import time

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

import bentoml

# Display progress logs on stdout
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

# Load some categories from the training set
categories = [
    "alt.atheism",
    "talk.religion.misc",
]

# Uncomment the following to do the analysis on all the categories
# categories = None

print("Loading 20 newsgroups dataset for categories:")
print(categories)

data = fetch_20newsgroups(subset="train", categories=categories)
print("%d documents" % len(data.filenames))
print("%d categories" % len(data.target_names))
print()

# Define a pipeline combining a text feature extractor with a simple classifier
pipeline = Pipeline(
    [
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", SGDClassifier(loss="log_loss")),
    ]
)

# Parameters to use for grid search. Uncommenting more parameters will give
# better exploring power but will increase processing time in a combinatorial
# way
parameters = {
    "vect__max_df": (0.5, 0.75, 1.0),
    # 'vect__max_features': (None, 5000, 10000, 50000),
    "vect__ngram_range": ((1, 1), (1, 2)),  # unigrams or bigrams
    # 'tfidf__use_idf': (True, False),
    # 'tfidf__norm': ('l1', 'l2'),
    "clf__max_iter": (20,),
    "clf__alpha": (0.00001, 0.000001),
    "clf__penalty": ("l2", "elasticnet"),
    # 'clf__max_iter': (10, 50, 80),
}

# Find the best parameters for both the feature extraction and the
# classifier
grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1, verbose=1)

print("Performing grid search...")
print("pipeline:", [name for name, _ in pipeline.steps])
print("parameters:")
pprint(parameters)
t0 = time()
grid_search.fit(data.data, data.target)
print("done in %0.3fs" % (time() - t0))
print()

print("Best score: %0.3f" % grid_search.best_score_)
best_parameters = grid_search.best_estimator_.get_params()
best_parameters = {
    param_name: best_parameters[param_name] for param_name in sorted(parameters.keys())
}
print(f"Best parameters set: {best_parameters}")

bento_model = bentoml.sklearn.save_model(
    "twenty_news_group",
    grid_search.best_estimator_,
    signatures={
        "predict": {"batchable": True, "batch_dim": 0},
        "predict_proba": {"batchable": True, "batch_dim": 0},
    },
    custom_objects={
        "target_names": data.target_names,
    },
    metadata=best_parameters,
)
print(f"Model saved: {bento_model}")

# Test running inference with BentoML runner
test_runner = bentoml.sklearn.get("twenty_news_group:latest").to_runner()
test_runner.init_local()
assert test_runner.predict.run(["hello"]) == grid_search.best_estimator_.predict(
    ["hello"]
)

bento_model = bentoml.sklearn.save_model(
    "twenty_news_group_second",
    grid_search.best_estimator_,
    signatures={
        "predict": {"batchable": True, "batch_dim": 0},
        "predict_proba": {"batchable": True, "batch_dim": 0},
    },
    custom_objects={
        "target_names": data.target_names,
    },
    metadata=best_parameters,
)
print(f"Model saved: {bento_model}")

# Test running inference with BentoML runner
test_runner = bentoml.sklearn.get("twenty_news_group_second:latest").to_runner()
test_runner.init_local()
assert test_runner.predict.run(["hello"]) == grid_search.best_estimator_.predict(
    ["hello"]
)