docs: init io-descriptors example (#4646)
* docs: init io-descriptors example rebase to main
* examples: update io-descriptor
* ci: auto fixes from pre-commit.ci

For more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent
9661e04
commit 020552d
Showing 4 changed files with 251 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
# BentoML Input/Output Types Tutorial | ||
|
||
BentoML supports a wide range of data types when creating a Service API. The data types can be categorized as follows: | ||
- Standard Python types: `str`, `int`, `float`, `list`, `dict`, etc. | ||
- Pydantic field types: see [Pydantic types documentation](https://field-idempotency--pydantic-docs.netlify.app/usage/types/). | ||
- ML-specific types: `numpy.ndarray`, `torch.Tensor`, and `tf.Tensor` for tensor data, `pd.DataFrame` for tabular data, `PIL.Image.Image` for | ||
image data, and `pathlib.Path` for files such as audio, images, and PDFs. | ||
|
||
When creating a BentoML Service, you should use Python's type annotations to define the expected input and output types for each API endpoint. This | ||
step not only helps validate the data against the specified schema, but also enhances the clarity and readability of your code. Type annotations play | ||
an important role in generating the BentoML API, client, and Service UI components, ensuring a consistent and predictable interaction with the Service. | ||
|
||
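As a rough illustration of why annotations matter, the sketch below reads a function's annotations with the standard library and uses them to check arguments. It is not how BentoML validates requests internally; `check_arguments` is a hypothetical helper written only for this example, and it handles plain classes such as `float` and `str`, not generic types.

```python
import typing as t


def add(num1: float, num2: float) -> float:
    """Same signature as a simple addition endpoint, without the Service class."""
    return num1 + num2


def check_arguments(func: t.Callable[..., t.Any], **kwargs: t.Any) -> None:
    """Raise TypeError if a keyword argument does not match its annotation.

    Hypothetical helper for illustration only.
    """
    hints = t.get_type_hints(func)
    for name, value in kwargs.items():
        expected = hints[name]
        # Accept ints where floats are expected, since JSON numbers often arrive as ints.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            continue
        if not isinstance(value, expected):
            raise TypeError(
                f"{name} should be {expected.__name__}, got {type(value).__name__}"
            )


# The annotations are available at runtime as a plain dictionary.
print(t.get_type_hints(add))

check_arguments(add, num1=1.5, num2=2)  # passes silently
try:
    check_arguments(add, num1="1.5", num2=2)
except TypeError as exc:
    print(exc)
```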
You can also use `pydantic.Field` to set additional information about service parameters, such as default values and descriptions. This improves the API's | ||
usability and provides basic documentation. | ||
|
||
In this tutorial, you will learn how to set different input and output types for BentoML Services. | ||
|
||
## Installing Dependencies | ||
|
||
Let's start with the environment. We recommend using a virtual environment for better package handling. | ||
|
||
```bash | ||
python -m venv io-descriptors-example | ||
source io-descriptors-example/bin/activate | ||
pip install -r requirements.txt | ||
``` | ||
|
||
## Running a Service | ||
Seven different API Services are implemented in `service.py`, with diverse input/output types. To run one, specify the class name of the Service | ||
you'd like to run in `bentofile.yaml`. | ||
|
||
```yaml | ||
service: "service.py:AudioSpeedUp" | ||
include: | ||
- "service.py" | ||
``` | ||
|
||
In the above `bentofile.yaml` configuration, we're running the `AudioSpeedUp` Service, which is defined in `service.py`. To run a different | ||
Service, simply replace `AudioSpeedUp` with the class name of that Service. | ||
|
||
For example, if you want to run the first Service `ImageResize`, you can configure the `bentofile.yaml` as follows: | ||
|
||
```yaml | ||
service: "service.py:ImageResize" | ||
include: | ||
- "service.py" | ||
``` | ||
|
||
After you finish configuring `bentofile.yaml`, run `bentoml serve .` to deploy the Service locally. You can then interact with the auto-generated Swagger UI to try | ||
out each of the API endpoints. | ||
|
||
## Different data types | ||
|
||
### Standard Python types | ||
|
||
The following demonstrates a simple addition Service, with both inputs and the output as float parameters. You can | ||
change the type annotations to `int`, `str`, etc. to get familiar with the interaction between type | ||
annotations and the auto-generated Swagger UI when deploying locally. | ||
|
||
```python | ||
@bentoml.service() | ||
class AdditionService: | ||
|
||
    @bentoml.api() | ||
    def add(self, num1: float, num2: float) -> float: | ||
        return num1 + num2 | ||
``` | ||
|
||
### Files | ||
|
||
Files are handled through `pathlib.Path` in BentoML (which means you should handle the file as a file path in your API implementation as well as on the client side). | ||
Most file types can be specified through `bentoml.validators.ContentType(<file_type>)`. The argument follows the standard content type | ||
format used in requests (such as `text/plain`, `application/pdf`, `audio/mp3`, etc.). | ||
|
||
##### Appending Strings to File example | ||
```python | ||
@bentoml.service() | ||
class AppendStringToFile: | ||
|
||
    @bentoml.api() | ||
    def append_string_to_eof( | ||
        self, | ||
        txt_file: t.Annotated[Path, bentoml.validators.ContentType("text/plain")], input_string: str | ||
    ) -> t.Annotated[Path, bentoml.validators.ContentType("text/plain")]: | ||
        with open(txt_file, "a") as file: | ||
            file.write(input_string) | ||
        return txt_file | ||
``` | ||
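To try the append logic without starting a Service, the following sketch runs the same steps on a temporary file using only the standard library. The standalone function, the `sample.txt` file name, and the sample strings are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def append_string_to_eof(txt_file: Path, input_string: str) -> Path:
    """Append a string to the end of a text file and return the same path."""
    with open(txt_file, "a") as file:
        file.write(input_string)
    return txt_file


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file standing in for an uploaded text file.
    sample = Path(tmp_dir) / "sample.txt"
    sample.write_text("hello")
    result = append_string_to_eof(sample, " world")
    content = result.read_text()

print(content)  # hello world
```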
|
||
Within `service.py`, example API Services are implemented for four different file types (audio, image, text, and PDF). The functionality of each Service | ||
is simple and self-explanatory. | ||
|
||
Notice that for class `ImageResize`, two different API endpoints are implemented. This is because BentoML supports image parameters directly through | ||
`PIL.Image.Image`, which means that clients can pass image objects directly instead of file objects. | ||
|
||
The last two Services are examples of having `numpy.ndarray` or `pandas.DataFrame` as input parameters. Since they work much like the examples above, | ||
we do not explain them in detail in this tutorial. You can try revising a Service to use `torch.Tensor` as input to check your understanding. | ||
|
||
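For the DataFrame case, `CountRowsDF` in `service.py` declares `orient="records"` with the columns `dummy1` and `dummy2`. In pandas terminology, the records orientation represents a table as a list of row dictionaries. The sketch below builds such a payload with the standard library only. The sample values are made up, and keying the body by the parameter name (`input`) is an assumption modeled on how JSON request bodies name their parameters, not a confirmed wire format.

```python
import json

columns = ["dummy1", "dummy2"]
# Made-up sample rows, one inner list per table row.
rows = [[1, 2], [3, 4], [5, 6]]

# "records" orientation: one dictionary per row, keyed by column name.
records = [dict(zip(columns, row)) for row in rows]

# Assumption: the request body is keyed by the endpoint's parameter name.
body = json.dumps({"input": records})
print(body)
print(len(records))  # 3
```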
To serve these examples locally, run `bentoml serve .`: | ||
|
||
```bash | ||
$ bentoml serve . | ||
|
||
2024-03-22T19:25:24+0000 [INFO] [cli] Starting production HTTP BentoServer from "service:ImageResize" listening on http://localhost:3000 (Press CTRL+C to quit) | ||
``` | ||
|
||
Open your web browser at http://localhost:3000 to view the Swagger UI for sending test requests. | ||
|
||
You can also send requests with `curl` or any other HTTP client, for example: | ||
|
||
```bash | ||
curl -X 'POST' \ | ||
'http://localhost:3000/transpose' \ | ||
-H 'accept: application/json' \ | ||
-H 'Content-Type: application/json' \ | ||
-d '{ | ||
"tensor": [ | ||
[0, 1, 2, 3], | ||
[4, 5, 6, 7] | ||
] | ||
}' | ||
``` | ||
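The same request can be prepared from Python with only the standard library. This is a sketch that mirrors the `curl` call above; the `send` helper is a hypothetical name, and the final call is left commented out because it needs the `TransposeTensor` Service running locally.

```python
import json
import urllib.request

payload = {"tensor": [[0, 1, 2, 3], [4, 5, 6, 7]]}

# Build the POST request with the same URL, headers, and body as the curl call.
request = urllib.request.Request(
    "http://localhost:3000/transpose",
    data=json.dumps(payload).encode("utf-8"),
    headers={"accept": "application/json", "Content-Type": "application/json"},
    method="POST",
)


def send(req: urllib.request.Request) -> object:
    """Send the request and decode the JSON response body (hypothetical helper)."""
    with urllib.request.urlopen(req) as response:
        return json.loads(response.read().decode("utf-8"))


# Requires the Service to be running on localhost:3000:
# print(send(request))
```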
|
||
## Deploy to BentoCloud | ||
Run the following command to deploy this example to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you haven't got a BentoCloud account. | ||
```bash | ||
bentoml deploy . | ||
``` | ||
For more information, see [Create Deployments](https://docs.bentoml.com/en/latest/bentocloud/how-tos/create-deployments.html). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
service: "service.py:AudioSpeedUp" | ||
include: | ||
- "service.py" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
diffusers | ||
bentoml | ||
transformers | ||
torch | ||
accelerate | ||
pydub | ||
pdf2image | ||
pandas |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
import typing as t | ||
from pathlib import Path | ||
|
||
import numpy as np | ||
import pandas as pd | ||
import torch | ||
from PIL import Image as im | ||
from PIL.Image import Image | ||
from pydantic import Field | ||
|
||
import bentoml | ||
from bentoml.validators import DataframeSchema | ||
from bentoml.validators import DType | ||
|
||
|
||
@bentoml.service() | ||
class ImageResize: | ||
    @bentoml.api() | ||
    def generate(self, image: Image, height: int = 64, width: int = 64) -> Image: | ||
        size = width, height  # PIL expects (width, height) | ||
        return image.resize(size, im.LANCZOS) | ||
|
||
    @bentoml.api() | ||
    def generate_with_path( | ||
        self, | ||
        image: t.Annotated[Path, bentoml.validators.ContentType("image/jpeg")], | ||
        height: int = 64, | ||
        width: int = 64, | ||
    ) -> Image: | ||
        size = width, height  # PIL expects (width, height) | ||
        image = im.open(image) | ||
        return image.resize(size, im.LANCZOS) | ||
|
||
|
||
@bentoml.service() | ||
class AdditionService: | ||
    @bentoml.api() | ||
    def add(self, num1: float, num2: float) -> float: | ||
        return num1 + num2 | ||
|
||
|
||
@bentoml.service() | ||
class AppendStringToFile: | ||
    @bentoml.api() | ||
    def append_string_to_eof( | ||
        self, | ||
        context: bentoml.Context, | ||
        txt_file: t.Annotated[Path, bentoml.validators.ContentType("text/plain")], | ||
        input_string: str, | ||
    ) -> t.Annotated[Path, bentoml.validators.ContentType("text/plain")]: | ||
        with open(txt_file, "a") as file:  # append to the uploaded file in place | ||
            file.write(input_string) | ||
        return txt_file | ||
|
||
|
||
@bentoml.service() | ||
class PDFtoImage: | ||
    @bentoml.api() | ||
    def pdf_first_page_as_image( | ||
        self, | ||
        pdf: t.Annotated[Path, bentoml.validators.ContentType("application/pdf")], | ||
    ) -> Image: | ||
        from pdf2image import convert_from_path | ||
|
||
        pages = convert_from_path(pdf) | ||
        return pages[0].resize(pages[0].size, im.LANCZOS)  # ANTIALIAS was removed in Pillow 10 | ||
|
||
|
||
@bentoml.service() | ||
class AudioSpeedUp: | ||
    @bentoml.api() | ||
    def speed_up_audio( | ||
        self, | ||
        context: bentoml.Context, | ||
        audio: t.Annotated[Path, bentoml.validators.ContentType("audio/mpeg")], | ||
        velocity: float, | ||
    ) -> t.Annotated[Path, bentoml.validators.ContentType("audio/mp3")]: | ||
        import os | ||
|
||
        from pydub import AudioSegment | ||
|
||
        output_path = os.path.join(context.temp_dir, "output.mp3") | ||
        sound = AudioSegment.from_file(audio) | ||
        sound = sound.speedup(velocity) | ||
        sound.export(output_path, format="mp3") | ||
        return Path(output_path) | ||
|
||
|
||
@bentoml.service() | ||
class TransposeTensor: | ||
    @bentoml.api() | ||
    def transpose( | ||
        self, | ||
        tensor: t.Annotated[torch.Tensor, DType("float32")] = Field( | ||
            description="A 2x4 tensor with float32 dtype" | ||
        ), | ||
    ) -> np.ndarray: | ||
        return torch.transpose(tensor, 0, 1).numpy() | ||
|
||
|
||
@bentoml.service() | ||
class CountRowsDF: | ||
    @bentoml.api() | ||
    def count_rows( | ||
        self, | ||
        input: t.Annotated[ | ||
            pd.DataFrame, | ||
            DataframeSchema(orient="records", columns=["dummy1", "dummy2"]), | ||
        ], | ||
    ) -> int: | ||
        return len(input) |