docs: init io-descriptors example #4646

Merged
merged 3 commits into from Apr 29, 2024
120 changes: 120 additions & 0 deletions examples/io-descriptors/README.md
@@ -0,0 +1,120 @@
# BentoML Input/Output Types Tutorial

A BentoML service supports a wide range of data types when creating a Service API. The data types can be categorized as follows:
- Standard Python types: `str`, `int`, `float`, `list`, `dict`, etc.
- Pydantic field types: see the [Pydantic types documentation](https://field-idempotency--pydantic-docs.netlify.app/usage/types/).
- ML-specific types: `numpy.ndarray`, `torch.Tensor`, and `tf.Tensor` for tensor data, `pd.DataFrame` for tabular data, `PIL.Image.Image` for image data, and `pathlib.Path` for files such as audio, images, and PDFs.
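
For instance, an API endpoint can take a Pydantic model directly as an input parameter. The sketch below is illustrative and not part of this example's `service.py`; the `InputQuery` model and its fields are assumed names:

```python
from pydantic import BaseModel

import bentoml


class InputQuery(BaseModel):
    name: str
    threshold: float = 0.5


@bentoml.service()
class PydanticExample:
    @bentoml.api()
    def describe(self, query: InputQuery) -> str:
        # The request body is validated against the InputQuery schema
        return f"{query.name} (threshold={query.threshold})"
```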

When creating a BentoML service, you should use Python's type annotations to define the expected input and output types for each API endpoint. This not only helps validate the data against the specified schema, but also enhances the clarity and readability of your code. Type annotations play an important role in generating the BentoML API, client, and service UI components, ensuring a consistent and predictable interaction with the service.

You can also use `pydantic.Field` to set additional information about service parameters, such as default values and descriptions. This improves the API's usability and provides basic documentation. See the examples below for details.
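
For example, here is a minimal sketch of a service that uses `pydantic.Field` to attach a default value and a description to each parameter (the `TruncateText` service below is illustrative, not part of this example's `service.py`):

```python
from pydantic import Field

import bentoml


@bentoml.service()
class TruncateText:
    @bentoml.api()
    def truncate(
        self,
        text: str = Field(description="Text to truncate"),
        max_length: int = Field(default=100, description="Maximum number of characters to keep"),
    ) -> str:
        # Field metadata shows up in the generated API schema and Swagger UI
        return text[:max_length]
```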

## Running this example

Let's start with the environment. We recommend using a virtual environment for better package handling.

```bash
python -m venv io-descriptors-example
source io-descriptors-example/bin/activate
pip install -r requirements.txt
```

Seven different API services are implemented in this example, each with different input/output types. Before running, specify the class name of the service you'd like to run in `bentofile.yaml`.

```yaml
service: "service.py:AudioSpeedUp"
include:
- "service.py"
```

The configuration above runs the `AudioSpeedUp` service, which you can find on line 62 of `service.py`. To run a different service, simply replace `AudioSpeedUp` with the class name of that service.

For example, if you want to run the first service `ImageResize`, you can configure the `bentofile.yaml` as follows:

```yaml
service: "service.py:ImageResize"
include:
- "service.py"
```

After you finish configuring `bentofile.yaml`, run `bentoml serve .` to deploy the service locally. You can then interact with the auto-generated Swagger UI to try out each API endpoint.

## Different data types

### Standard Python types

The following demonstrates a simple addition service, with both inputs and the output typed as `float`. You can change the type annotation to `int`, `str`, etc. to get familiar with the interaction between type annotations and the auto-generated Swagger UI when deploying locally.

```python
@bentoml.service()
class AdditionService:

@bentoml.api()
def add(self, num1: float, num2: float) -> float:
return num1 + num2
```
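
Once this service is being served (with `service: "service.py:AdditionService"` configured in `bentofile.yaml`), you can call the endpoint from Python with a BentoML client. A minimal sketch, assuming the server is running on the default port:

```python
import bentoml

# Assumes AdditionService is being served at http://localhost:3000
with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    result = client.add(num1=1.5, num2=2.5)
    print(result)  # 4.0
```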

### Files

Files are handled through `pathlib.Path` in BentoML, which means you should handle the file as a file path in your API implementation as well as on the client side.
Most file types can be specified through `bentoml.validators.ContentType(<file_type>)`. The argument of this validator follows the standard MIME type format (such as `text/plain`, `application/pdf`, `audio/mp3`, etc.).

#### Appending strings to a file
```python
@bentoml.service()
class AppendStringToFile:

@bentoml.api()
def append_string_to_eof(
self,
txt_file: t.Annotated[Path, bentoml.validators.ContentType("text/plain")], input_string: str
) -> t.Annotated[Path, bentoml.validators.ContentType("text/plain")]:
with open(txt_file, "a") as file:
file.write(input_string)
return txt_file
```
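
On the client side, file parameters are likewise passed as `pathlib.Path` objects. A minimal sketch, assuming `AppendStringToFile` is the service being served and that a local file `notes.txt` (an assumed name) exists:

```python
from pathlib import Path

import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    # The file is uploaded from the given path; the client exposes the
    # returned text file as a local path as well
    result = client.append_string_to_eof(
        txt_file=Path("notes.txt"),
        input_string="Hello, BentoML!",
    )
    print(result)
```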

Within `service.py`, example API services with 4 different file types are implemented (audio, image, plain text, and PDF). The functionality of each service is simple and self-explanatory.

Notice that the `ImageResize` class implements two different API endpoints. This is because BentoML supports image parameters directly through `PIL.Image.Image`, which means that clients can pass image objects directly instead of file paths.
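
For example, here is a minimal client sketch for the two endpoints, assuming `ImageResize` is the service being served and a local `example.jpg` (an assumed file name) exists:

```python
from pathlib import Path

from PIL import Image

import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    # `generate` accepts a PIL image object directly
    resized = client.generate(image=Image.open("example.jpg"), height="128", width="128")
    # `generate_with_path` takes a file path instead
    resized_from_path = client.generate_with_path(image=Path("example.jpg"))
    print(resized, resized_from_path)
```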

The last two services are examples of taking `numpy.ndarray` or `pandas.DataFrame` as input parameters. Since they work quite similarly to the examples above, we will not explain them in detail in this tutorial. As an exercise, you can try revising the service to take a `torch.Tensor` as input to check your understanding.
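
For reference, here is a minimal client sketch for the DataFrame endpoint, assuming `CountRowsDF` is the service being served (the column values below are illustrative):

```python
import pandas as pd

import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    df = pd.DataFrame({"dummy1": [1, 2, 3], "dummy2": [4, 5, 6]})
    print(client.count_rows(input=df))  # 3
```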

To serve these examples locally, run `bentoml serve .`:

```bash
bentoml serve .
2024-03-22T19:25:24+0000 [INFO] [cli] Starting production HTTP BentoServer from "service:ImageResize" listening on http://localhost:3000 (Press CTRL+C to quit)
```

Open your web browser at http://0.0.0.0:3000 to view the Swagger UI for sending test requests.

You may also send requests with the `curl` command or any HTTP client. For example, to call the `add` endpoint of `AdditionService` (when it is the service being served):

```bash
curl -X 'POST' \
  'http://0.0.0.0:3000/add' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"num1": 1.5, "num2": 2.5}'
```

## Deploy to BentoCloud
Run the following command to deploy this example to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you haven't got a BentoCloud account.
```bash
bentoml deploy .
```
For more information, see [Create Deployments](https://docs.bentoml.com/en/latest/bentocloud/how-tos/create-deployments.html).
3 changes: 3 additions & 0 deletions examples/io-descriptors/bentofile.yaml
@@ -0,0 +1,3 @@
service: "service.py:AudioSpeedUp"
include:
- "service.py"
7 changes: 7 additions & 0 deletions examples/io-descriptors/requirements.txt
@@ -0,0 +1,7 @@
diffusers
bentoml
transformers
torch
accelerate
pydub
pdf2image
pillow
pandas
108 changes: 108 additions & 0 deletions examples/io-descriptors/service.py
@@ -0,0 +1,108 @@
import typing as t
from pathlib import Path

import numpy as np
import pandas as pd
import torch
from PIL import Image as im
from PIL.Image import Image
from pydantic import Field

import bentoml
from bentoml.validators import DataframeSchema
from bentoml.validators import DType



@bentoml.service()
class ImageResize:
@bentoml.api()
def generate(self, image: Image, height: str = "64", width: str = "64") -> Image:
size = int(height), int(width)
return image.resize(size, im.ANTIALIAS)

@bentoml.api()
def generate_with_path(
self,
image: t.Annotated[Path, bentoml.validators.ContentType("image/jpeg")],
height: str = "64",
width: str = "64",
) -> Image:
size = int(height), int(width)
image = im.open(image)
        return image.resize(size, im.LANCZOS)


@bentoml.service()
class AdditionService:
@bentoml.api()
def add(self, num1: float, num2: float) -> float:
return num1 + num2


@bentoml.service()
class AppendStringToFile:
@bentoml.api()
def append_string_to_eof(
self,
txt_file: t.Annotated[Path, bentoml.validators.ContentType("text/plain")],
input_string: str,
) -> t.Annotated[Path, bentoml.validators.ContentType("text/plain")]:
with open(txt_file, "a") as file:
file.write(input_string)
return txt_file


@bentoml.service()
class PDFtoImage:
@bentoml.api()
def pdf_first_page_as_image(
self,
pdf: t.Annotated[Path, bentoml.validators.ContentType("application/pdf")],
) -> Image:
from pdf2image import convert_from_path

pages = convert_from_path(pdf)
        return pages[0].resize(pages[0].size, im.LANCZOS)


@bentoml.service()
class AudioSpeedUp:
@bentoml.api()
def speed_up_audio(
self,
audio: t.Annotated[Path, bentoml.validators.ContentType("audio/mpeg")],
velocity: float,
) -> t.Annotated[Path, bentoml.validators.ContentType("audio/mp3")]:
from pydub import AudioSegment

        sound = AudioSegment.from_file(audio)
sound = sound.speedup(velocity)
sound.export("output.mp3", format="mp3")
return Path("output.mp3")


@bentoml.service()
class TransposeTensor:
@bentoml.api()
def transpose(
self,
tensor: t.Annotated[torch.Tensor, DType("float32")] = Field(
description="A 2x4 tensor with float32 dtype"
),
) -> np.ndarray:
return torch.transpose(tensor, 0, 1).numpy()


@bentoml.service()
class CountRowsDF:
@bentoml.api()
def count_rows(
self,
input: t.Annotated[
pd.DataFrame,
DataframeSchema(orient="records", columns=["dummy1", "dummy2"]),
],
) -> int:
return len(input)