docs: init io-descriptors example (#4646)

* docs: init io-descriptors example rebase to main * examples: update io-descriptor * ci: auto fixes from pre-commit.ci For more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
bentoml · Apr 29, 2024 · 020552d · 020552d
1 parent 9661e04
commit 020552d
Show file tree

Hide file tree

Showing 4 changed files with 251 additions and 0 deletions.
diff --git a/examples/io-descriptors/README.md b/examples/io-descriptors/README.md
@@ -0,0 +1,129 @@
+# BentoML Input/Output Types Tutorial
+
+BentoML supports a wide range of data types when creating a Service API. The data types can be catagorized as follows:
+- Python Standards: `str`, `int`, `float`, `list`, `dict` etc.
+- Pydantic field types: see [Pydantic types documentation](https://field-idempotency--pydantic-docs.netlify.app/usage/types/).
+- ML specific types: `nummpy.ndarray`, `torch.Tensor` , `tf.Tensor` for tensor data, `pd.DataFrame` for tabular data, `PIL.Image.Image` for
+Image data, and `pathlib.Path` for files such as audios, images, and pdfs.
+
+When creating a Bentoml Service, you should use Python's type annotations to define the expected input and output types for each API endpoint. This
+step can not only help validate the data against the specified schema, but also enhances the clarity and readability of your code. Type annotations play
+an important role in generating the BentoML API, client, and Service UI components, ensuring a consitent and predictable interaction with the Service.
+
+You can also use `pydantic.Field` to set additional information about service parameters, such as default values and descriptions. This improves the API's
+usability and provides basic documentation.
+
+In this tutorial, you will learn how to set different input and output types for BentoML Services.
+
+## Installing Dependencies
+
+Let's start with the environment. We recommend using virtual environment for better package handling.
+
+```bash
+python -m venv io-descriptors-example
+source io-descriptors-example/bin/activate
+pip install -r requirements.txt
+```
+
+## Running a Service
+7 different API Services are implemented in `service.py`, with diversed input/output types. When running, you should specified the class name of the Service
+you'd like to run inside `bentofile.yaml`.
+
+```yaml
+service: "service.py:AudioSpeedUp"
+include:
+  - "service.py"
+```
+
+In the above configuration through `bentofile.yaml`, we're running the `AudioSpeedUp` Service, which you can find on line 62 of `service.py`. When running a different
+Service, simply replace `AudioSpeedUp` with the class name of the Service.
+
+For example, if you want to run the first Service `ImageResize`, you can configure the `bentofile.yaml` as follows:
+
+```yaml
+service: "service.py:ImageResize"
+include:
+  - "service.py"
+```
+
+After you finished configuring `bentofile.yaml`, run `bentoml serve .` to deploy the Service locally. You can then interact with the auto-generated swagger UI to play
+around with each different API endpoints.
+
+## Different data types
+
+### Standard Python types
+
+The following demonstrates a simple addtion Service, with both inputs and output as float parameters. You can
+obviously change the type annotation to `int`, `str` etc. to get familiar with the interaction between type
+annotaions and the auto-generated Swagger UI when deploying locally.\
+
+```python
+@bentoml.service()
+class AdditionService:
+
+    @bentoml.api()
+    def add(self, num1: float, num2: float) -> float:
+        return num1 + num2
+```
+
+### Files
+
+Files are handled through `pathlib.Path` in BentoML (which means you should handle the file as a file path in your API implementation as well as on the client side).
+Most file types can be specified through `bentoml.validators.Contentype(<file_type>)`. The input of this function follows the standard of the
+request format (such as `text/plain`, `application/pdf`, `audio/mp3` etc.).
+
+##### Appending Strings to File example
+```python
+@bentoml.service()
+class AppendStringToFile:
+
+    @bentoml.api()
+    def append_string_to_eof(
+        self,
+        txt_file: t.Annotated[Path, bentoml.validators.ContentType("text/plain")], input_string: str
+        ) -> t.Annotated[Path, bentoml.validators.ContentType("text/plain")]:
+        with open(txt_file, "a") as file:
+            file.write(input_string)
+        return txt_file
+```
+
+Within `service.py`, example API Services with 4 different file types are implemented (audio, image, text file, and pdf file). The functionality of each Service
+is quite simple and self-explanatory.
+
+Notice that for class `ImageResize`, two different API endpoints are implemented. This is because BentoML can support images parameters directly through
+`PIL.Image.Image`, which means that image objects can be directly passed through clients, instead of a file object.
+
+The last two Services are examples of having `numpy.ndarray` or `pandas.DataFrame` as input parameters. Since they all work quite similarly with the above examples,
+we will not specifically explain them in this tutorial. You can try to write revise the Service with `torch.Tensor` as input to check your understanding.
+
+To serve the these examples locally, run `bentoml serve .`
+
+```bash
+$ bentoml serve .
+
+2024-03-22T19:25:24+0000 [INFO] [cli] Starting production HTTP BentoServer from "service:ImageResize" listening on http://localhost:3000 (Press CTRL+C to quit)
+```
+
+Open your web browser at http://0.0.0.0:3000 to view the Swagger UI for sending test requests.
+
+You may also send request with `curl` command or any HTTP client, e.g.:
+
+```bash
+curl -X 'POST' \
+  'http://localhost:3000/transpose' \
+  -H 'accept: application/json' \
+  -H 'Content-Type: application/json' \
+  -d '{
+  "tensor": [
+    [0, 1, 2, 3],
+    [4, 5, 6, 7]
+  ]
+}'
+```
+
+## Deploy to BentoCloud
+Run the following command to deploy this example to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you haven't got a BentoCloud account.
+```bash
+bentoml deploy .
+```
+For more information, see [Create Deployments](https://docs.bentoml.com/en/latest/bentocloud/how-tos/create-deployments.html).
diff --git a/examples/io-descriptors/bentofile.yaml b/examples/io-descriptors/bentofile.yaml
@@ -0,0 +1,3 @@
+service: "service.py:AudioSpeedUp"
+include:
+  - "service.py"
diff --git a/examples/io-descriptors/requirements.txt b/examples/io-descriptors/requirements.txt
@@ -0,0 +1,8 @@
+diffusers
+bentoml
+transformers
+torch
+accelerate
+pydub
+pdf2img
+pandas
diff --git a/examples/io-descriptors/service.py b/examples/io-descriptors/service.py
@@ -0,0 +1,111 @@
+import typing as t
+from pathlib import Path
+
+import numpy as np
+import pandas as pd
+import torch
+from PIL import Image as im
+from PIL.Image import Image
+from pydantic import Field
+
+import bentoml
+from bentoml.validators import DataframeSchema
+from bentoml.validators import DType
+
+
+@bentoml.service()
+class ImageResize:
+    @bentoml.api()
+    def generate(self, image: Image, height: int = 64, width: int = 64) -> Image:
+        size = height, width
+        return image.resize(size, im.LANCZOS)
+
+    @bentoml.api()
+    def generate_with_path(
+        self,
+        image: t.Annotated[Path, bentoml.validators.ContentType("image/jpeg")],
+        height: int = 64,
+        width: int = 64,
+    ) -> Image:
+        size = height, width
+        image = im.open(image)
+        return image.resize(size, im.LANCZOS)
+
+
+@bentoml.service()
+class AdditionService:
+    @bentoml.api()
+    def add(self, num1: float, num2: float) -> float:
+        return num1 + num2
+
+
+@bentoml.service()
+class AppendStringToFile:
+    @bentoml.api()
+    def append_string_to_eof(
+        self,
+        context: bentoml.Context,
+        txt_file: t.Annotated[Path, bentoml.validators.ContentType("text/plain")],
+        input_string: str,
+    ) -> t.Annotated[Path, bentoml.validators.ContentType("text/plain")]:
+        with open(output_path, "a") as file:
+            file.write(input_string)
+        return output_path
+
+
+@bentoml.service()
+class PDFtoImage:
+    @bentoml.api()
+    def pdf_first_page_as_image(
+        self,
+        pdf: t.Annotated[Path, bentoml.validators.ContentType("application/pdf")],
+    ) -> Image:
+        from pdf2image import convert_from_path
+
+        pages = convert_from_path(pdf)
+        return pages[0].resize(pages[0].size, im.ANTIALIAS)
+
+
+@bentoml.service()
+class AudioSpeedUp:
+    @bentoml.api()
+    def speed_up_audio(
+        self,
+        context: bentoml.Context,
+        audio: t.Annotated[Path, bentoml.validators.ContentType("audio/mpeg")],
+        velocity: float,
+    ) -> t.Annotated[Path, bentoml.validators.ContentType("audio/mp3")]:
+        import os
+
+        from pydub import AudioSegment
+
+        output_path = os.path.join(context.temp_dir, "output.mp3")
+        sound = AudioSegment.from_file(audio)
+        sound = sound.speedup(velocity)
+        sound.export(output_path, format="mp3")
+        return Path(output_path)
+
+
+@bentoml.service()
+class TransposeTensor:
+    @bentoml.api()
+    def transpose(
+        self,
+        tensor: t.Annotated[torch.Tensor, DType("float32")] = Field(
+            description="A 2x4 tensor with float32 dtype"
+        ),
+    ) -> np.ndarray:
+        return torch.transpose(tensor, 0, 1).numpy()
+
+
+@bentoml.service()
+class CountRowsDF:
+    @bentoml.api()
+    def count_rows(
+        self,
+        input: t.Annotated[
+            pd.DataFrame,
+            DataframeSchema(orient="records", columns=["dummy1", "dummy2"]),
+        ],
+    ) -> int:
+        return len(input)