chore: documentation for release
Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
narendasan committed Feb 14, 2024
1 parent b50d12f commit 4c3d026
Showing 340 changed files with 148,758 additions and 10 deletions.
Empty file added docs/v2.2.0/.nojekyll
Empty file.
942 changes: 942 additions & 0 deletions docs/v2.2.0/_cpp_api/classtorch__tensorrt_1_1DataType.html

Large diffs are not rendered by default.

879 changes: 879 additions & 0 deletions docs/v2.2.0/_cpp_api/classtorch__tensorrt_1_1Device_1_1DeviceType.html

Large diffs are not rendered by default.

911 changes: 911 additions & 0 deletions docs/v2.2.0/_cpp_api/classtorch__tensorrt_1_1TensorFormat.html

Large diffs are not rendered by default.

753 changes: 753 additions & 0 deletions docs/v2.2.0/_cpp_api/dir_cpp.html

Large diffs are not rendered by default.

754 changes: 754 additions & 0 deletions docs/v2.2.0/_cpp_api/dir_cpp_include.html

Large diffs are not rendered by default.

757 changes: 757 additions & 0 deletions docs/v2.2.0/_cpp_api/dir_cpp_include_torch_tensorrt.html

Large diffs are not rendered by default.

809 changes: 809 additions & 0 deletions docs/v2.2.0/_cpp_api/file_cpp_include_torch_tensorrt_logging.h.html

Large diffs are not rendered by default.

795 changes: 795 additions & 0 deletions docs/v2.2.0/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.html

Large diffs are not rendered by default.

806 changes: 806 additions & 0 deletions docs/v2.2.0/_cpp_api/file_cpp_include_torch_tensorrt_ptq.h.html

Large diffs are not rendered by default.

756 changes: 756 additions & 0 deletions docs/v2.2.0/_cpp_api/namespace_torch.html

Large diffs are not rendered by default.

804 changes: 804 additions & 0 deletions docs/v2.2.0/_cpp_api/namespace_torch_tensorrt.html

Large diffs are not rendered by default.

785 changes: 785 additions & 0 deletions docs/v2.2.0/_cpp_api/namespace_torch_tensorrt__logging.html

Large diffs are not rendered by default.

781 changes: 781 additions & 0 deletions docs/v2.2.0/_cpp_api/namespace_torch_tensorrt__ptq.html

Large diffs are not rendered by default.

782 changes: 782 additions & 0 deletions docs/v2.2.0/_cpp_api/namespace_torch_tensorrt__torchscript.html

Large diffs are not rendered by default.

927 changes: 927 additions & 0 deletions docs/v2.2.0/_cpp_api/structtorch__tensorrt_1_1Device.html

Large diffs are not rendered by default.

785 changes: 785 additions & 0 deletions docs/v2.2.0/_cpp_api/structtorch__tensorrt_1_1GraphInputs.html

Large diffs are not rendered by default.

1,106 changes: 1,106 additions & 0 deletions docs/v2.2.0/_cpp_api/structtorch__tensorrt_1_1Input.html

Large diffs are not rendered by default.

1,152 changes: 1,152 additions & 0 deletions docs/v2.2.0/_cpp_api/torch_tensort_cpp.html

Large diffs are not rendered by default.

842 changes: 842 additions & 0 deletions docs/v2.2.0/_cpp_api/unabridged_orphan.html

Large diffs are not rendered by default.

@@ -0,0 +1,158 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n\n# Compiling ResNet using the Torch-TensorRT `torch.compile` Backend\n\nThis interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a ResNet model.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports and Model Definition\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import torch\nimport torch_tensorrt\nimport torchvision.models as models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Initialize model with half precision and sample inputs\nmodel = models.resnet18(pretrained=True).half().eval().to(\"cuda\")\ninputs = [torch.randn((1, 3, 224, 224)).to(\"cuda\").half()]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optional Input Arguments to `torch_tensorrt.compile`\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Enabled precision for TensorRT optimization\nenabled_precisions = {torch.half}\n\n# Whether to print verbose logs\ndebug = True\n\n# Workspace size for TensorRT, in bytes (20 << 30 is 20 GiB)\nworkspace_size = 20 << 30\n\n# Minimum number of operators per TensorRT engine block\n# (Lower value allows more graph segmentation)\nmin_block_size = 7\n\n# Operations to Run in Torch, regardless of converter support\ntorch_executed_ops = set()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Compilation with `torch_tensorrt.compile`\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Build and compile the model with torch.compile, using Torch-TensorRT backend\noptimized_model = torch_tensorrt.compile(\n    model,\n    ir=\"torch_compile\",\n    inputs=inputs,\n    enabled_precisions=enabled_precisions,\n    debug=debug,\n    workspace_size=workspace_size,\n    min_block_size=min_block_size,\n    torch_executed_ops=torch_executed_ops,\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Equivalently, we could have run the above via the `torch.compile` frontend, like so:\n`optimized_model = torch.compile(model, backend=\"torch_tensorrt\", options={\"enabled_precisions\": enabled_precisions, ...}); optimized_model(*inputs)`\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inference\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Does not cause recompilation (same batch size as input)\nnew_inputs = [torch.randn((1, 3, 224, 224)).half().to(\"cuda\")]\nnew_outputs = optimized_model(*new_inputs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Does cause recompilation (new batch size)\nnew_batch_size_inputs = [torch.randn((8, 3, 224, 224)).half().to(\"cuda\")]\nnew_batch_size_outputs = optimized_model(*new_batch_size_inputs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cleanup\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Finally, we use Torch utilities to clean up the workspace\ntorch._dynamo.reset()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## CUDA Driver Error Note\n\nOccasionally, upon exiting the Python runtime after Dynamo compilation with `torch_tensorrt`,\none may encounter a CUDA driver error. This issue is related to https://github.com/NVIDIA/TensorRT/issues/2052\nand can be resolved by wrapping the compilation/inference in a function and using a scoped call, as in::\n\n    if __name__ == '__main__':\n        compile_engine_and_infer()\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
@@ -0,0 +1,107 @@
"""
.. _torch_compile_advanced_usage:

Torch Compile Advanced Usage
======================================================

This interactive script is intended as an overview of the process by which `torch_tensorrt.compile(..., ir="torch_compile", ...)` works, and how it integrates with the `torch.compile` API."""

# %%
# Imports and Model Definition
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

import torch
import torch_tensorrt

# %%


# We begin by defining a model
class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.relu = torch.nn.ReLU()

    def forward(self, x: torch.Tensor, y: torch.Tensor):
        x_out = self.relu(x)
        y_out = self.relu(y)
        x_y_out = x_out + y_out
        return torch.mean(x_y_out)


# %%
# Compilation with `torch.compile` Using Default Settings
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

# Define sample float inputs and initialize model
sample_inputs = [torch.rand((5, 7)).cuda(), torch.rand((5, 7)).cuda()]
model = Model().eval().cuda()

# %%

# Next, we compile the model using torch.compile
# For the default settings, we can simply call torch.compile
# with the backend "torch_tensorrt", and run the model on an
# input to cause compilation, as so:
optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False)
optimized_model(*sample_inputs)
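
# A minimal sketch, not part of the original example: one way to sanity-check
# the compiled module is to run the eager model on the same inputs and compare
# the results. The tolerance values below are assumptions chosen for
# illustration; TensorRT kernels may differ slightly from PyTorch's.
with torch.no_grad():
    eager_output = model(*sample_inputs)
    compiled_output = optimized_model(*sample_inputs)

outputs_match = torch.allclose(eager_output, compiled_output, rtol=1e-3, atol=1e-3)
print(f"Compiled output matches eager output: {outputs_match}")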

# %%
# Compilation with `torch.compile` Using Custom Settings
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

# First, we use Torch utilities to clean up the workspace
# after the previous compile invocation
torch._dynamo.reset()

# Define sample half inputs and initialize model
sample_inputs_half = [
    torch.rand((5, 7)).half().cuda(),
    torch.rand((5, 7)).half().cuda(),
]
model_half = Model().eval().cuda()

# %%

# If we want to customize certain options in the backend,
# but still use the torch.compile call directly, we can provide
# custom options to the backend via the "options" keyword
# which takes in a dictionary mapping options to values.
#
# For accepted backend options, see the CompilationSettings dataclass:
# py/torch_tensorrt/dynamo/_settings.py
backend_kwargs = {
    "enabled_precisions": {torch.half},
    "debug": True,
    "min_block_size": 2,
    "torch_executed_ops": {"torch.ops.aten.sub.Tensor"},
    "optimization_level": 4,
    "use_python_runtime": False,
}
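
# A minimal sketch, not part of the original example: because the options are
# passed as a plain dictionary, a misspelled key is easy to miss. Assuming the
# accepted options correspond to the fields of the CompilationSettings
# dataclass referenced above, the keys can be checked before compiling.
# `_ExampleSettings` is a hypothetical stand-in (a few field names taken from
# this script, with placeholder defaults) so that the check runs on its own;
# with Torch-TensorRT installed, the real CompilationSettings class could be
# passed instead.
import dataclasses


@dataclasses.dataclass
class _ExampleSettings:
    debug: bool = False
    min_block_size: int = 1
    optimization_level: int = 0
    use_python_runtime: bool = False


def unknown_option_keys(options, settings_cls):
    """Return the option keys that are not fields of the settings dataclass."""
    accepted = {field.name for field in dataclasses.fields(settings_cls)}
    return sorted(set(options) - accepted)


# "min_blok_size" is a deliberate typo used to demonstrate the check
example_options = {"debug": True, "min_blok_size": 2}
print(unknown_option_keys(example_options, _ExampleSettings))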

# Run the model on an input to cause compilation, as so:
optimized_model_custom = torch.compile(
    model_half,
    backend="torch_tensorrt",
    options=backend_kwargs,
    dynamic=False,
)
optimized_model_custom(*sample_inputs_half)

# %%
# Cleanup
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

# Finally, we use Torch utilities to clean up the workspace
torch._dynamo.reset()

# %%
# CUDA Driver Error Note
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Occasionally, upon exiting the Python runtime after Dynamo compilation with `torch_tensorrt`,
# one may encounter a CUDA driver error. This issue is related to https://github.com/NVIDIA/TensorRT/issues/2052
# and can be resolved by wrapping the compilation/inference in a function and using a scoped call, as in::
#
#     if __name__ == '__main__':
#         compile_engine_and_infer()
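#
# A fuller sketch of that scoped call follows. The function body is an assumption for
# illustration that reuses the ``Model`` class and default settings from this script;
# it is not taken from the Torch-TensorRT sources::
#
#     def compile_engine_and_infer():
#         # Everything created here goes out of scope when the function returns
#         model = Model().eval().cuda()
#         sample_inputs = [torch.rand((5, 7)).cuda(), torch.rand((5, 7)).cuda()]
#         optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False)
#         optimized_model(*sample_inputs)
#         torch._dynamo.reset()
#
#     if __name__ == '__main__':
#         compile_engine_and_infer()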
@@ -0,0 +1,55 @@
"""
.. _torch_compile_stable_diffusion:

Torch Compile Stable Diffusion
======================================================

This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a Stable Diffusion model. A sample output is featured below:

.. image:: /tutorials/images/majestic_castle.png
   :width: 512px
   :height: 512px
   :scale: 50 %
   :align: right
"""

# %%
# Imports and Model Definition
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

import torch
from diffusers import DiffusionPipeline

import torch_tensorrt

model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda:0"

# Instantiate Stable Diffusion Pipeline with FP16 weights
pipe = DiffusionPipeline.from_pretrained(
    model_id, revision="fp16", torch_dtype=torch.float16
)
pipe = pipe.to(device)

backend = "torch_tensorrt"

# Optimize the UNet portion with Torch-TensorRT
pipe.unet = torch.compile(
    pipe.unet,
    backend=backend,
    options={
        "truncate_long_and_double": True,
        "precision": torch.float16,
    },
    dynamic=False,
)

# %%
# Inference
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

prompt = "a majestic castle in the clouds"
image = pipe(prompt).images[0]

image.save("images/majestic_castle.png")
image.show()
