GPT-J 6B inference on TensorRT with INT-8 precision
Updated Apr 5, 2023 - Python
Compressed CNNs for airplane classification in satellite images (APoZ-based parameter pruning, INT8 weight quantization)
Generating a TensorRT model from ONNX
TensorRT INT8 Python sample.
Supports OpenVINO-converted models such as yolov7-int.xml and yolov7x
A record and summary of common problems encountered when deploying on-device models, along with their solutions, in the hope of helping others.
Quantization Aware Training
BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Real-time portrait segmentation for mobile devices
Generate a quantization parameter file for INT8 inference with the ncnn framework