ptq

Here are 14 public repositories matching this topic...

Xilinx / brevitas

Brevitas: neural network quantization in PyTorch

fpga deep-learning pytorch neural-networks xilinx quantization hardware-acceleration qat brevitas ptq

Updated Jun 12, 2024
Python

Bobo-y / flexible-yolov5

More readable and flexible yolov5 with more backbones (gcn, resnet, shufflenet, mobilenet, efficientnet, hrnet, swin-transformer, etc.), extra modules (cbam, dcn and so on), and tensorrt

sparsity backbone pytorch resnet object-detection gcn tensorrt neck qat shufflenet yolov3 cbam hrnet dcnv2 yolov5 moblienet swin-transformer triton-server ptq

Updated May 8, 2024
Python

Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.

machine-learning deep-neural-networks deep-learning neural-network tensorflow optimizer pytorch quantization qat network-quantization network-compression edge-ai ptq

Updated Jun 10, 2024
Python

ModelTC / llmc

This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models", and also an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.

Updated Jun 12, 2024
Python

yester31 / TensorRT_API

Deep learning model optimization using the TensorRT API, on Windows

cuda pytorch vgg resnet quantization unet tensorrt yolov5 detr ptq yolov6

Updated Aug 29, 2022
Python

MAGICS-LAB / OutEffHop

[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

transformer outliers attention attention-mechanism outlier-removal outlier hopfield-neural-network ptq outlier-treatment modern-hopfield-networks modern-hopfield-model icml-2024 softmax-1 quantized-friendly no-op-outlier

Updated Jun 5, 2024
Python

yester31 / Quantization_EX

Quantization example for PTQ & QAT

quantization tensorrt int8 qat model-optimization quantization-aware-training post-training-quantization pytorch-quantization ptq

Updated Jul 24, 2023
Python

yester31 / TensorRT_ONNX

Generating a TensorRT model using ONNX

pytorch quantization tensorrt onnx int8-inference onnxruntime post-training-quantization int8-quantization tensorrt-inference ptq

Updated Jun 22, 2023
C++

BlindOver / blindover_AI

Build AI model to classify beverages for blind individuals

ai deep-learning mobile-app pytorch classification resnet quantization qat shufflenetv2 mobilenetv3 efficientnet ptq

Updated Aug 16, 2023
Python

smpanaro / norm-tweaking

Post post-training-quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784

quantization post-training-quantization ptq llms

Updated Feb 21, 2024
Python

yester31 / TensorRT_Sparse

Inference with structured sparsity and quantization

quantization tensorrt structured-sparsity sparsity-pattern ptq sparse-tensor-cores sparse-int8-model accelerate-the-inference

Updated Aug 30, 2023
Python

OmidGhadami95 / EfficientNetV2_Quantization_CK

EfficientNetV2 (EfficientNetV2-B2) quantization, int8 and fp32 (QAT and PTQ), on the CK+ dataset: fine-tuning, augmentation, handling the imbalanced dataset, etc.