Experiments to accelerate PyTorch training on GPU devices (Jupyter Notebook, updated Dec 15, 2021)
A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation for Efficient Hardware Acceleration on Edge Devices
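The dynamic fixed-point idea above can be illustrated with a minimal NumPy sketch (the bit-allocation rule and all names here are illustrative, not the quantizer's actual scheme): pick enough integer bits to cover the tensor's magnitude, then spend the remaining bits on the fraction.

```python
import numpy as np

def quantize_dynamic_fixed_point(x, total_bits=8):
    """Toy dynamic fixed-point quantizer: the fractional bit count adapts
    to the tensor's magnitude (illustrative only)."""
    int_bits = max(0, int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-12))) + 1)  # +1 sign bit
    frac_bits = total_bits - int_bits
    scale = 2.0 ** frac_bits
    q = np.clip(np.round(x * scale), -2 ** (total_bits - 1), 2 ** (total_bits - 1) - 1)
    return q / scale, frac_bits

x = np.array([0.11, -0.52, 0.73, -0.98])
xq, frac_bits = quantize_dynamic_fixed_point(x)
# Rounding error is bounded by half a least-significant bit, 2**-(frac_bits+1).
print(frac_bits, np.max(np.abs(x - xq)))
```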
Deep learning solution for Cassava Leaf Disease Classification, a Kaggle Research Code Competition, using TensorFlow.
You Only Look Once: Unified, Real-Time Object Detection
PyTorch RNet implementation with Distributed and Mixed-Precision training support.
Hybrid-Precision Analysis on CG Solver (H.A.C.S.): merging single and double precision to produce a fast yet accurate conjugate-gradient solver.
Fast SGEMM emulation on Tensor Cores
This repository contains notebooks showing how to perform mixed precision training in tf.keras 2.0
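Whatever the framework, the core pattern behind mixed-precision training is computing in float16 while keeping an fp32 "master" copy of the weights, because small updates round away at fp16 resolution. A framework-free NumPy sketch of that failure mode (the step size and iteration count are arbitrary):

```python
import numpy as np

update = np.float32(1e-4)     # a small per-step weight update (arbitrary value)
w_fp16 = np.float16(1.0)      # weight stored only in half precision
w_master = np.float32(1.0)    # fp32 "master" weight, as mixed precision keeps

for _ in range(1000):
    # fp16 spacing near 1.0 is 2**-10 ≈ 9.8e-4, so a 1e-4 update rounds away
    w_fp16 = np.float16(w_fp16 + np.float16(update))
    w_master = np.float32(w_master + update)

print(float(w_fp16), float(w_master))  # fp16 weight never moves; fp32 reaches ≈1.1
```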
This is the open-source version of HPL-MXP. The code's performance has been verified on Frontier.
Let's train CIFAR-10 in PyTorch with half precision!
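Half-precision training usually needs loss scaling, because gradients smaller than fp16's smallest subnormal (about 6e-8) underflow to zero. A NumPy illustration of why scaling up before the cast and unscaling in fp32 helps (the gradient value and scale factor are made up):

```python
import numpy as np

grad = 1e-8          # a tiny late-training gradient (made-up value)
scale = 1024.0       # loss-scale factor, a power of two (made-up choice)

naive = np.float16(grad)                             # underflows to 0.0 in fp16
scaled = np.float16(grad * scale)                    # survives as an fp16 subnormal
recovered = np.float32(scaled) / np.float32(scale)   # unscale in fp32

print(float(naive), float(recovered))  # 0.0 vs ≈1e-8
```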
Extremely simple and understandable GPT2 implementation with minor tweaks
PyCon SG 2019 Tutorial: Optimizing TensorFlow Performance
An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
🎯 Accumulated Gradients for TensorFlow 2
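Gradient accumulation rests on the fact that, for a mean loss over equally sized micro-batches, averaging the micro-batch gradients reproduces the full-batch gradient. A framework-free NumPy check on a one-parameter least-squares model (all names and values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=64)
y = 3.0 * x + rng.normal(scale=0.1, size=64)
w = 0.5

def grad(w, xb, yb):
    # dL/dw for the mean-squared-error loss L = mean((w*x - y)**2)
    return np.mean(2.0 * (w * xb - yb) * xb)

full = grad(w, x, y)          # gradient on the whole batch

acc = 0.0                     # accumulate over 4 equally sized micro-batches
for xb, yb in zip(np.split(x, 4), np.split(y, 4)):
    acc += grad(w, xb, yb)
acc /= 4.0                    # average to match the full-batch mean

print(np.isclose(full, acc))  # True
```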
Code repository for the book *Deep Learning with Python, Second Edition* (Korean edition: 케라스 창시자에게 배우는 딥러닝 2판).
Pretrained-model inference and fine-tuning based on TensorFlow 1.x, with support for single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Flexible training, validation, and prediction.