Hybrid-Precision Analysis on CG Solver (H.A.C.S): merging single and double precision to generate a fast yet accurate CG solver. (C++, updated May 29, 2020)
Fast SGEMM emulation on Tensor Cores
Experiments to accelerate GPU device for PyTorch training
PyTorch RNet implementation with Distributed and Mixed-Precision training support.
A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation for Efficient Hardware Acceleration on Edge Devices
You Only Look Once: Unified, Real-Time Object Detection
Deep learning solution for Cassava Leaf Disease Classification, a Kaggle Research Code Competition, using TensorFlow.
This repository contains notebooks showing how to perform mixed-precision training in tf.keras 2.0
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
PyCon SG 2019 Tutorial: Optimizing TensorFlow Performance
Extremely simple and understandable GPT2 implementation with minor tweaks
This is the open source version of HPL-MXP. The code performance has been verified on Frontier
An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3
Let's train CIFAR-10 in PyTorch with half precision!
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices
🎯 Accumulated Gradients for TensorFlow 2
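Gradient accumulation, which tools like the one above provide for TensorFlow, simulates a larger effective batch by summing gradients over several micro-batches before applying a single weight update. A minimal framework-agnostic sketch of the idea in NumPy (the toy model and all variable names are illustrative, not part of any listed library):

```python
import numpy as np

# Toy linear least-squares model trained with SGD, accumulating
# gradients over `accum_steps` micro-batches before each update.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
lr = 0.1
accum_steps = 4
grad_buffer = np.zeros_like(w)

for step in range(len(X)):
    xi, yi = X[step : step + 1], y[step : step + 1]
    grad = 2 * xi.T @ (xi @ w - yi)   # gradient of the squared error
    grad_buffer += grad.ravel()       # accumulate instead of updating
    if (step + 1) % accum_steps == 0:
        w -= lr * grad_buffer / accum_steps  # apply averaged gradient
        grad_buffer[:] = 0.0                 # reset for the next window
```

The memory cost is one extra gradient-sized buffer; the effective batch size becomes `accum_steps` times the micro-batch size.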
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
Pretrained-model wrapper based on TensorFlow 1.x, supporting single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision. Flexible training, validation, and prediction.
High Resolution Style Transfer in PyTorch with Color Control and Mixed Precision 🎨
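Several entries above (H.A.C.S, HPL-MXP, HPL-AI, the Ozaki-scheme GEMM) rest on the same idea: do the expensive work in low precision, then recover accuracy with residual corrections computed in high precision (mixed-precision iterative refinement). A minimal NumPy sketch of that pattern, assuming a well-conditioned system and using `np.linalg.solve` as a stand-in for a real low-precision factorization:

```python
import numpy as np

# Build a well-conditioned test system A x = b in float64.
rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)  # diagonally dominant
x_true = rng.standard_normal(n)
b = A @ x_true

# Initial solve entirely in float32: the fast, low-precision step.
A32 = A.astype(np.float32)
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

# Iterative refinement: residual in float64, correction solve in float32.
for _ in range(3):
    r = b - A @ x                                    # high-precision residual
    d = np.linalg.solve(A32, r.astype(np.float32))   # low-precision correction
    x = x + d.astype(np.float64)
```

For a well-conditioned matrix, a few refinement sweeps drive the error from single-precision level down to near double-precision accuracy while the O(n³) factorization work stays in the cheap format.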