efrantar

Organizations

@IST-DASLab

Pinned

  1. IST-DASLab/gptq Public

    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

    Python, 1.8k stars, 141 forks

  2. IST-DASLab/sparsegpt Public

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python, 651 stars, 80 forks

  3. IST-DASLab/marlin Public

    FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

    Python, 388 stars, 30 forks

  4. IST-DASLab/qmoe Public

    Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

    Python, 253 stars, 22 forks

  5. IST-DASLab/OBC Public

    Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".

    Python, 89 stars, 11 forks

  6. rob-twophase Public

    The ultimate Rubik's Cube solving algorithm for high-speed axial robots.

    C++, 115 stars, 9 forks