
1.58-bit algorithm implementation recommendation #46

Open
princepride opened this issue Mar 15, 2024 · 2 comments

Comments

@princepride

def ste(self, x):

I noticed that you're attempting to implement 1.58-bit quantization, but it seems you only quantize the values during the forward pass and then run inference, while the backward pass uses the original values. In 4-bit quantization, two quantized values are stored in one byte, and the computation and gradients for the new data type are implemented in CUDA. You should consider this approach as well. Keep it up, I'm rooting for you.
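For context, a minimal sketch of the two-values-per-byte packing described above (the function names and layout here are illustrative, not this repo's actual API; for ternary weights the analogous trick would pack several trits per byte):

```python
import torch

def pack_int4(q: torch.Tensor) -> torch.Tensor:
    # q holds 4-bit codes in [0, 15]; flattened length must be even.
    q = q.to(torch.uint8).flatten()
    return (q[0::2] << 4) | q[1::2]  # two codes per byte

def unpack_int4(packed: torch.Tensor) -> torch.Tensor:
    # Recover the two 4-bit codes stored in each byte.
    hi = packed >> 4
    lo = packed & 0x0F
    return torch.stack([hi, lo], dim=-1).flatten()
```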

@AwaitFuture

I think what you're saying is that they should use the 1.58-bit quantization for the backward pass too? It's not really discussed in the paper for whatever reason, but 1.58-bit quantization destroys the gradient, making backprop through it impossible, so they keep the original full-precision weights for backprop while using the quantized weights for the forward pass. A sketch of that pattern is below.
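Concretely, a straight-through estimator usually looks something like this (a rough sketch of the common detach trick, not necessarily this repo's exact `ste`):

```python
import torch

def absmean_quantize(w: torch.Tensor) -> torch.Tensor:
    # Ternary {-1, 0, +1} quantization scaled by the mean absolute
    # value (the absmean scheme from the BitNet b1.58 paper).
    scale = w.abs().mean().clamp(min=1e-5)
    return (w / scale).round().clamp(-1, 1) * scale

def ste(w: torch.Tensor) -> torch.Tensor:
    # Straight-through estimator: the forward pass sees the quantized
    # weights, but detach() makes the backward pass treat quantization
    # as the identity, so gradients reach the latent full-precision
    # weights (round() alone has zero gradient almost everywhere).
    return w + (absmean_quantize(w) - w).detach()
```

The optimizer keeps updating the full-precision weights; only the forward computation uses the ternary values.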

