fix quantize file types #4502
Conversation
server/images.go
```go
		return errors.New("quantization failed")
	}

	intermediateBlobs.Store(baseLayer.Layer.Digest, layers[0].Layer.Digest)
```
I'm assuming this depends on #4330
Yes this will upstream it to #4330
```go
	}

	intermediateBlobs.Store(baseLayer.Layer.Digest, layers[0].Layer.Digest)
	baseLayer.Layer = layers[0].Layer
```
What's the reason for `layers` being a slice here? To accommodate multimodal models? I'm having a bit of trouble following all the different layers being passed around.
That's one reason. The other is to maintain a common interface between creating this model from a file and creating a model from another model.
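The `intermediateBlobs.Store(...)` call in the diff above records a mapping from the digest of the input layer to the digest of the quantized layer it produced. A minimal sketch of that caching pattern, assuming `intermediateBlobs` is a `sync.Map` (the surrounding helper and digests here are illustrative, not ollama's actual implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// intermediateBlobs caches input-layer digest -> quantized-layer digest,
// so a later create with the same base layer can reuse the already
// quantized blob instead of re-running quantization.
var intermediateBlobs sync.Map

// quantizedDigest is a hypothetical lookup helper: it returns the cached
// quantized digest for baseDigest, if one was stored.
func quantizedDigest(baseDigest string) (string, bool) {
	v, ok := intermediateBlobs.Load(baseDigest)
	if !ok {
		return "", false
	}
	return v.(string), true
}

func main() {
	base := "sha256:aaa"  // digest of the unquantized (e.g. f16) layer
	quant := "sha256:bbb" // digest of the layer produced by quantization

	// After a successful quantization, record the mapping.
	intermediateBlobs.Store(base, quant)

	if d, ok := quantizedDigest(base); ok {
		fmt.Println(d) // prints the cached quantized digest
	}
}
```

Because `sync.Map` is safe for concurrent use, multiple create requests can consult and populate the cache without extra locking.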
this fixes the reported file type (quantization): previously it would report f16 or f32 based on the input file despite going through quantization
this change contains some changes suggested by @pdevine in #4330
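The behavior the description fixes can be sketched as follows: report the requested quantization type when quantization actually ran, rather than echoing the input file's type. The function and its names are hypothetical, for illustration only:

```go
package main

import "fmt"

// reportedFileType sketches the fix: if a quantization type was requested
// and differs from the input file's type, report the quantized type;
// otherwise fall back to the input type. Illustrative only, not ollama's code.
func reportedFileType(inputType, quantizeType string) string {
	if quantizeType != "" && quantizeType != inputType {
		return quantizeType
	}
	return inputType
}

func main() {
	// With the fix: a quantized model reports its quantization type.
	fmt.Println(reportedFileType("F16", "Q4_K_M"))
	// Without a quantize request, the input type is still reported.
	fmt.Println(reportedFileType("F16", ""))
}
```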