
fix quantize file types #4502

Merged: 2 commits merged May 20, 2024

Conversation

mxyng (Contributor) commented on May 17, 2024

This fixes the reported file type (quantization): previously it would report f16 or f32 based on the input file, even after the model went through quantization.

This change incorporates some changes suggested by @pdevine in #4330.
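The bug described above can be sketched as follows. This is an illustrative Go sketch, not the actual ollama code: the `ModelInfo` struct and `quantize` function are hypothetical names standing in for the real logic in server/images.go. The point is that the reported file type should come from the quantization target, not be echoed from the input file's header.

```go
package main

import "fmt"

// ModelInfo is a hypothetical stand-in for the model metadata tracked
// by the server; the real code uses different types.
type ModelInfo struct {
	InputFileType string // e.g. "F16", read from the source model file
	FileType      string // the file type reported to the user
}

// quantize converts the model to the target quantization and, per this
// fix, reports the target as the file type instead of echoing the
// input's type.
func quantize(m *ModelInfo, target string) {
	// ... actual tensor quantization would happen here ...
	m.FileType = target // previously the input's type (f16/f32) leaked through
}

func main() {
	m := &ModelInfo{InputFileType: "F16"}
	quantize(m, "Q4_0")
	fmt.Println(m.FileType) // reports the quantized type, not F16
}
```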

@mxyng force-pushed the mxyng/fix-quantize branch 2 times, most recently from 882041e to 6a6d762 on May 18, 2024
@mxyng changed the base branch from main to mxyng/cache-intermediate-layers on May 20, 2024
@mxyng force-pushed the mxyng/cache-intermediate-layers branch from 0aba2d5 to 3520c0e on May 20, 2024
server/images.go Outdated
return errors.New("quantization failed")
}

intermediateBlobs.Store(baseLayer.Layer.Digest, layers[0].Layer.Digest)
Contributor

I'm assuming this depends on #4330

Contributor Author

Yes, this will upstream it to #4330.

}

intermediateBlobs.Store(baseLayer.Layer.Digest, layers[0].Layer.Digest)
baseLayer.Layer = layers[0].Layer
Contributor

What's the reason for layers being a slice here? To accommodate multimodal models? I'm having a bit of trouble following all the different layers being passed around.

Contributor Author

That's one reason. The other is to maintain a common interface between a model created from a file and a model created from another model.
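The `intermediateBlobs.Store(...)` call in the diff above caches the digest of a quantized layer under the digest of its base layer, so a later request can reuse the already-quantized blob. A minimal sketch of that idea, assuming a `sync.Map` keyed by digest strings (the real cache in server/images.go may be structured differently):

```go
package main

import (
	"fmt"
	"sync"
)

// intermediateBlobs maps the digest of an original (f16/f32) base layer
// to the digest of its quantized result, so repeated quantize requests
// can skip re-quantizing. Sketch only; names are illustrative.
var intermediateBlobs sync.Map

// cacheQuantized records that baseDigest was quantized into quantDigest.
func cacheQuantized(baseDigest, quantDigest string) {
	intermediateBlobs.Store(baseDigest, quantDigest)
}

// lookupQuantized returns the cached quantized digest for a base layer,
// if one has been stored.
func lookupQuantized(baseDigest string) (string, bool) {
	v, ok := intermediateBlobs.Load(baseDigest)
	if !ok {
		return "", false
	}
	return v.(string), true
}

func main() {
	cacheQuantized("sha256:f16-base", "sha256:q4-quantized")
	if d, ok := lookupQuantized("sha256:f16-base"); ok {
		fmt.Println(d) // prints the cached quantized digest
	}
}
```

A `sync.Map` suits this write-once, read-many pattern; a plain map with a mutex would work equally well for a sketch like this.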

Base automatically changed from mxyng/cache-intermediate-layers to main May 20, 2024 20:54
@mxyng merged commit 2f81b3d into main on May 20, 2024
12 checks passed
@mxyng deleted the mxyng/fix-quantize branch on May 20, 2024
Labels: None yet
Projects: None yet
4 participants