fix quantize file types #4502
Conversation
server/images.go
```go
		return errors.New("quantization failed")
	}

	intermediateBlobs.Store(baseLayer.Layer.Digest, layers[0].Layer.Digest)
```
I'm assuming this depends on #4330
Yes this will upstream it to #4330
```go
	}

	intermediateBlobs.Store(baseLayer.Layer.Digest, layers[0].Layer.Digest)
	baseLayer.Layer = layers[0].Layer
```
What's the reason for `layers` being a slice here? To accommodate multimodal models? I'm having a bit of trouble following all the different layers being passed around.
That's one reason. The other is to maintain a common interface between creating this model from a file and creating a model from another model.
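The `intermediateBlobs.Store(...)` call in the diff above records a mapping from the digest of the input layer to the digest of the quantized layer it produced. A minimal sketch of that caching pattern, assuming `intermediateBlobs` is a `sync.Map` (the surrounding helper and digests here are illustrative, not ollama's actual implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// intermediateBlobs caches input-layer digest -> quantized-layer digest,
// so a later create with the same base layer can reuse the already
// quantized blob instead of re-running quantization.
var intermediateBlobs sync.Map

// quantizedDigest is a hypothetical lookup helper: it returns the cached
// quantized digest for baseDigest, if one was stored.
func quantizedDigest(baseDigest string) (string, bool) {
	v, ok := intermediateBlobs.Load(baseDigest)
	if !ok {
		return "", false
	}
	return v.(string), true
}

func main() {
	base := "sha256:aaa"  // digest of the unquantized (e.g. f16) layer
	quant := "sha256:bbb" // digest of the layer produced by quantization

	// After a successful quantization, record the mapping.
	intermediateBlobs.Store(base, quant)

	if d, ok := quantizedDigest(base); ok {
		fmt.Println(d) // prints the cached quantized digest
	}
}
```

Because `sync.Map` is safe for concurrent use, multiple create requests can consult and populate the cache without extra locking.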
this fixes the reported file type (quantization): previously it would report f16 or f32 based on the input file despite going through quantization
this change contains some changes suggested by @pdevine in #4330
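The behavior the description fixes can be sketched as follows: report the requested quantization type when quantization actually ran, rather than echoing the input file's type. The function and its names are hypothetical, for illustration only:

```go
package main

import "fmt"

// reportedFileType sketches the fix: if a quantization type was requested
// and differs from the input file's type, report the quantized type;
// otherwise fall back to the input type. Illustrative only, not ollama's code.
func reportedFileType(inputType, quantizeType string) string {
	if quantizeType != "" && quantizeType != inputType {
		return quantizeType
	}
	return inputType
}

func main() {
	// With the fix: a quantized model reports its quantization type.
	fmt.Println(reportedFileType("F16", "Q4_K_M"))
	// Without a quantize request, the input type is still reported.
	fmt.Println(reportedFileType("F16", ""))
}
```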