Hi All -
I've been trying to reproduce GeoHot's Mixtral streaming experiment on my M2 (thank you GeoHot! Keep the streams coming.) and ran into some issues with coder.py and mixtral.py after commit 0fd4425:
program_source:15:80: error: as_type cast from '__bf16' to 'unsigned short' is not allowed
*(data0+(gidx0*32768)+(gidx1*64)+(lidx2*4096)+(lidx3*4)+3) = ((unsigned int)(as_type<ushort>(val3))*(unsigned int)(65536));
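For context, my understanding of what the generated kernel is attempting (treat the specifics as my assumption): bfloat16 is just the top 16 bits of a float32, so the multiply-by-65536 in the failing line is a left-shift by 16 that widens the bf16 bit pattern into a float32 one. In Python:

```python
# Sketch of the bf16 -> float32 bit trick (illustrative only).
import struct

def bf16_bits_to_float(bits16: int) -> float:
    # place the 16 bf16 bits in the top half of a 32-bit word,
    # then reinterpret those bits as a float32
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]

print(bf16_bits_to_float(0x3F80))  # 1.0
print(bf16_bits_to_float(0x4049))  # 3.140625, bf16's approximation of pi
```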
I understand that my M2 doesn't support bfloat16 (except in Beta), that there's an open issue to address this, and that the commit above removes the hack that dealt with bfloat16. But if I revert lines 75 through 81 of state.py from that commit, examples/coder.py works again and Quentin LIVES! I was wondering whether the fix_bf16 function was intended to address this issue in coder.py. Happy to try to post a fix for this. I would add an argument to torch_load that lets callers pass in a cast function, with the default being something like
ret.cpu().reshape(intermediate_shape).permute(permute_indexes)
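Roughly what I have in mind, as a hedged sketch (the names default_cast, bf16_upcast, and load_tensor are mine, not tinygrad's actual torch_load API; torch stands in as the tensor library for the demo):

```python
from typing import Callable
import torch

def default_cast(ret, intermediate_shape, permute_indexes):
    # current behavior: move to CPU, restore the stored shape, undo the permutation
    return ret.cpu().reshape(intermediate_shape).permute(permute_indexes)

def bf16_upcast(ret, intermediate_shape, permute_indexes):
    # possible override for devices without bfloat16 kernels:
    # widen to float32 before the layout fix
    return ret.cpu().float().reshape(intermediate_shape).permute(permute_indexes)

def load_tensor(raw, intermediate_shape, permute_indexes,
                cast_fn: Callable = default_cast):
    # stand-in for the point inside torch_load where the cast would happen
    return cast_fn(raw, intermediate_shape, permute_indexes)

raw = torch.arange(24).to(torch.bfloat16)
out = load_tensor(raw, (4, 6), (1, 0))                 # default path
out32 = load_tensor(raw, (4, 6), (1, 0), bf16_upcast)  # upcasting path
print(out.shape, out.dtype)      # torch.Size([6, 4]) torch.bfloat16
print(out32.shape, out32.dtype)  # torch.Size([6, 4]) torch.float32
```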
Maybe there's a hotfix for this somewhere already?
Let me know your thoughts.
Patrick