
Modified export.py to add the ability to export fp16 weights. #345

Status: Open. rdentato wants to merge 4 commits into master.

Conversation

@rdentato (Contributor) commented Aug 23, 2023

Added a --version 3 option that works like legacy_export but writes the weights in fp16 format.

Warning: this still needs testing. I have too little memory (64 GB are needed for a 7B-parameter model, and I couldn't find a smaller one).
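For reference, a minimal sketch of what such an fp16 serializer could look like. The helper name and the raw-little-endian-bytes convention are assumptions modeled on export.py's serialize_fp32 style; the actual PR uses torch, but NumPy keeps the sketch self-contained:

```python
import io
import struct
import numpy as np

def serialize_fp16(file, arr):
    # Sketch only: flatten, cast to IEEE 754 half precision, and write
    # the raw little-endian bytes, mirroring a serialize_fp32-style helper.
    d = np.asarray(arr, dtype=np.float16).reshape(-1)
    # struct format code 'e' is a half-precision float (2 bytes each)
    b = struct.pack(f'<{len(d)}e', *d)
    file.write(b)

# usage: write three fp16-exact values (3 values * 2 bytes = 6 bytes)
buf = io.BytesIO()
serialize_fp16(buf, [1.0, -2.0, 0.5])
```

The values chosen in the usage example are exactly representable in fp16, so they round-trip without loss.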

@rdentato changed the title from "Added a versione (3) that is like legacy but exports to fp16." to "Modified export.py to export to fp16." on Aug 23, 2023
@kroggen (Contributor) commented Aug 23, 2023

I am against creating a new version. It should export to the existing versions, while letting the user choose the output format (fp32, fp16, int8...).

The current code still expects freq_cis to be present in the final file.

I also think you mean 64 GB instead of MB. BTW, this OOM problem is tracked in #341.
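The suggestion above (keep the existing version numbers, select precision separately) could look something like this. This is a hedged sketch: the --dtype flag and its choices are hypothetical, not part of the actual export.py:

```python
import argparse

# Hypothetical CLI shape: keep the existing file-format versions and add a
# separate --dtype switch to pick the weight precision within a version.
parser = argparse.ArgumentParser(prog='export.py')
parser.add_argument('filepath', help='the output filepath')
parser.add_argument('--version', type=int, default=0, help='file format version')
parser.add_argument('--dtype', choices=['fp32', 'fp16', 'int8'], default='fp32',
                    help='precision of the exported weights')

# usage: same version as before, different precision
args = parser.parse_args(['model.bin', '--dtype', 'fp16'])
```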

@rdentato (Contributor, Author) commented Aug 23, 2023

It will be the "legacy version" but with fp16 weights.
Because of how export.py works, you need to give it a version number:

usage: export.py [-h] [--version VERSION] (--checkpoint CHECKPOINT | --meta-llama META_LLAMA | --hf HF) filepath

Changing the command-line parameters of export.py could be the topic of another PR, for example having a "name" and a "version" for each model format that can be exported.

@rdentato (Contributor, Author) commented Aug 23, 2023

I just thought of another way, though I'm not sure I like it: use the extension of the output file to determine the fp32/fp16 format.
For example:

python export.py --hf mrm8488/llama-2-coder-7b llama-2-coder-7b.f16

could signal --version 0 (the default) but with weights in fp16 format. The code would be a little uglier, but it might be worth it if it avoids confusion...
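That extension-based convention might be sketched as follows. The helper name and the ".f16" suffix rule are assumptions taken from the example command above:

```python
import os

def dtype_from_extension(filepath):
    # Hypothetical rule: a '.f16' extension selects fp16 weights;
    # everything else falls back to fp32, the current default.
    ext = os.path.splitext(filepath)[1]
    return 'fp16' if ext == '.f16' else 'fp32'

# usage
fmt = dtype_from_extension('llama-2-coder-7b.f16')
```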

@rdentato changed the title from "Modified export.py to export to fp16." to "Modified export.py to add the ability to export fp16 weights." on Aug 23, 2023
@kroggen (Contributor) commented Aug 23, 2023

Why is your serialize_half not named serialize_fp16, to be consistent with the existing serialize_fp32?

Why is your function not placed together with the other two at the beginning of the file?

@kroggen (Contributor) commented Aug 23, 2023

My attempt is at #347

@rdentato (Contributor, Author) commented Aug 24, 2023

About half: it's to stay consistent with torch.half in the code. I liked it better (I had fp16 initially).

As for the position, I don't like to disrupt existing code (at least not for something so minimal), so I just added it at the end of the block. But in this case I agree I should have put it earlier, together with the other two.
