
Modifying the model's hyperparameters #124

Open
benjamin27315k opened this issue Feb 16, 2024 · 7 comments
@benjamin27315k

Hello there,

I'm new to Neural Speed, coming from llama-cpp-python, and I've encountered some problems (probably due to a misunderstanding on my side).

I don't want to flood you with issues, so I'll start with my two main questions:

  • Is there a way to change the model hyperparameters (mostly the temperature)?
  • Is there a way to avoid using a tokenizer from HF and instead, like llama-cpp, use the tokenizer included in the .gguf file? (In my use case, I'd rather not depend on an external lib.)

Thank you !

@Zhenzhong1
Contributor

Zhenzhong1 commented Feb 18, 2024

@benjamin27315k Hi

  1. Is there a way to change the model hyperparameters (mostly the temperature)?

Yes, you can change the model hyperparameters directly in this file: https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py#L1159-L1176. Taking llama as an example, just modify and re-run this script to get a customized GGUF file.

  2. Is there a way to avoid using a tokenizer from HF and instead, like llama-cpp, use the tokenizer included in the .gguf file? (In my use case, I'd rather not depend on an external lib.)

Yes. Once you have a GGUF model file, you can run this script:

```shell
python scripts\inference.py --model_name llama2 -m ggml-model-q4_0.gguf -n 512 -p "Building a website can be done in 10 simple steps:"
```

This script will use the tokenizer embedded in the GGUF file.

@Zhenzhong1 Zhenzhong1 self-assigned this Feb 18, 2024
@benjamin27315k
Author

Thank you very much, @Zhenzhong1 !
I'll try that and keep you posted if anything goes wrong 👍

@benjamin27315k
Author

benjamin27315k commented Feb 21, 2024

Hi @Zhenzhong1

So I tried what you advised:

Yes, you can change the model hyperparameters directly in this file: https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py#L1159-L1176. Taking llama as an example, just modify and re-run this script to get a customized GGUF file.

I saw all the parameters for loading the model, but not the inference parameters (temperature, top_p, etc.).

Yes. Once you have a GGUF model file, you can run this script: python scripts\inference.py --model_name llama2 -m ggml-model-q4_0.gguf -n 512 -p "Building a website can be done in 10 simple steps:"

Do I need a customized .gguf file for this command line to run, or should one downloaded as-is work? It currently doesn't: error loading model: unrecognized tensor type 10. (Ah, sorry, I'm trying to use this model: llama-2-7b-chat.Q2_K.gguf, taken from TheBloke on HF.)

@benjamin27315k
Author

Never mind my second question, I forgot it had to be Q4 ^^
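For reference, the unrecognized tensor type 10 in the error above corresponds to the Q2_K K-quant in the GGML tensor-type enumeration, which is why a Q2_K file fails to load while a Q4_0 one works. The mapping below is transcribed from the ggml type enum as I understand it; it is a sketch for illustration, so verify the values against the ggml/gguf version you are using:

```python
# Partial map of GGML tensor-type IDs to quantization names, transcribed
# from the ggml type enum (an assumption; verify against your ggml version).
GGML_TYPE_NAMES = {
    0: "F32",
    1: "F16",
    2: "Q4_0",
    3: "Q4_1",
    6: "Q5_0",
    7: "Q5_1",
    8: "Q8_0",
    9: "Q8_1",
    10: "Q2_K",  # K-quant: the "unrecognized tensor type 10" in the error above
    11: "Q3_K",
    12: "Q4_K",
    13: "Q5_K",
    14: "Q6_K",
}

def explain_tensor_type_error(type_id: int) -> str:
    """Turn the numeric tensor-type ID from the loader error into a readable hint."""
    name = GGML_TYPE_NAMES.get(type_id, "unknown")
    return f"tensor type {type_id} = {name}"

print(explain_tensor_type_error(10))  # prints "tensor type 10 = Q2_K"
```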

@benjamin27315k
Author

Ah, and while I'm at it: when using the CLI, if I get an error, it freezes the terminal. Do you have a trick to avoid that?

@Zhenzhong1
Contributor

Zhenzhong1 commented Feb 22, 2024

Ah, and while I'm at it: when using the CLI, if I get an error, it freezes the terminal. Do you have a trick to avoid that?

@benjamin27315k Just type `stty echo` in the command line and the terminal will be restored.
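A small addition on that tip: `stty echo` only re-enables character echo, which is the symptom described above. The standard POSIX `stty` tool can also restore all settings at once, which helps when a crash leaves the terminal in raw mode as well. A minimal sketch, guarded so it is safe to run non-interactively:

```shell
# Run only when attached to a terminal, so the snippet is safe inside scripts.
if [ -t 0 ]; then
  stty echo   # re-enable character echo (the symptom described above)
  stty sane   # or: restore all terminal settings to sensible defaults
  # Heavier option: `reset` reinitializes the terminal entirely (clears the screen).
fi
```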

@Zhenzhong1
Contributor

Zhenzhong1 commented Feb 22, 2024

I saw all the parameters for loading the model, but not the inference parameters (temperature, top_p, etc.)

@benjamin27315k

You don't need a customized GGUF file if you only want to modify inference parameters.

Inference parameters are input arguments; just set them on the command line when you run the script. Please check https://github.com/intel/neural-speed/blob/main/docs/advanced_usage.md for the full list of inference parameters.
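To make the effect of those arguments concrete, here is a small self-contained sketch of how temperature and top-p (nucleus) sampling reshape a next-token distribution before a token is drawn. This is plain illustrative Python under my own assumptions, not Neural Speed's implementation, and the flag names (`--temp`, `--top_p`) should be checked against the linked advanced_usage.md:

```python
import math
import random

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p,
    zero out the rest, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = set(), 0.0
    for i in order:
        keep.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    z = sum(filtered)
    return [p / z for p in filtered]

def sample(logits, temp=0.8, top_p=0.95, rng=random.random):
    """Temperature < 1 sharpens the distribution (more deterministic),
    temperature > 1 flattens it (more random)."""
    probs = softmax([x / temp for x in logits])
    probs = top_p_filter(probs, top_p)
    r, cum = rng(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

# A very low temperature makes sampling nearly greedy:
print(sample([2.0, 1.0, 0.5, -1.0], temp=0.05, top_p=1.0))  # picks index 0
```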
