Modifying the model's hyperparameters #124
Comments
Yes, you can change the model hyperparameters directly in this file: https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py#L1159-L1176. Taking llama as an example, just modify and re-run this script to get a customized GGUF file.
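To make the suggestion concrete, here is an illustrative sketch (not the actual `convert_llama.py` code): the convert script reads the checkpoint's hyperparameters and writes them into the GGUF header, so overriding values before they are written produces a customized file. The field names and `apply_overrides` helper below are hypothetical examples, not the script's real API.

```python
def apply_overrides(hparams: dict, overrides: dict) -> dict:
    """Return a copy of hparams with selected fields replaced.

    Rejects keys that are not already present, to catch typos in
    hyperparameter names before a broken GGUF gets written.
    """
    unknown = set(overrides) - set(hparams)
    if unknown:
        raise KeyError(f"unknown hyperparameter(s): {sorted(unknown)}")
    return {**hparams, **overrides}

# Hypothetical llama-style hyperparameters as read from the checkpoint.
hparams = {
    "n_vocab": 32000,
    "n_embd": 4096,
    "n_head": 32,
    "n_layer": 32,
    "rope_theta": 10000.0,
}

# Override one value before it would be serialized into the GGUF header.
custom = apply_overrides(hparams, {"rope_theta": 1000000.0})
print(custom["rope_theta"])  # prints 1000000.0
```

In the real script you would edit the assignments in the linked line range directly rather than go through a helper; the point is only that the change happens at conversion time, before the GGUF file is produced.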
Yes. Once you have a GGUF model file, you can run this script. This script will use the tokenizer inside the GGUF.
Thank you very much, @Zhenzhong1!
Hi @Zhenzhong1, so I tried what you advised and:
I saw all the parameters for loading the model, but not the inference parameters (temperature, top_p, etc.).
Do I need a customized .gguf file for this command line to run, or would one I just downloaded work as-is? (It currently doesn't.)
Never mind my second question, I forgot it had to be Q4 ^^
Ah, and while I'm at it: when using the CLI, if I get an error, it freezes the terminal. Would you have a trick to avoid that?
@benjamin27315k just type
You don't need a customized GGUF if you only want to modify inference parameters. Inference parameters are input arguments; just modify them on the command line. Please check https://github.com/intel/neural-speed/blob/main/docs/advanced_usage.md for more inference parameters.
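For readers wondering why parameters like temperature and top_p don't require a new GGUF: they only reshape the probability distribution at sampling time and never touch the model weights. A minimal sketch of the two transforms (plain-Python illustration, not neural-speed's actual sampler):

```python
import math

def softmax(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most-probable tokens whose cumulative mass >= top_p,
    then renormalize over the kept tokens (nucleus sampling)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy 4-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(logits, temperature=0.8)
kept = top_p_filter(probs, top_p=0.9)
print(sorted(kept))  # prints [0, 1, 2]: the least likely token is pruned
```

Because both steps run after the forward pass, the same GGUF file serves any combination of sampling flags; only the command-line arguments change between runs.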
Hello there,
I'm new to neural-speed, coming from llama-cpp-python, and I've encountered some problems (probably due to a misunderstanding on my side).
I don't want to flood you with issues, so I'll start with my two main questions:
Thank you !