llama convert add rotary_scaling param in cli_args #1385

activezhao · 2024-04-01T03:09:25Z

In llama's convert_checkpoint.py, when the args are passed on the command line, the rotary_scaling param is required in some situations, so it should be added to the CLI args to cover those cases.

For example, deepseek-coder-6.7b-base needs the rotary_scaling param:

  "rope_scaling": {
    "factor": 4.0,
    "type": "linear"
  }

Otherwise, inference produces many repeated tokens.
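As a rough illustration of why the factor matters: with "linear" rope scaling, the position index is divided by the factor before the rotary angles are computed, so a model trained with factor 4.0 sees very different angles if the factor is dropped. The sketch below is not TRT-LLM code; the function name, `head_dim`, and the `base` default of 10000.0 are assumptions chosen for illustration.

```python
def rotary_angles(position, head_dim, base=10000.0, scaling_factor=1.0):
    """Return one rotation angle per frequency pair for a single position.

    Illustrative only: `base` and `head_dim` are assumed values, not read
    from any real model config.
    """
    # Linear scaling: divide the position index by the factor first.
    scaled_position = position / scaling_factor
    # Standard RoPE inverse frequencies: base^(-2i / head_dim).
    return [scaled_position / base ** (2 * i / head_dim)
            for i in range(head_dim // 2)]


# With factor 4.0, position 8 gets the same angles as unscaled position 2.
with_scaling = rotary_angles(8, head_dim=4, scaling_factor=4.0)
without_scaling = rotary_angles(8, head_dim=4)
```

If the converted checkpoint omits the factor, every position is effectively four times further along than the model expects, which is consistent with the degenerate output shown in this report.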

curl -X POST localhost:8620/v2/models/ensemble/generate -d '{"text_input": "def quick_sort", "max_tokens": 10, "bad_words": "", "stop_words": "", "stream": true, "temperature": 0.2, "return_log_probs": true, "top_p": 0.75, "end_id": [32022]}'

"text_output":"sortsortsortsortsortsortsortsortsortsort"

nv-guomingz · 2024-05-15T11:00:32Z

Hi @activezhao, thanks for contributing to the TRT-LLM project.

TRT-LLM has refactored the checkpoint generation logic over the past months.
The latest code now reads the rope_scaling field from the HF config.json automatically (https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/models/llama/convert.py#L1196), and we don't allow the user to set this field manually.

Could you please retry deepseek-coder-6.7b-base with the latest TRT-LLM?
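The automatic lookup described in this comment could be sketched roughly as follows. This is a minimal illustration, not the code at the linked line: `read_rope_scaling` is a hypothetical helper, and it assumes a local model directory containing a Hugging Face style config.json with the "type" and "factor" keys shown earlier in this thread.

```python
import json
from pathlib import Path


def read_rope_scaling(model_dir):
    """Hypothetical helper: return (type, factor) from config.json.

    Returns None when the config has no rope_scaling entry.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text())
    rope_scaling = config.get("rope_scaling")
    if rope_scaling is None:
        return None
    return rope_scaling["type"], float(rope_scaling["factor"])
```

Reading the value from the model's own config avoids a mismatch between what the user passes on the command line and what the checkpoint was trained with.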

activezhao · 2024-05-20T15:22:37Z

@nv-guomingz OK, I will try it.
Thanks.

llama convert add rotary_scaling param in cli_args

db0b7b1
