It's hard to set a perfect 'max-out-len' for a task, since different models have different preferences. Where model A prefers to give the answer directly, model B may only give its final answer after a long chain of thought.
A better way to determine 'max-out-len' is to run a few examples on the model under test and look at the average response length, since a given model tends to produce responses of similar length for the same task.
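The pilot-run idea above can be sketched as a small helper. This is a hypothetical illustration, not part of any framework: it takes the response token counts observed on a few sample prompts, adds a safety margin over the longest response, and rounds up to a convenient multiple. The function name, `margin`, and `round_to` parameters are all assumptions.

```python
def suggest_max_out_len(sample_lengths, margin=1.5, round_to=64):
    """Suggest a max-out-len from observed response token counts.

    sample_lengths: token counts of responses from a few pilot runs.
    margin: safety factor over the longest observed response.
    round_to: round the suggestion up to a multiple of this value.
    """
    longest = max(sample_lengths)
    target = int(longest * margin)
    # Ceiling-divide so the result is rounded *up* to a multiple of round_to.
    return -(-target // round_to) * round_to
```

For example, if pilot responses were 40, 55, and 80 tokens long, the suggestion would be 128; for 300, 500, and 700 tokens it would be 1088, comfortably above anything observed.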
Still, there are some practical rules of thumb:
For multiple-choice tasks like StoryCloze or MMLU, a length of 100 will cover the vast majority of cases.
For mathematical problems like MATH and GSM8K, a length of 1024 is better, since the model needs a long reasoning process to reach the final answer.
For subjective benchmarks like MTbench or Alpaca_eval, the length also needs to be set very long, as some questions ask the model to design a very detailed program or write a well-developed code script.
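The rules of thumb above could be captured as per-task defaults. This is a hypothetical sketch, not a real config schema of any framework; the task keys and the fallback value are assumptions mirroring the benchmarks mentioned in this thread.

```python
# Hypothetical per-task defaults reflecting the rules of thumb above.
DEFAULT_MAX_OUT_LEN = {
    "storycloze": 100,   # multiple-choice: answers are short
    "mmlu": 100,
    "math": 1024,        # long chain-of-thought reasoning
    "gsm8k": 1024,
    "mtbench": 2048,     # open-ended code/design answers
    "alpaca_eval": 2048,
}

def max_out_len_for(task, fallback=512):
    """Look up a default max-out-len, falling back for unknown tasks."""
    return DEFAULT_MAX_OUT_LEN.get(task.lower(), fallback)
```

A user could then override the default only when a specific model is known to produce unusually long responses for that task.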
Describe the feature
After loading a local HF model, I want to evaluate on the storycloze_gen dataset, but I don't know what max-out-len should be set to. For other datasets, such as MMLU, I'm also unsure what this parameter should be. How should it be determined? Could each task be given a reasonable default prediction length?
Will you implement it?