CMB + Qwen1.5-72B-Chat got empty answers #1141

Open
2 tasks done
qy1026 opened this issue May 11, 2024 · 1 comment

qy1026 commented May 11, 2024

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

python -c "import opencompass.utils;import pprint;pprint.pprint(dict(opencompass.utils.collect_env()))"

Reproduces the problem - code/configuration sample

CUDA_VISIBLE_DEVICES="0,1,2,3" python run.py \
--datasets cmb_gen_dfb5c4 \
--hf-path "/Qwen1.5-72B-Chat/" \
--model-kwargs device_map='auto' trust_remote_code=True \
--tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True \
--max-out-len 100 \
--max-seq-len 2048 \
--batch-size 1 \
--num-gpus 4

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES="0,1,2,3" python run.py \
--datasets cmb_gen_dfb5c4 \
--hf-path "/Qwen1.5-72B-Chat/" \
--model-kwargs device_map='auto' trust_remote_code=True \
--tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True \
--max-out-len 100 \
--max-seq-len 2048 \
--batch-size 1 \
--num-gpus 4

Reproduces the problem - error message

With CMB + qwen1.5-72b-chat, the test accuracy is 0.26%. Almost all predictions in cmb_test_k.json are empty strings (no answer).
Examples:
"21": {
"origin_prompt": "以下是中国医师考试中规培结业考试的一道多项选择题,不需要做任何分析和解释,直接输出答案选项。\n每一精神症状均有明确定义,并具有以下特点\nA. 症状的出现不受病人意识控制\nB. 症状出现可受病人意识控制\nC. 症状可以通过转移的方法使其消失\nD. 症状内容与周围环境不相称\nE. 症状给病人带来不同程度的功能损害 \n 答案: ",
"prediction": "",
"gold": "NULL"
},
"22": {
"origin_prompt": "以下是中国医师考试中规培结业考试的一道单项选择题,不需要做任何分析和解释,直接输出答案选项。\n关于慢性粒细胞白血病,错误的是\nA. 造血干细胞恶性克隆性疾病\nB. 自然病程仅数月\nC. 分为慢性期、加速期和急变期\nD. 最显著的体征是脾大\nE. 血象白细胞持续增高 \n 答案: ",
"prediction": "",
"gold": "NULL"
},
"23": {
"origin_prompt": "以下是中国医师考试中规培结业考试的一道单项选择题,不需要做任何分析和解释,直接输出答案选项。\n确定颌位关系包括\nA. 定位平面记录\nB. 下颌后退记录\nC. 面下1/3高度记录\nD. 垂直距离和下颌前伸(牙合)记录\nE. 垂直距离和正中关系记录 \n 答案: ",
"prediction": "",
"gold": "NULL"
},

By contrast, CMB + qwen1.5-32b-chat behaves normally, with an accuracy of around 52%.
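
To put a number on "almost all", the empty predictions can be counted directly from the prediction file. The following is a minimal sketch that assumes the file is a single JSON object keyed by sample id, with the "prediction" field shown in the excerpt above; the helper names `load_predictions` and `count_empty_predictions` are hypothetical and not part of OpenCompass.

```python
import json


def load_predictions(path):
    """Load a prediction file assumed to be a JSON object keyed by sample id."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_empty_predictions(results):
    """Return (ids whose prediction is empty or whitespace-only, total samples)."""
    empty_ids = [
        sample_id
        for sample_id, record in results.items()
        # A missing or null prediction is treated the same as an empty string.
        if not str(record.get("prediction") or "").strip()
    ]
    return empty_ids, len(results)


# Tiny in-memory example with the same shape as the excerpt above
# (prompt text shortened to a placeholder).
sample = {
    "21": {"origin_prompt": "...", "prediction": "", "gold": "NULL"},
    "22": {"origin_prompt": "...", "prediction": "B", "gold": "NULL"},
}
empty_ids, total = count_empty_predictions(sample)
print(f"{len(empty_ids)}/{total} predictions are empty: {empty_ids}")
```

For the real file, `count_empty_predictions(load_predictions("cmb_test_k.json"))` would give the same summary. Running it on both the 72B and 32B outputs would show whether the empty strings are specific to the 72B run.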

Other information

No response

qy1026 changed the title from "CMB + Qwen1.5-72B-Chat" to "CMB + Qwen1.5-72B-Chat got empty answers" on May 11, 2024
@bittersweet1999
Collaborator

You can try setting do_sample=True in the model's generation_kwargs and see whether it makes a difference.
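
As a rough illustration of this suggestion, a model entry in a Python config might carry the setting as below. This is a sketch, not a verified config: the values are copied from the command in the issue, `abbr` is a hypothetical label, the `run_cfg` layout is an assumption about OpenCompass's config convention, and a real config would also need a `type` entry naming the model wrapper class.

```python
# Sketch of an OpenCompass-style model entry written as a plain dict.
# Assumptions: `abbr` is a made-up label, and `run_cfg=dict(num_gpus=...)`
# mirrors the --num-gpus flag from the issue. A real config would also set
# `type` to the model wrapper class, omitted here so the snippet has no imports.
qwen_72b_chat = dict(
    abbr="qwen1.5-72b-chat-hf",  # hypothetical label
    path="/Qwen1.5-72B-Chat/",
    model_kwargs=dict(device_map="auto", trust_remote_code=True),
    # Tokenizer kwargs are kept exactly as passed in the issue's command.
    tokenizer_kwargs=dict(
        padding_side="left", truncation="left", trust_remote_code=True
    ),
    # The suggested change: enable sampling during generation.
    generation_kwargs=dict(do_sample=True),
    max_out_len=100,
    max_seq_len=2048,
    batch_size=1,
    run_cfg=dict(num_gpus=4),
)

models = [qwen_72b_chat]
```

Note that sampling makes outputs non-deterministic, so accuracy may vary slightly between runs.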
