Official script reports a column-name mismatch with its corresponding dataset; a custom dataset in the same format also fails #692
Comments
Addendum: after changing the map function to
the custom dataset still raises an error:
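@LumenScope First, regarding how the map fn is defined: an mmengine config cannot define a new function inside the config file itself; the function has to be brought in through an import. See https://github.com/InternLM/xtuner/tree/main/examples/demo_data/multi_turn_2#config for details. Second, for a custom dataset, you can use [...]. Finally, you can use [...].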
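To illustrate the first point, here is a minimal sketch of a map function kept in its own module so that a config can import it instead of defining it inline. The module name `my_map_fn.py`, the function name `custom_map_fn`, and the `instruction` / `input` / `output` field names are assumptions for illustration (they mirror the Alpaca-style columns); the exact import form expected by the config is shown in the linked example, not here.

```python
# my_map_fn.py  (hypothetical module name, kept separate from the config file)


def custom_map_fn(example):
    """Turn one Alpaca-style record into a single-turn conversation.

    Assumes the record has 'instruction', 'output' and optionally 'input';
    a 'text' column is neither required nor read.
    """
    instruction = example.get("instruction", "")
    extra_input = example.get("input", "")
    # Append the optional input on a new line only when it is non-empty.
    prompt = f"{instruction}\n{extra_input}" if extra_input else instruction
    return {
        "conversation": [
            {"input": prompt, "output": example.get("output", "")}
        ]
    }


if __name__ == "__main__":
    record = {"instruction": "Translate", "input": "hello", "output": "bonjour"}
    print(custom_map_fn(record))
```

The config would then import `custom_map_fn` from this module rather than declare it in place.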
Code used to download the dataset locally:
Sample of the resulting JSONL:
Run the following script:
NPROC_PER_NODE=4 xtuner train /work/tzz/xtuner/config/qwen1_5_14b_chat_qlora_alpaca_e3_copy.py --deepspeed deepspeed_zero3
Error:
Replacing the dataset with a custom dataset:
Since I noticed that the map function does not use text, I did not include this field:
Error:
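The error says the column names do not match, but as far as I can see the map function never uses text. Using the original default dataset also raises an error.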
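One way to narrow down a column-name mismatch like this is to check whether every line of the local JSONL file carries the same set of keys, for example whether `text` is present in some records but not in others. The sketch below is a standalone diagnostic under that assumption; the helper name `jsonl_column_sets` and the sample records are hypothetical, and it does not reproduce the validation performed by xtuner or the underlying dataset library.

```python
import json
from collections import Counter


def jsonl_column_sets(lines):
    """Count how many records share each distinct (sorted) set of keys."""
    counts = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        record = json.loads(raw)
        counts[tuple(sorted(record))] += 1
    return counts


# Hypothetical sample: the second record lacks the 'text' column.
sample = [
    '{"instruction": "a", "input": "", "output": "b", "text": "t"}',
    '{"instruction": "c", "input": "", "output": "d"}',
]

for columns, n in jsonl_column_sets(sample).items():
    print(n, columns)
```

For a real file, the same function can be called on an open file handle, e.g. `jsonl_column_sets(open(path, encoding="utf-8"))`; more than one distinct key set in the result means the records are not uniform.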