Are dkplm/train_corpus.txt and the other dataset (2 in total) downloaded via the _Knowledge Pre-training Practice_ tutorial wrong? #311

Open
LemonWade opened this issue Apr 7, 2023 · 2 comments

LemonWade commented Apr 7, 2023

Every record in the downloaded training set is the following sentence:

{'text': '通常来说,人类想获得针对某种的[ENT]特异性抗体[ENT]有两种方式,要么是通过自然感染,要么是通过[ENT]疫苗接种[ENT]。但是,我们显然不会让婴幼儿冒着生病的危险去主动感染某个病毒,而对于 3 岁以下婴幼儿,目前各国尚没有[ENT]新冠疫苗[ENT]获批使用。', 'relation_id': [1, 2, 3], 'replced_entity_id': [1, 2, 3]}。

Is there a way to fix this? Thanks in advance!

Or is the training set really supposed to be this one sentence repeated over and over?
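One way to confirm the symptom before digging further is to count how many distinct records the file contains. The following is only a minimal sketch: it assumes the corpus is UTF-8 plain text with one record per line, and the helper name `count_distinct_lines` is hypothetical, not part of the tutorial or the library.

```python
from collections import Counter
from pathlib import Path


def count_distinct_lines(path):
    """Return (total, counts) for the non-empty lines of a text file.

    Assumption: one record per line, UTF-8 encoded.
    """
    counts = Counter()
    total = 0
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            total += 1
            counts[line] += 1
    return total, counts
```

Calling `count_distinct_lines("dkplm/train_corpus.txt")` (path taken from the issue title, relative to wherever the tutorial saved it) and comparing `total` with `len(counts)` shows whether the file holds a single repeated record; `counts.most_common(3)` lists the most frequent ones.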

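LemonWade changed the title from "The dkplm/train_corpus.txt and other datasets (4 in total) downloaded via the _Knowledge Pre-training Practice_ tutorial are wrong." to "Are dkplm/train_corpus.txt and the other dataset (2 in total) downloaded via the _Knowledge Pre-training Practice_ tutorial wrong?" Apr 7, 2023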
chywang (Collaborator) commented Apr 7, 2023

@ztl-35
