Hi, I think there may be a problem in the squadexample class? #35

Open
BCWang93 opened this issue Jun 4, 2021 · 2 comments

Comments

@BCWang93

BCWang93 commented Jun 4, 2021

I think there is a problem at this point in the squadexample class:
for c in self.context_text:

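The problem is in the position conversion here. Because context_text is a plain string, if the string contains no spaces, char_to_word_offset ends up containing nothing but 0s and 1s. Isn't that a problem? As far as I can see, you do no special handling in create example and just pass the raw string through. I ran it and did hit this behavior. Thanks!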
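To make the report concrete, here is a minimal sketch of the behavior being described. It assumes the loop follows the usual whitespace-based splitting used in SQuAD-style preprocessing; the function names `is_whitespace` and `char_to_word_offsets` are hypothetical and are not taken from this repository.

```python
def is_whitespace(c):
    # Characters treated as word separators (assumed set, modeled on
    # common SQuAD preprocessing code).
    return c in " \t\r\n" or ord(c) == 0x202F


def char_to_word_offsets(context_text):
    """Split on whitespace and record, for every character,
    the index of the word it belongs to."""
    doc_tokens = []
    char_to_word_offset = []
    prev_is_whitespace = True
    for c in context_text:
        if is_whitespace(c):
            prev_is_whitespace = True
        else:
            if prev_is_whitespace:
                doc_tokens.append(c)   # start a new word
            else:
                doc_tokens[-1] += c    # extend the current word
            prev_is_whitespace = False
        char_to_word_offset.append(len(doc_tokens) - 1)
    return doc_tokens, char_to_word_offset


# Space-separated text: offsets advance from word to word.
latin_tokens, latin_offsets = char_to_word_offsets("hello world")

# Chinese text without spaces: the whole context is one "word",
# so every character maps to word index 0.
cjk_tokens, cjk_offsets = char_to_word_offsets("机器学习很有趣")

print(latin_tokens, latin_offsets)
print(cjk_tokens, cjk_offsets)
```

Under these assumptions, a context with no whitespace yields a single entry in `doc_tokens`, so the character-to-word map carries no positional information.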

@sherlcok314159

image

There is no problem here, because tokenizer.tokenize is applied afterwards, which gives a finer-grained split.
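For readers following along, this is a rough sketch of what that second pass typically looks like. It is an assumption about the general pattern, not this repository's code: `toy_tokenize` is a hypothetical stand-in for `tokenizer.tokenize` that emits one sub-token per character (BERT-style Chinese tokenizers typically split CJK text this way), and `build_subtoken_maps` is an illustrative name.

```python
def toy_tokenize(word):
    # Hypothetical stand-in for tokenizer.tokenize:
    # one sub-token per character.
    return list(word)


def build_subtoken_maps(doc_tokens, tokenize=toy_tokenize):
    """Re-tokenize each whitespace word and keep maps in both directions."""
    tok_to_orig_index = []   # sub-token index -> word index
    orig_to_tok_index = []   # word index -> index of its first sub-token
    all_doc_tokens = []
    for i, token in enumerate(doc_tokens):
        orig_to_tok_index.append(len(all_doc_tokens))
        for sub_token in tokenize(token):
            tok_to_orig_index.append(i)
            all_doc_tokens.append(sub_token)
    return all_doc_tokens, tok_to_orig_index, orig_to_tok_index


# A context without spaces arrives here as a single whitespace "word".
doc_tokens = ["机器学习很有趣"]
all_doc_tokens, tok_to_orig_index, orig_to_tok_index = build_subtoken_maps(doc_tokens)

print(all_doc_tokens)
print(tok_to_orig_index)
print(orig_to_tok_index)
```

In this sketch the sub-tokens are indeed finer-grained, while the word-level map `orig_to_tok_index` still has a single entry, so any answer position looked up through the word index starts at sub-token 0.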

@frog-painter

It does look like a problem to me: for Chinese datasets, _improve_answer_span simply degenerates into a find function.
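A small sketch of that degeneration, assuming `_improve_answer_span` follows the widely used BERT-style span search (the version here is a simplified illustration named `improve_answer_span` that takes pre-tokenized input, not this repository's exact function). When the whole context is one whitespace word, the initial span covers every sub-token, so the search returns the first textual match.

```python
def improve_answer_span(all_doc_tokens, input_start, input_end, tok_answer_tokens):
    """Search inside [input_start, input_end] for a sub-span whose
    tokens equal the tokenized answer; fall back to the input span."""
    tok_answer_text = " ".join(tok_answer_tokens)
    for new_start in range(input_start, input_end + 1):
        for new_end in range(input_end, new_start - 1, -1):
            text_span = " ".join(all_doc_tokens[new_start:new_end + 1])
            if text_span == tok_answer_text:
                return new_start, new_end
    return input_start, input_end


# Illustrative context in which the answer text occurs twice.
context = "北京是首都我爱北京"
all_doc_tokens = list(context)      # assumed per-character sub-tokens
answer_tokens = list("北京")

# With a single whitespace word, the candidate span is the whole context.
span = improve_answer_span(all_doc_tokens, 0, len(all_doc_tokens) - 1, answer_tokens)

print(span)
print(context.find("北京"))
```

Here the returned start index equals `context.find("北京")`, even if the annotated answer was the second occurrence at character 7, which is the behavior the comment describes.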
