QA Chinese model result does not match python version #108

Open
Tonghua-Li opened this issue Jan 2, 2022 · 2 comments

Comments


Tonghua-Li commented Jan 2, 2022

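I am using this Chinese model (luhua/chinese_pretrain_mrc_roberta_wwm_ext_large, the one passed to the server command below).
When I run the model locally in Python the output is correct, but the output from spaGO is not.

This looks similar to #101, but I cannot find the corresponding bool parameter for QA.
How do I turn off the behavior where the output is forced to be a distribution (sum must be 1), whereas in Python the output is unconstrained?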

server := bert.NewServer(model)
answers := s.model.Answer(body.Question, body.Passage)
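
On the question of normalized versus free outputs, here is a minimal sketch, assuming the normalization is a softmax over candidate scores (this thread does not confirm how spaGO computes `confidence`). It shows that a softmax changes the scale of the scores but not their ranking, so normalization alone would not be expected to change which answer comes first. The `softmax` and `argmax` helpers and the input scores are hypothetical and are not part of spaGO.

```go
package main

import (
	"fmt"
	"math"
)

// softmax turns raw scores into a distribution that sums to 1.
// The maximum is subtracted first for numerical stability.
func softmax(logits []float64) []float64 {
	maxV := math.Inf(-1)
	for _, v := range logits {
		if v > maxV {
			maxV = v
		}
	}
	out := make([]float64, len(logits))
	sum := 0.0
	for i, v := range logits {
		out[i] = math.Exp(v - maxV)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

// argmax returns the index of the largest value (first one on ties).
func argmax(xs []float64) int {
	best := 0
	for i, v := range xs {
		if v > xs[best] {
			best = i
		}
	}
	return best
}

func main() {
	// Hypothetical scores for illustration; NOT taken from the model.
	raw := []float64{2.5, -1.0, 0.3}
	probs := softmax(raw)
	fmt.Println(argmax(raw) == argmax(probs)) // prints "true"
}
```

If the top answer differs between the two implementations, the cause is therefore more likely upstream of the normalization, for example in tokenization or in how candidate spans are built.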

Translated QA:
Context: My name is Clara, I live in Berkeley
Q: what is my name?
A: Clara

Output is supposed to be
克拉拉
but got

{
    "answers": [
        {
            "text": "我叫克拉拉,我住在伯克利。",
            "start": 0,
            "end": 13,
            "confidence": 0.2547743
        },
        {
            "text": "住在伯克利。",
            "start": 7,
            "end": 13,
            "confidence": 0.22960596
        },
        {
            "text": "我叫克拉拉,我住",
            "start": 0,
            "end": 8,
            "confidence": 0.1548344
        }
    ],
    "took": 1075
}

./bert-server server --repo=~/.spago --model=luhua/chinese_pretrain_mrc_roberta_wwm_ext_large --tls-disable

PASSAGE="我叫克拉拉,我住在伯克利。"
QUESTION1="我的名字是什么?"
curl -k -d '{"question": "'"$QUESTION1"'", "passage": "'"$PASSAGE"'"}' -H "Content-Type: application/json" "http://127.0.0.1:1987/answer?pretty"
Tonghua-Li changed the title from "QA model result does not match python version" to "QA Chinese model result does not match python version" on Jan 2, 2022
matteo-grella (Member) commented

Thanks @Tonghua-Li for experimenting with spaGO on Chinese models! Let me take a look.

In the meantime, have you already checked whether the tokenization output matches the one in Python/Rust?
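
One quick way to start that check is to list the tokens you would expect and compare them against both implementations. The sketch below assumes one token per character, which is how BERT-style Chinese tokenizers usually treat CJK text; `charTokens` is a hypothetical helper, not spaGO's tokenizer, and it ignores special tokens such as `[CLS]` and `[SEP]`.

```go
package main

import "fmt"

// charTokens splits text into one token per Unicode code point (rune).
// Assumption: this only approximates a BERT-style Chinese tokenizer;
// it does not consult any vocabulary and adds no special tokens.
func charTokens(text string) []string {
	tokens := make([]string, 0, len(text))
	for _, r := range text {
		tokens = append(tokens, string(r))
	}
	return tokens
}

func main() {
	passage := "我叫克拉拉,我住在伯克利。" // the passage from this issue
	tokens := charTokens(passage)
	fmt.Println(len(tokens)) // prints "13"
	fmt.Println(tokens)
}
```

The passage has 13 characters, which matches the `"end": 13` of the first returned answer. Printing the token lists from spaGO and from Python next to this one should show whether the two sides split the passage the same way.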

Tonghua-Li (Author) commented

@matteo-grella, I am new to NLP and TensorFlow.

Here is a comparison between spaGO and Python; the lengths of the logits are different. Is there anything else I should check?

spago

startLogits, length= 13
endLogits length = 13 

python

startLogits, length= 128 
[7.526767253875732, -11.148331642150879, -11.519922256469727, -11.718563079833984, -11.908085823059082, -12.035258293151855, -11.453763008117676, -11.869075775146484, -11.3169527053833, -10.268500328063965, -10.09360408782959, -11.069799423217773, -6.754544734954834, -11.262563705444336, ...]
endLogits length= 128 
[8.138846397399902, -11.649543762207031, -11.485342025756836, -11.634878158569336, -11.620648384094238, -11.807802200317383, -11.955061912536621, -11.538698196411133, -10.995415687561035, -10.638959884643555, -11.329093933105469, -10.720544815063477, -11.54542350769043, -9.438825607299805, ...]
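
For reference, raw start and end logits like these are usually decoded into an answer by picking the span that maximizes the sum of a start score and an end score. The sketch below shows that common step under stated assumptions: `bestSpan` is a hypothetical helper, the toy logits are invented for illustration, and neither spaGO nor the Python code in this thread is confirmed to decode exactly this way. Because the two sides report different lengths (13 versus 128), a comparison is only meaningful over logits that refer to the same token positions.

```go
package main

import "fmt"

// bestSpan returns the (start, end) pair with start <= end and a span of at
// most maxLen tokens that maximizes startLogits[start] + endLogits[end].
// It assumes non-empty slices of equal length and maxLen >= 1.
func bestSpan(startLogits, endLogits []float64, maxLen int) (int, int) {
	bestS, bestE := 0, 0
	bestScore := startLogits[0] + endLogits[0]
	for s := range startLogits {
		for e := s; e < len(endLogits) && e-s < maxLen; e++ {
			if score := startLogits[s] + endLogits[e]; score > bestScore {
				bestS, bestE, bestScore = s, e, score
			}
		}
	}
	return bestS, bestE
}

func main() {
	// Hypothetical toy logits for a 6-token input; NOT taken from the model.
	start := []float64{-1.0, -2.0, 3.5, -0.5, -3.0, -2.5}
	end := []float64{-1.5, -2.0, -0.5, 0.2, 4.0, -1.0}
	s, e := bestSpan(start, end, 5)
	fmt.Println(s, e) // prints "2 4"
}
```

Running the same decoding over the logits from both implementations, restricted to the passage tokens, would show whether the mismatch comes from the logits themselves or from the span selection.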

