
About WN18RR #6

Closed
guolingbing opened this issue Jan 12, 2018 · 5 comments

Comments

@guolingbing

It seems that some entities in the test set do not appear in the training set. So roughly 210 triples in the test set are essentially meaningless?

@TimDettmers
Owner

Good catch. I just checked this, and it is true: 212 entities in the test set do not occur in the training set. Since the dataset has already been used in some other papers, I would not want to adjust it now. If everybody works with these unpredictable test cases, it should even out and scores will remain comparable (albeit somewhat lower). I will add a comment about this in the README. Thank you.
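
For anyone who wants to reproduce the count, a minimal sketch (assuming the usual tab-separated head/relation/tail files; the paths `WN18RR/train.txt` and `WN18RR/test.txt` are placeholders):

```python
# Count test-set entities (and affected triples) that never occur in the training set.
# File paths are assumptions about the usual WN18RR layout, not the repo's exact paths.

def load_triples(path):
    with open(path) as f:
        return [line.strip().split('\t') for line in f if line.strip()]

train = load_triples('WN18RR/train.txt')
test = load_triples('WN18RR/test.txt')

train_entities = {e for h, r, t in train for e in (h, t)}
test_entities = {e for h, r, t in test for e in (h, t)}

unseen_entities = test_entities - train_entities
affected_triples = [tr for tr in test if tr[0] in unseen_entities or tr[2] in unseen_entities]

print(f'unseen test entities: {len(unseen_entities)}')
print(f'test triples with an unseen entity: {len(affected_triples)}')
```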

@xptree

xptree commented May 10, 2018

How do you deal with triples whose entities do not appear in the training set during evaluation? Do you simply ignore them, or assign them a specific score, say 0? Thank you.

@TimDettmers
Owner

Just treat them like any other triple. The model will probably not be able to rank them correctly (I would expect a random rank), but that is no issue as long as everybody evaluates those triples in the same way. Note that, although not much better, random ranks are preferable to assigning zero scores. If you are having problems with these triples not being in the vocabulary (embedding matrix), then include the test set triples when building the vocabulary; this is how I deal with the issue in this repo.
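
A sketch of that vocabulary trick (not the repo's exact code; the file layout and embedding size are assumptions for illustration):

```python
# Build the entity/relation vocabulary over all splits, so test-only entities still
# get an (untrained, randomly initialized) embedding instead of a lookup failure or
# a hard-coded zero score.

import torch

def load_triples(path):
    with open(path) as f:
        return [tuple(line.strip().split('\t')) for line in f if line.strip()]

splits = [load_triples(f'WN18RR/{name}.txt') for name in ('train', 'valid', 'test')]

entity2id, relation2id = {}, {}
for triples in splits:
    for h, r, t in triples:
        for e in (h, t):
            entity2id.setdefault(e, len(entity2id))
        relation2id.setdefault(r, len(relation2id))

embedding_dim = 200  # illustrative value, not tied to any particular model
entity_emb = torch.nn.Embedding(len(entity2id), embedding_dim)
relation_emb = torch.nn.Embedding(len(relation2id), embedding_dim)
# Entities that appear only in the test set keep their random initialization,
# so the model ranks the corresponding triples roughly at random at evaluation time.
```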

@xptree

xptree commented May 10, 2018

Thanks!

It seems that the results still depend on how you assign the random ranks (i.e., on how you choose the random seed), although the dependence may be insignificant.

@TimDettmers
Owner

Yes, I agree. It could induce bias, but I think that is unlikely. Thank you for this question; I think it will be helpful for others in the future.
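
As a rough back-of-the-envelope check of how small that seed dependence is likely to be, here is a sketch that simulates the MRR contribution of the unseen-entity triples over several seeds (the candidate-entity and test-set counts below are illustrative placeholders, not the exact WN18RR numbers):

```python
# Simulate how much the seed could shift MRR, assuming the ~210 triples with unseen
# entities each receive a uniformly random rank among N candidate entities.

import random

N = 40000          # illustrative number of candidate entities
num_unseen = 210   # triples with at least one unseen entity (from this thread)
num_test = 3000    # illustrative total number of test triples

mrr_contributions = []
for seed in range(20):
    rng = random.Random(seed)
    contrib = sum(1.0 / rng.randint(1, N) for _ in range(num_unseen)) / num_test
    mrr_contributions.append(contrib)

print(f'MRR contribution from random ranks: '
      f'min={min(mrr_contributions):.5f}, max={max(mrr_contributions):.5f}')
```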
