Choice of Dataset: SVO, WikiData, FB15-237 #1

Open
adarshmathew opened this issue May 23, 2020 · 0 comments
@adarshmathew
Collaborator

SVO: The Subject-Verb-Object tensor data consists of a large collection of (subject, verb, direct object) triplets extracted from Wikipedia. Each member of a triplet is a single word belonging to the WordNet lexicon (http://wordnet.princeton.edu): a noun for the subject and the direct object, and a verb for the middle member. This dataset can be seen as a 3-mode tensor depicting ternary relationships between nouns and verbs.
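To make the "3-mode tensor" view concrete, here is a minimal sketch of how a list of triplets maps to a sparse tensor of counts. The triplets below are made-up toy examples, not taken from the SVO data, and the helper `build_index` is a hypothetical name used only for illustration.

```python
from collections import Counter

# Toy (subject, verb, object) triplets; illustrative only, not from the SVO data.
triples = [
    ("dog", "chase", "cat"),
    ("cat", "chase", "mouse"),
    ("dog", "eat", "bone"),
    ("dog", "chase", "cat"),
]

def build_index(items):
    """Map each distinct item to an integer id in first-seen order."""
    index = {}
    for item in items:
        index.setdefault(item, len(index))
    return index

# Subjects and objects share one noun vocabulary; verbs get their own.
noun_index = build_index(n for s, _, o in triples for n in (s, o))
verb_index = build_index(v for _, v, _ in triples)

# Sparse 3-mode tensor: (subject_id, verb_id, object_id) -> occurrence count.
tensor = Counter(
    (noun_index[s], verb_index[v], noun_index[o]) for s, v, o in triples
)

# Dense shape the sparse entries would occupy: nouns x verbs x nouns.
shape = (len(noun_index), len(verb_index), len(noun_index))
```

Storing only the nonzero entries matters here, since a dense nouns x verbs x nouns array would be almost entirely zeros for a real vocabulary.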

FB15k-237: The FB15k dataset was introduced by Bordes et al. (2013). It is a subset of Freebase containing 14,951 entities and 1,345 relations. Toutanova et al. first showed that it suffers from major test leakage through inverse relations: a large number of test triples can be obtained simply by inverting triples in the training set. To create a dataset without this property, they introduced FB15k-237, a subset of FB15k with the inverse relations removed.
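As a rough illustration of what this leakage looks like, the sketch below flags test triples whose entity pair already appears reversed in the training set. This is a crude heuristic of my own for intuition, not the procedure Toutanova et al. used to build FB15k-237, and the entities and relation names are invented.

```python
def inverse_leakage(train, test):
    """Return test triples (h, r, t) for which some (t, r2, h) exists in train."""
    # Collect every training entity pair in reversed order.
    reversed_pairs = {(t, h) for h, _, t in train}
    return [(h, r, t) for h, r, t in test if (h, t) in reversed_pairs]

# Hypothetical toy splits; relation names are made up for illustration.
train = [
    ("Obama", "born_in", "Hawaii"),
    ("Paris", "capital_of", "France"),
]
test = [
    ("Hawaii", "birthplace_of", "Obama"),  # recoverable by inverting a train triple
    ("France", "borders", "Spain"),        # no reversed pair in train
]

leaked = inverse_leakage(train, test)
leak_rate = len(leaked) / len(test)
```

A high `leak_rate` on a candidate dataset would suggest that a model can score well just by memorising reversed pairs, which is worth checking before we commit to one.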

The SVO dataset is less of a knowledge graph and more of a semantic/linguistic relationship graph built on the WordNet lexicon. Is it appropriate for our task?

@adarshmathew adarshmathew changed the title Battle of the Datasets: SVO, WikiData, FB15-237 Choice of Dataset: SVO, WikiData, FB15-237 May 23, 2020
2 participants