CS224W: Added filtered evaluation setting to KGEModel #8644
The original implementation of KGEModel computes the rank of the gold tail for a query by treating every entity in the dataset as a candidate tail. However, most knowledge graph completion literature (Bordes et al., 2013) evaluates in a filtered setting: for a query (h, r, ?), every t such that (h, r, t) appears in the training, validation, or test split is removed from the candidate tail set, except for the gold answer itself. The rank is then computed on this filtered candidate set.
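To make the difference between the two settings concrete, here is a minimal pure-Python sketch of the rank computation. It is not the code in this PR: the function name `filtered_rank`, the "higher score is better" convention, and the optimistic tie handling (ties do not worsen the rank) are all assumptions made for illustration.

```python
def filtered_rank(scores, gold_tail, known_tails):
    """Return the 1-based rank of `gold_tail` among candidate tails.

    scores:      list of floats indexed by entity id (assumed: higher is better)
    gold_tail:   entity id of the correct answer to the query (h, r, ?)
    known_tails: set of entity ids t such that (h, r, t) appears in any split;
                 pass an empty set to get the unfiltered (raw) rank
    """
    gold_score = scores[gold_tail]
    rank = 1
    for t, score in enumerate(scores):
        # Skip the gold answer itself and every other known true tail.
        if t == gold_tail or t in known_tails:
            continue
        # Assumption: only strictly better scores push the gold tail down.
        if score > gold_score:
            rank += 1
    return rank


# Toy query with four entities; entity 2 is the gold tail.
scores = [0.9, 0.8, 0.7, 0.1]
raw_rank = filtered_rank(scores, gold_tail=2, known_tails=set())
# Entity 0 is another true tail for the same (h, r), so it is filtered out.
filt_rank = filtered_rank(scores, gold_tail=2, known_tails={0, 2})
print(raw_rank, filt_rank)
```

In the toy query, filtering removes one higher-scoring true tail, so the gold tail is no longer penalized for ranking below another correct answer.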
This PR adds an option to enable this mode of evaluation. If the 'filtered' argument is set to True and the 'neighbors' argument is given the list of tails that appear in the training, validation, or test set for each (head, relation) pair, evaluation is performed in the filtered setting. Specifically, if (h, r, t) is present in the dataset, then t must be present in neighbors[h][r].
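One way to build a lookup with the property neighbors[h][r] ∋ t is sketched below. The helper name `build_neighbors` and the nested dict-of-lists layout are assumptions for illustration; the exact container type the 'neighbors' argument expects should be checked against the PR diff.

```python
from collections import defaultdict


def build_neighbors(*splits):
    """Collect, for every (head, relation) pair, all tails seen in any split.

    Each split is an iterable of (h, r, t) integer triples. The result is a
    plain nested dict so that neighbors[h][r] is a sorted list of tails.
    """
    tails = defaultdict(lambda: defaultdict(set))
    for triples in splits:
        for h, r, t in triples:
            tails[h][r].add(t)
    # Convert to plain dicts and sorted lists (hypothetical layout).
    return {
        h: {r: sorted(ts) for r, ts in by_rel.items()}
        for h, by_rel in tails.items()
    }


# Toy splits: (head, relation, tail) triples.
train = [(0, 0, 1), (0, 0, 2)]
valid = [(0, 0, 3)]
test = [(1, 0, 2)]

neighbors = build_neighbors(train, valid, test)
print(neighbors[0][0])
print(neighbors[1][0])
```

Building the lookup from all three splits matters: a tail that only appears in the validation or test set is still a correct answer and should not count against the gold tail.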