Understanding link prediction for DistMult/ComplEx #2015
Unanswered
davidshumway
asked this question in
Q&A
Replies: 0 comments
Given a train and test set of nodes/relations, e.g.
train:
test:
and training:
with node embedding:
(n_components=2, perplexity=50, learning_rate='auto', n_iter=250, n_iter_without_progress=50, random_state=99, angle=0.6, n_jobs=-1)
then the output of DistMult is MRR and hits@10 for raw and filtered, e.g.
In addition, there is a list of `raws` and `filtereds` for each SRO in test, resulting from `rank_edges_against_all_nodes`, e.g. `filtereds`:
with `[... [object_rank subject_rank] ...]` for each SRO in test? Something like this:
? For `filtereds`, is this ranking against negatively generated triples?

I'd like to further understand why some relations appear to be predicted very well while others are predicted very poorly, and why in some cases objects appear to be better predicted than subjects, and vice versa in other cases. But I'm a little unsure where or how to start exploring the generated embedding. Any tips for doing so?
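One possible starting point for the per-relation question is to break the rank list down by relation and by side (subject vs. object) before looking at the embeddings themselves. The sketch below assumes the ranks are 1-based and laid out as `[object_rank, subject_rank]` per test triple, which is only the layout guessed at above, not something confirmed here. The helper `metrics_by_relation`, the relation names, and the rank values are all hypothetical.

```python
import numpy as np
import pandas as pd


def metrics_by_relation(relations, ranks, k=10):
    """Summarise MRR and hits@k per relation, separately for objects and subjects.

    Assumption: ranks[:, 0] is the object rank and ranks[:, 1] the subject rank
    (1-based). Swap the columns if the actual layout turns out to be different.
    """
    ranks = np.asarray(ranks, dtype=float)
    df = pd.DataFrame({
        "relation": list(relations),
        "object_rr": 1.0 / ranks[:, 0],                       # reciprocal rank, object side
        "subject_rr": 1.0 / ranks[:, 1],                      # reciprocal rank, subject side
        "object_hit": (ranks[:, 0] <= k).astype(float),       # 1.0 if ranked in the top k
        "subject_hit": (ranks[:, 1] <= k).astype(float),
    })
    summary = df.groupby("relation").agg(
        n=("object_rr", "size"),
        object_mrr=("object_rr", "mean"),
        subject_mrr=("subject_rr", "mean"),
        object_hits=("object_hit", "mean"),
        subject_hits=("subject_hit", "mean"),
    )
    # Worst-predicted relations (on the object side) come first.
    return summary.sort_values("object_mrr")


# Hypothetical toy data: one relation label and one rank pair per test triple.
test_relations = ["born_in", "born_in", "works_for", "works_for"]
filtered_ranks = [[1, 40], [2, 25], [300, 3], [150, 1]]

summary = metrics_by_relation(test_relations, filtered_ranks)
print(summary)
```

With a table like this, relations where `object_mrr` and `subject_mrr` differ sharply stand out, and the `n` column shows whether a poor score might simply come from a relation with very few test triples.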
Going further, it appears there is an embedding for every triple in the dataset. Why is this? I would have expected one embedding for every node in the graph instead.
Using the above example, assuming there are 300,000 nodes in the graph, and example triples for each node are as follows:
then generating the node embedding:
then `len(node_embeddings)` is 900,000 (300,000 nodes x 3 relations per node) rather than 300,000 (one embedding per node).
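One thing that might be worth checking first is how many distinct node IDs the triples actually contain, so that number can be compared with `len(node_embeddings)`. The sketch below is only a sanity check under assumptions: the column names `source`, `label`, and `target`, the helper `count_unique_nodes`, and the toy triples are all hypothetical stand-ins for the real edge data.

```python
import pandas as pd


def count_unique_nodes(edges, source_col="source", target_col="target"):
    """Count distinct node IDs appearing as subject or object in any triple."""
    return pd.concat([edges[source_col], edges[target_col]]).nunique()


# Hypothetical toy triples: 3 entities, each the subject of 2 relations.
edges = pd.DataFrame({
    "source": ["a", "a", "b", "b", "c", "c"],
    "label": ["r1", "r2", "r1", "r2", "r1", "r2"],
    "target": ["b", "c", "c", "a", "a", "b"],
})

n_nodes = count_unique_nodes(edges)
n_triples = len(edges)
print(n_nodes, n_triples)
```

If the number of embeddings matches the triple count rather than the unique-node count, that would suggest looking at how node IDs are assigned when the graph is built, though that is a guess rather than a diagnosis.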