I calculated the cosine similarity between the embeddings of nodes 0, 1 and 10. I expected the similarity between 0 and 1 to be high and the other two to be low. However, the similarities are (-0.0070848754, 0.062368274, -0.13235472) for the (0-1), (0-10) and (1-10) pairs. Is it not reasonable to expect higher cosine similarity for nodes that are close and densely interconnected? If so, how can we measure the similarity and test the embeddings?
Thanks.
Here is my code:
import numpy as np
import pandas as pd
from numpy.linalg import norm
from gensim.models import Word2Vec
from stellargraph import StellarGraph, IndexedArray
from stellargraph.data import BiasedRandomWalk

graph_embedding_size = 100

def get_embedding(graph):
    edges_ = pd.DataFrame({
        'source': [e['source'] for e in graph['edges']],
        'target': [e['target'] for e in graph['edges']],
        'type': graph['edge_types']
    })
    G = StellarGraph(IndexedArray(index=graph['nodes']), edges_, edge_type_column="type")
    walk_length = 10
    rw = BiasedRandomWalk(G)
    walks = rw.run(
        nodes=G.nodes(),  # root nodes
        length=walk_length,  # maximum length of a random walk
        n=2,  # number of random walks per root node
        p=0.5,  # defines (unnormalised) probability, 1/p, of returning to source node
        q=2.0,  # defines (unnormalised) probability, 1/q, of moving away from source node
        weighted=False,  # for weighted random walks
        seed=42,  # random seed fixed for reproducibility
    )
    # Word2Vec expects sequences of string tokens
    str_walks = [[str(n) for n in walk] for walk in walks]
    model = Word2Vec(
        str_walks, vector_size=graph_embedding_size, window=5, min_count=0, sg=1, workers=1
    )
    # model.wv.vectors is ordered by vocabulary (token frequency), not by node id,
    # so look each node up by key to get one row per node in graph['nodes'] order
    return np.array([model.wv[str(n)] for n in graph['nodes']])

graph_ex = {'nodes': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
            'edges': [
                {'source': 0, 'target': 1},
                {'source': 0, 'target': 2},
                {'source': 0, 'target': 3},
                {'source': 1, 'target': 2},
                {'source': 1, 'target': 3},
                {'source': 2, 'target': 3},
                {'source': 2, 'target': 4},
                {'source': 4, 'target': 5},
                {'source': 4, 'target': 6},
                {'source': 5, 'target': 6},
                {'source': 6, 'target': 7},
                {'source': 5, 'target': 8},
                {'source': 5, 'target': 9},
                {'source': 9, 'target': 10}
            ], 'edge_types': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

embeddings_ex = get_embedding(graph_ex)

def cosine_sim(A, B):
    return np.dot(A, B) / (norm(A) * norm(B))

A = embeddings_ex[0]   # node 0
B = embeddings_ex[1]   # node 1
C = embeddings_ex[10]  # node 10
print(cosine_sim(A, B), cosine_sim(A, C), cosine_sim(B, C))
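Rather than checking only three pairs, one possible sanity check is to average the cosine similarity over all node pairs, grouped by their hop distance in the graph. The sketch below is only an illustration of that idea, not part of StellarGraph or gensim: the helper names (`pairwise_cosine`, `hop_distances`, `mean_similarity_by_hops`) are made up here, and it assumes node ids are the integers 0..n-1 and that row i of the embedding matrix belongs to node i.

```python
from collections import deque

import numpy as np


def pairwise_cosine(emb):
    """Cosine similarity between every pair of rows of `emb`."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T


def hop_distances(n_nodes, edges):
    """Unweighted shortest-path (hop) distances via BFS; -1 if unreachable."""
    adj = [[] for _ in range(n_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = np.full((n_nodes, n_nodes), -1, dtype=int)
    for src in range(n_nodes):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src, v] < 0:
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist


def mean_similarity_by_hops(emb, edges):
    """Average cosine similarity of node pairs, grouped by hop distance.

    Assumes (hypothetically) that row i of `emb` is the vector for node i.
    """
    emb = np.asarray(emb, dtype=float)
    sim = pairwise_cosine(emb)
    dist = hop_distances(len(emb), edges)
    upper = np.triu_indices(len(emb), k=1)  # each unordered pair once
    pair_sim, pair_dist = sim[upper], dist[upper]
    result = {}
    for d in np.unique(pair_dist):
        if d > 0:  # skip unreachable pairs (marked -1)
            result[int(d)] = float(pair_sim[pair_dist == d].mean())
    return result
```

With the example graph this could be called as `mean_similarity_by_hops(embeddings_ex, [(e['source'], e['target']) for e in graph_ex['edges']])`. If the embeddings reflect proximity, the averages would be expected to fall as the hop distance grows, although that trend is only a heuristic and may depend on the choice of `p` and `q`.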
I am trying to test how node2vec works with a small graph, shown here: https://ibb.co/Y7JWVTV