
Skip Gram with Negative Sampling #193

Open
andland opened this issue Jun 14, 2017 · 4 comments
andland commented Jun 14, 2017

https://arxiv.org/pdf/1705.09755v1.pdf

I recently posted a paper to arXiv showing that word2vec's Skip Gram with Negative Sampling (SGNS) algorithm is a weighted logistic PCA. Within that framework, SGNS can be trained on the same term-context matrix that is used for GloVe. Training could use the same AdaGrad procedure, only with a different loss function and gradients, and sampling all of the elements of the matrix rather than just the non-zeroes.
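To make the framework concrete, here is a minimal NumPy sketch of SGNS viewed as weighted logistic PCA on a toy dense co-occurrence matrix. Plain gradient descent stands in for AdaGrad, and the matrix size, negative-sampling weight, and learning rate are all illustrative, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy term-context co-occurrence counts (dense for clarity; real matrices are sparse).
X = rng.poisson(1.0, size=(6, 6)).astype(float)
n_i = X.sum(axis=1, keepdims=True)    # word marginal counts, shape (6, 1)
n_j = X.sum(axis=0, keepdims=True)    # context marginal counts, shape (1, 6)
N = X.sum()
k = 5.0                               # number of negative samples per positive

# Weight on the "negative" (unobserved) side of each cell's logistic loss,
# taken from the product of unigram marginals as in the SGNS objective.
neg = k * n_i * n_j / N

W = 0.1 * rng.standard_normal((6, 3))   # word embeddings
C = 0.1 * rng.standard_normal((6, 3))   # context embeddings

def loss(W, C):
    S = W @ C.T
    # Weighted logistic loss summed over ALL cells, not just the non-zeros.
    return float(np.sum(X * np.logaddexp(0.0, -S) + neg * np.logaddexp(0.0, S)))

before = loss(W, C)
lr = 0.02
for _ in range(300):
    S = W @ C.T
    sigma = 1.0 / (1.0 + np.exp(-S))
    G = neg * sigma - X * (1.0 - sigma)   # d(loss)/dS, cell-wise
    W, C = W - lr * (G @ C), C - lr * (G.T @ W)
after = loss(W, C)
```

Each cell contributes a logistic loss pulling the dot product up in proportion to the observed count and down in proportion to the negative-sampling weight, which is where the factorization differs from GloVe's least-squares objective over non-zero cells.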

Is SGNS something you are interested in including in the text2vec package, or are you happy with GloVe?

Thanks

dselivanov (Owner) commented:
Thanks! The article looks very interesting. In my experience, SGNS and GloVe usually perform very similarly, but it would be interesting to compare them in more detail.

andland commented Jun 14, 2017

I agree they are largely similar, but an advantage of SGNS is that it does better for rarely occurring words. As the Swivel paper puts it:
"GloVe is under-constrained: there is no penalty for placing unobserved but unrelated embeddings near to one another."
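The difference can be seen cell by cell: a GloVe-style least-squares loss sums only over observed (non-zero) cells, so an unobserved pair contributes nothing, while an SGNS-style logistic loss also charges unobserved cells for sitting close together. A toy illustration (the weighting functions here are simplified stand-ins, not either paper's exact objective):

```python
import numpy as np

# Illustrative 3x3 co-occurrence counts with an unobserved pair at cell (2, 2).
X = np.array([[4., 2., 0.],
              [2., 3., 1.],
              [0., 1., 0.]])

W = np.ones((3, 2))          # deliberately identical embeddings, so every
C = np.ones((3, 2))          # pair of words sits at the same dot product
S = W @ C.T                  # all dot products equal 2.0

# GloVe-style cell losses: weighted squares over NON-ZERO cells only.
f = np.minimum((X / X.max()) ** 0.75, 1.0)
glove_cell = np.where(X > 0,
                      f * (S - np.log(np.where(X > 0, X, 1.0))) ** 2,
                      0.0)

# SGNS-style cell losses: logistic loss over ALL cells, zeros included.
k = 5.0
neg = k * np.outer(X.sum(1), X.sum(0)) / X.sum()
sgns_cell = X * np.logaddexp(0, -S) + neg * np.logaddexp(0, S)

# The unobserved pair is free under GloVe but penalized under SGNS.
print(glove_cell[2, 2], sgns_cell[2, 2])
```

The unobserved-but-unrelated pair at cell (2, 2) incurs zero GloVe loss no matter how close the embeddings are, whereas the SGNS-style loss pushes its dot product down, which is exactly the under-constraint the Swivel quote describes.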

dselivanov commented Jun 15, 2017

Yes, I remember this. But a clear advantage of GloVe is that its complexity is O(nnz) rather than O(D^2). As I understand it, the proposed SGNS and SGNS-LS also suffer from O(D^2) complexity.
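The gap between the two is easy to quantify on a toy sparse matrix (the vocabulary size and density below are illustrative; real vocabularies and sparsity patterns are far larger and more skewed):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 1000                          # toy vocabulary size
# Sparse co-occurrence pattern: roughly 0.5% of cells observed.
X = rng.poisson(0.005, size=(D, D))

nnz = np.count_nonzero(X)         # cells a GloVe epoch touches
full = D * D                      # cells an all-elements SGNS epoch touches
print(nnz, full, full / nnz)
```

Per epoch that is a roughly 200x difference here, and real term-context matrices are typically far sparser, so the per-epoch gap only grows with D.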

andland commented Jun 15, 2017

That is a downside. However, my intuition is that the number of parameter updates matters more than the number of epochs; see, for example, Figure 5 of the BPR paper. That is, the algorithm may converge in a similar number of parameter updates as GloVe. This is mostly speculation, though.
