This repository collects ideas related to similarity computation, along with implementations of those ideas.
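- WMD: From Word Embeddings To Document Distances: Building on word2vec embeddings and EMD (Earth Mover's Distance), this paper proposes WMD (Word Mover's Distance), a new algorithm for computing the distance between documents. It aims to fix the unreasonable result that "Obama speaks to the media in Illinois" and "The President greets the press in Chicago" come out far apart merely because they use different words, even though the corresponding words in the two sentences are semantically close.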
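
A minimal sketch of the idea behind WMD, under loud assumptions: the 2-D vectors in `TOY_EMBEDDINGS` are invented for illustration (real use would load word2vec vectors), the stop-word list is hypothetical, and the function computes the relaxed lower bound on WMD (each word sends all of its weight to its nearest word in the other document) rather than solving the full EMD transport problem.

```python
import math
from collections import Counter

# Hypothetical 2-D "embeddings", made up for this example only.
# Semantically related words are deliberately placed close together.
TOY_EMBEDDINGS = {
    "obama": (1.0, 0.0), "president": (1.0, 0.1),
    "speaks": (0.0, 1.0), "greets": (0.1, 1.0),
    "media": (-1.0, 0.0), "press": (-1.0, 0.1),
    "illinois": (0.0, -1.0), "chicago": (0.1, -1.0),
    "band": (5.0, 5.0), "plays": (5.0, 6.0), "music": (6.0, 5.0),
}

# Hypothetical stop-word list; these tokens are dropped before comparison.
STOP_WORDS = {"the", "to", "in"}


def nbow(text):
    """Normalized bag-of-words: word -> share of the document's weight."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(src, dst, emb):
    """Cost when every word in src moves all its weight to its nearest word in dst."""
    return sum(
        weight * min(math.dist(emb[word], emb[other]) for other in dst)
        for word, weight in src.items()
    )


def relaxed_wmd(doc_a, doc_b, emb=TOY_EMBEDDINGS):
    """Relaxed WMD: the larger of the two one-sided costs.

    This is a lower bound on the exact WMD, not the exact value.
    Words missing from emb raise KeyError in this sketch.
    """
    a, b = nbow(doc_a), nbow(doc_b)
    return max(one_sided_cost(a, b, emb), one_sided_cost(b, a, emb))


s1 = "Obama speaks to the media in Illinois"
s2 = "The President greets the press in Chicago"
s3 = "The band plays music"

print(relaxed_wmd(s1, s2))  # small: no shared words, but close embeddings
print(relaxed_wmd(s1, s3))  # larger: unrelated words
```

With these toy vectors the first pair scores much lower than the second, even though `s1` and `s2` share no non-stop words, which is the behavior the paper is after. An exact WMD would replace the nearest-word shortcut with an optimal transport solve over the same pairwise word distances.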