Seppala edited this page Jul 11, 2011 · 2 revisions

## 1. Clustering related words from Wikipedia

(Hack/Reduce 3 Boston)

http://code.google.com/p/wiki-graph/

Satish Gopalakrishnan, Vineet Manohar

Satish and Vineet wanted to build an application that finds a list of associated words for any chosen word. They created a distance algorithm that ranks words by how close to the chosen word they appear in Wikipedia articles. To get results, they scanned the Wikipedia dataset for words associated with “McCain”, “Erlang” and “Reebok”.
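The project's actual scoring function is not described here, so the following is only a minimal sketch of one way a proximity-based ranking could work. It assumes an inverse-distance weight (a word right next to the target scores 1, two positions away scores 1/2, and so on) within a fixed window. The function name `related_words`, the `window` and `top_n` parameters, and the sample sentences are all hypothetical, chosen for illustration.

```python
import re
from collections import defaultdict


def related_words(articles, target, window=5, top_n=5):
    """Rank words by how close they appear to `target` across articles.

    Assumption: each co-occurrence contributes 1 / distance, where distance
    is the number of token positions between the word and the target.
    This is an illustrative weighting, not the project's own formula.
    """
    target = target.lower()
    scores = defaultdict(float)

    for text in articles:
        # Naive tokenizer: lowercase runs of letters and digits.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        positions = [i for i, tok in enumerate(tokens) if tok == target]

        for pos in positions:
            lo = max(0, pos - window)
            hi = min(len(tokens), pos + window + 1)
            for j in range(lo, hi):
                word = tokens[j]
                if word == target:
                    continue  # skip the target itself
                scores[word] += 1.0 / abs(j - pos)

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    sample = [
        "Erlang is a functional language",
        "Ericsson built Erlang for telecom switches",
    ]
    for word, score in related_words(sample, "Erlang"):
        print(f"{word}\t{score:.2f}")
```

On real article text, common words such as "is" and "for" would dominate the ranking, so a practical version would likely filter stop words or normalize by overall word frequency. At the scale of the full Wikipedia dataset, the per-article loop would map naturally onto a map step that emits (word, weight) pairs and a reduce step that sums them.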