Skip to content
#

wikipedia-dump

Here are 72 public repositories matching this topic...

Implemented a search engine on the wikipedia dump of size 73.4 GB. In order to retrieve result faster and relevant, indexing and ranking is implemented. Relevance ranking algorithm is implemented using TF-IDF score to rank documents. Creating index takes around 14 hr on a given wikipedia dump. Result is retrieved in less than 1 second.

  • Updated Sep 12, 2019
  • Jupyter Notebook

Improve this page

Add a description, image, and links to the wikipedia-dump topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the wikipedia-dump topic, visit your repo's landing page and select "manage topics."

Learn more