An information retrieval system using ranked retrieval, coded from scratch in Python
Updated May 22, 2020 - Jupyter Notebook
Hybrid RecSys, CF-based RecSys, Model-based RecSys, Content-based RecSys, Finding similar items using Jaccard similarity
TF-IDF scores and visualizations for documents produced over time
A web-based, domain-specific search engine in Python
First story detection using shingling, LSH and graphical methods
This was an HTML web scraping project using Python libraries. The objective was to extract user comments from the "mac power user" forum, cleanse the data, tokenize the comments, classify the words, and store them in a dataframe.
Predicted the geo-location of 80,000 tweets based only on their content by finding Location Indicative Words, achieving 74% accuracy
Calculate the TF-IDF score using parallel algorithms
This is an NLP-based project, completed during Fall 2020 for CSE 4022 - Natural Language Processing. The Nepali Text Summarizer is built around the idea of TF-IDF and cosine similarity.
TF-IDF (Term Frequency, Inverse Document Frequency) is a method for scoring the importance of words (or 'terms') based on how frequently they appear
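As a rough illustration of the scoring idea described above, here is a minimal sketch in plain Python. It assumes one common variant (relative term frequency multiplied by the natural log of the inverse document frequency); the function name `tf_idf` and the sample documents are hypothetical and are not taken from any of the listed repositories.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical sample corpus, already tokenized.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
scores = tf_idf(docs)
```

Under this variant, a term that occurs in every document gets a score of zero, while a term confined to a single document scores highest there. Libraries often apply smoothing or normalization on top of this, so their numbers may differ.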
A complete search engine experience built on top of a 75 GB Wikipedia corpus with subsecond latency for searches. Results contain wiki pages ordered by TF-IDF relevance for the given search words. From optimized code to the k-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
A crowdsourced search engine that returns details about professors in response to various types of search queries.
Exploring research ideas on the use of action-requesting speech acts in video games. Might lead to a Master's thesis.
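The k-way merge mentioned above is typically used to combine several sorted partial indexes into one. The sketch below shows the general technique with the standard library's `heapq.merge`; it is not the project's actual code, and the function name `k_way_merge` and the `(term, doc_id)` run layout are assumptions made for illustration.

```python
import heapq


def k_way_merge(sorted_runs):
    """Merge sorted runs of (term, doc_id) pairs into {term: [doc_ids]}.

    Each run must already be sorted; heapq.merge then yields all pairs
    in globally sorted order while holding one item per run in memory.
    """
    merged = {}
    for term, doc_id in heapq.merge(*sorted_runs):
        merged.setdefault(term, []).append(doc_id)
    return merged


# Hypothetical partial indexes, each sorted by (term, doc_id).
runs = [
    [("apple", 1), ("cat", 1)],
    [("apple", 2), ("dog", 2)],
    [("cat", 3), ("zebra", 3)],
]
index = k_way_merge(runs)
```

In a real indexer the runs would usually be streamed from files on disk rather than held in lists, which is what lets the merge handle a corpus larger than memory.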