Similarity between two documents.
Topic Modeling in Cython
Was curious about how plagiarism checkers work; ended up learning about something completely different 😂
Information Retrieval Lab
A framework that finds your ideal job match using data scraped from indeed.co.uk.
Individual group project in Python
Aims to provide a job-search strategy for new graduates interested in data-related positions.
Compares sentences from an input document with all sentences from the reference documents to find very similar ones.
Assesses MinHash LSH for text similarity, comparing it against kNN over BART embeddings as ground truth. Involves data preprocessing, shingle creation, and LSH experiments. The findings characterize LSH's efficiency in document similarity tasks.
Q3 of Final Project Assignment of the course 'Foundations of Data Science' @ CBS
Document search from queries using an inverted index
A simple MinHash implementation based on the explanation in the Mining of Massive Datasets course by Stanford
Classifying news articles with deep learning to build an automatic newsletter
Big data homework solutions
Use of word embeddings and document similarity to solve word analogy problems
This repository demonstrates how to explore the spiritual world using NLP techniques such as sentiment analysis, topic modeling, information retrieval, and text summarization.
Survey data and Python code for the ICADL 2021 paper "A Qualitative Evaluation of User Preference for Link-based vs. Text-based Recommendations of Wikipedia Articles"
Given a set of documents and a minimum required similarity threshold, find the number of document pairs that exceed the threshold.
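Several of the projects listed here rest on the same building blocks: shingling, Jaccard similarity, MinHash signatures, and counting document pairs above a threshold. The following is a minimal sketch of those ideas in Python, not the code of any listed repository. The function names (`shingles`, `jaccard`, `minhash_signature`, `estimate_similarity`, `count_similar_pairs`), the word-level shingles, and the use of seeded BLAKE2b hashes in place of random permutations are all assumptions made for illustration.

```python
import hashlib
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if the text is empty).
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def _hash(shingle, seed):
    """Hypothetical helper: a 64-bit hash of a shingle, salted with the seed."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(shingle_set, num_hashes=64):
    """MinHash signature: the minimum hash value under each seeded hash function."""
    return [
        min((_hash(s, seed) for s in shingle_set), default=2**64 - 1)
        for seed in range(num_hashes)
    ]


def estimate_similarity(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature positions."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def count_similar_pairs(docs, threshold, k=3):
    """Count document pairs whose exact Jaccard similarity exceeds the threshold."""
    sets = [shingles(d, k) for d in docs]
    return sum(1 for a, b in combinations(sets, 2) if jaccard(a, b) > threshold)


if __name__ == "__main__":
    docs = [
        "the quick brown fox jumps",
        "the quick brown fox leaps",
        "completely unrelated text here now",
    ]
    print(count_similar_pairs(docs, 0.5, k=2))
    sig_a = minhash_signature(shingles(docs[0], 2))
    sig_b = minhash_signature(shingles(docs[1], 2))
    print(estimate_similarity(sig_a, sig_b))
```

The exact pairwise count is quadratic in the number of documents. MinHash signatures are usually paired with LSH banding so that only candidate pairs are compared, which this sketch leaves out.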