Similarity Detection on Wikipedia Articles using MinHash and Random Projection implemented in Hadoop/Spark
-
Updated
Jul 19, 2018 - Java
Similarity Detection on Wikipedia Articles using MinHash and Random Projection implemented in Hadoop/Spark
algorithm of similarity detection about two string
A merge tree library
Quora Question Pair similarity
Detect duplicates between large number of articles and store only a single copy of each article.
A library implementing different string similarity and distance measures for ease of use. A dozen of algorithms (including Levenshtein edit distance and sibblings, Jaro-Winkler, Longest Common Subsequence, cosine similarity etc.) are currently implemented.
In this project we will find the best possible answer to the multiple choice question using tfidf embeddings and cosine similarity.
An efficient web-based Code Similarity Detection tool with report generator
Building a similarity based recommendation system
Data visualization of plants similarity, with d3.js. We analyze species that occur in the region of Virua (Amazon).
Charikar's Hash (shash Python Wrapper)
Website Comparison Tool
JavaFX application for detecting similarity between Python source code files using Levenshtein distance as a metric.
Detection of similar (redundant) questions in a question bank
CNN based Siamese network for similarity detection. A toy database is used to demonstrate its functionality.
News classification neural network model to classify a hierarchical news dataset. Algorithm to detect class overlap and similarities.
Add a description, image, and links to the similarity-detection topic page so that developers can more easily learn about it.
To associate your repository with the similarity-detection topic, visit your repo's landing page and select "manage topics."