📚 Word shingling for near-duplicate document detection
Updated Jun 2, 2017 - OCaml
Data Mining Projects 2017
First story detection using shingling, LSH and graphical methods
Testing Jaccard similarity and Cosine similarity techniques to calculate the similarity between two questions.
Using shingles/most-used phrases in Elasticsearch (v7) and Kibana graph
Golang shingles algorithm implementation for English, French, Norwegian, Russian, Spanish and Swedish
Implementation of the LSH algorithm for job announcements on the Kijiji website
Search engine for plagiarism over the internet. Based on Google API.
Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
A public React Project to provide a sample of photos for GAF's Timberline HDZ Shingle Product Line. The app is organized by color and can be used on Mobile or Desktop.
Lucene token filter that removes trailing stopwords from shingles.
Rust min-shingle hashing implementation
Math 140 project. I only wrote the grammify method; my professor provided the rest of the code.
Plagiarism Detection System, designed to identify similarities between a given text and existing online content.
A .NET port of java-string-similarity