update #2

Open
wants to merge 118 commits into
base: master

Conversation

alicanozer
Owner

No description provided.

Emmett Underhill and others added 30 commits August 22, 2016 16:09
MetricLCS and NormalizedLevenshtein both divide by the max string
length to produce distances. If two empty strings are used then
a division by zero occurs and NaN is returned.
Prevent divide by zero errors.
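The failure mode described in this commit can be sketched as follows. This is a minimal illustration, not the library's actual code: the class name `NormalizedDistanceSketch` and both method names are hypothetical, and returning 0.0 for two empty strings is an assumption about the intended result (two empty strings are identical).

```java
// Hypothetical sketch; not the MetricLCS or NormalizedLevenshtein source.
final class NormalizedDistanceSketch {

    // Plain Levenshtein distance using two rolling rows.
    static int levenshtein(String s1, String s2) {
        int n = s2.length();
        int[] prev = new int[n + 1];
        int[] curr = new int[n + 1];
        for (int j = 0; j <= n; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s1.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= n; j++) {
                int cost = s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[n];
    }

    // Distance divided by the longer string's length.
    static double normalizedDistance(String s1, String s2) {
        int maxLength = Math.max(s1.length(), s2.length());
        if (maxLength == 0) {
            // Without this guard, 0 / 0 in floating point yields NaN.
            // Assumption: two empty strings are at distance 0.
            return 0.0;
        }
        return (double) levenshtein(s1, s2) / maxLength;
    }
}
```

With the guard in place, `normalizedDistance("", "")` returns a number a caller can compare against a threshold, where the unguarded division would return NaN.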
DL Optimal String Alignment implementation
Return ngram distance instead of similarity
Performance: avoid spurious containsKey calls
ewanmellor and others added 30 commits June 21, 2018 12:07
Add a limit parameter to Levenshtein and WeightedLevenshtein's distance
methods.  This causes the calculation to exit early if the limit is reached.
This means that if the caller only cares about strings with a small distance,
they can terminate early if the strings are found to be very different.
Add a limit parameter to the {Weighted,}Levenshtein distance.
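One way such an early exit could work is sketched below. The class name `LimitedLevenshteinSketch` and the exact return convention (the result is capped at `limit`) are assumptions made for illustration and may differ from the library's behavior. The sketch relies on the fact that the smallest value in a row of the dynamic-programming table never decreases from one row to the next, so once a row's minimum reaches the limit the final distance cannot be below it.

```java
// Hypothetical sketch of a limit-aware Levenshtein; not the library's source.
final class LimitedLevenshteinSketch {

    // Returns the edit distance if it is below limit, otherwise limit.
    static int distance(String s1, String s2, int limit) {
        int n = s2.length();
        int[] prev = new int[n + 1];
        int[] curr = new int[n + 1];
        for (int j = 0; j <= n; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s1.length(); i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= n; j++) {
                int cost = s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Every later row has a minimum >= rowMin, so stop here.
            if (rowMin >= limit) {
                return limit;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        // Assumption: cap the result so callers only need to test "< limit".
        return Math.min(prev[n], limit);
    }
}
```

A caller that only cares about near matches might call `distance(a, b, 3)` and treat any result of 3 as "too different", without paying for the full table on very dissimilar strings.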
Fix issue #53 
Many thanks to @paulirwin  for the thorough issue analysis!
Added Ratcliff-Obershelp implementation, ported from .Net code by Ligi (https://github.com/dxpux)
Clean up the code and have it pass the check style.
Unit test for Ratcliff-Obershelp algorithm
Added test data from various sources.
Fixed diamond operator to comply with Java 1.6
Cosmetic edit
Implementation of Ratcliff-Obershelp algorithm
Add regular (non-null/empty) Cosine Test Cases.
The previous readme made it seem like Jaro-Winkler was the ideal typo detector,
when in actuality it is really only suited for typos caused by unsynchronized
high-speed typing between both hands. It does not account for actual
miskey errors such as hitting the wrong key altogether or inadvertently pressing
two keys instead of one. This is because Jaro-Winkler operates only on
transpositions and does not favorably consider a string consisting strictly of
additions or permutations with letters not already part of the word's alphabet
to be "similar" changes.
Update Jaro-Winkler description in README
thanks @mqudsi