find_near_matches() uses substitution for distance calculation even if max_substitutions=0 is set. #41

Open
markussteindl opened this issue Jun 18, 2022 · 1 comment

Comments

@markussteindl

Reproduction:

from fuzzysearch import find_near_matches

find_near_matches('Hello world', 'Hello babab', max_substitutions=0, max_l_dist=5)
# [Match(start=0, end=11, dist=5, matched='Hello babab')]

Is this intended behavior?
Without substitutions, turning 'world' into 'babab' takes 5 deletions plus 5 insertions, so the distance should be 10, not 5.
Thus the above call should not return any matches.
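
To make the two numbers concrete, here is a minimal sketch that computes the edit distance with and without substitutions. It is a plain dynamic-programming routine written for illustration, not fuzzysearch's own code; the function name `edit_distance` and the `allow_substitutions` flag are hypothetical.

```python
def edit_distance(a: str, b: str, allow_substitutions: bool = True) -> int:
    """Minimum number of single-character edits that turn a into b.

    Illustrative helper, not part of fuzzysearch. With
    allow_substitutions=False only insertions and deletions are counted.
    """
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting i characters of a yields the empty string
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                best = prev[j - 1]  # characters agree, no edit needed
            else:
                # delete ca, or insert cb
                best = min(prev[j], curr[j - 1]) + 1
                if allow_substitutions:
                    # replace ca with cb
                    best = min(best, prev[j - 1] + 1)
            curr.append(best)
        prev = curr
    return prev[-1]


print(edit_distance('Hello world', 'Hello babab'))                             # 5
print(edit_distance('Hello world', 'Hello babab', allow_substitutions=False))  # 10
```

The first value is the Levenshtein distance reported as `dist`; the second is the cost when each mismatched character has to be deleted and re-inserted.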

@taleinat
Owner

Hey @Stonatus,

This is currently the intended behavior, yes. The "dist" attribute of matches describes the Levenshtein / edit distance, which is indeed 5 in this case.

I can see that it could be useful to see the number of allowed changes needed given the input parameters. I'll leave this open while I consider if there's a neat way to implement this given the existing design.
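
Until something like that exists, one possible workaround is to re-score returned matches yourself. The sketch below is an assumption-laden illustration, not a fuzzysearch feature: `Match` is a stand-in namedtuple mirroring the fields shown in the reproduction, and `lcs_length`, `indel_cost` and `drop_substitution_matches` are hypothetical helpers.

```python
from collections import namedtuple

# Stand-in for fuzzysearch's Match, with the fields shown in the report.
Match = namedtuple("Match", "start end dist matched")


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_cost(a: str, b: str) -> int:
    """Edits needed when only insertions and deletions are allowed."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def drop_substitution_matches(pattern, matches, max_cost):
    """Keep only matches reachable within max_cost without substitutions."""
    return [m for m in matches if indel_cost(pattern, m.matched) <= max_cost]


reported = [Match(start=0, end=11, dist=5, matched='Hello babab')]
print(drop_substitution_matches('Hello world', reported, max_cost=5))  # []
```

This only rejects candidates the library already returned; it does not search for other spans. For instance, the shorter span 'Hello ' costs exactly 5 deletions under this measure, so a search that truly disallowed substitutions might report that instead.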
