New Zero Division Error Issue #57

Open
StephenCranney opened this issue Jul 10, 2020 · 2 comments
@StephenCranney

Several months ago I was using Fuzzymatcher regularly and never ran into a problem. Now, when I (and a colleague on a completely different OS) try to fuzzy-match two data frames, we get a ZeroDivisionError (traceback below). I've read through the responses to a similar problem from a few years ago (issue 42), and they suggest adding an "except (ValueError, ZeroDivisionError)" clause at line 44 of tokencomparison.py. The problem is that the file already catches ZeroDivisionError there, which suggests that this ZeroDivisionError is a new one with a different origin. Any suggestions for how to fix this without breaking the package would be helpful.

fuzzymatched=fuzzymatcher.fuzzy_left_join(dataframe1, dataframe2, 'address', 'address')
Traceback (most recent call last):
  File "", line 1, in
    fuzzymatched=fuzzymatcher.fuzzy_left_join(dataframe1, dataframe2, 'address', 'address')
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/__init__.py", line 41, in fuzzy_left_join
    m.match_all()
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/matcher.py", line 92, in match_all
    self.link_table = self._match_processed_data()
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/matcher.py", line 136, in _match_processed_data
    this_record.find_and_score_potential_matches()
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/record.py", line 76, in find_and_score_potential_matches
    self.matcher.data_getter.get_potential_match_ids_from_record(self)
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/data_getter_sqlite.py", line 93, in get_potential_match_ids_from_record
    self._search_specific_to_general_single(token_list, rec_left)
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/data_getter_sqlite.py", line 119, in _search_specific_to_general_single
    self._add_matches_to_potential_matches(new_matches, rec_left)
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/data_getter_sqlite.py", line 167, in _add_matches_to_potential_matches
    scored_potential_match = self.matcher.scorer.score_match(rec_left.record_id, right_id)
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/scorer_default.py", line 57, in score_match
    p = self._field_to_prob(f_left, record_left, record_right)
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/scorer_default.py", line 78, in _field_to_prob
    prob_unmatching2 = self._get_prob_unmatching(unmatching_tokens_right, tokens_left, field_right, field_left)
  File "/Users/stephencranney/opt/anaconda3/lib/python3.7/site-packages/fuzzymatcher/scorer_default.py", line 107, in _get_prob_unmatching
    return 1/prob
ZeroDivisionError: float division by zero
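
The last frame fails at "return 1/prob", so prob must be exactly 0.0 when that line runs. The traceback does not show how prob is computed, so the following is only a minimal sketch under an assumption: that prob is a product of per-token probabilities, which can become 0.0 either because one factor is zero or because many small factors underflow. The function names inverse_prob_unguarded and inverse_prob_guarded are hypothetical and are not part of fuzzymatcher; the sketch only reproduces the error message and shows one possible guard.

```python
import sys


def inverse_prob_unguarded(token_probs):
    """Multiply the probabilities together and return 1/product.

    Hypothetical stand-in for the shape of the failing line; not library code.
    """
    prob = 1.0
    for p in token_probs:
        prob *= p
    return 1 / prob  # raises ZeroDivisionError when prob is exactly 0.0


def inverse_prob_guarded(token_probs, floor=sys.float_info.min):
    """Same computation, but clamp the product to a small positive floor.

    The floor (smallest positive normalized float) is an illustrative choice.
    """
    prob = 1.0
    for p in token_probs:
        prob *= p
    return 1 / max(prob, floor)


if __name__ == "__main__":
    tiny = [1e-200, 1e-200]  # the product underflows to 0.0 in double precision
    try:
        inverse_prob_unguarded(tiny)
    except ZeroDivisionError as exc:
        print("unguarded:", exc)
    print("guarded:", inverse_prob_guarded(tiny))
```

Whether clamping is the right fix depends on what the scorer should do when a probability really is zero; skipping the token or the record may be more appropriate, and that is a decision for the maintainers.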

@Mike-Honey

After an initially promising start with this package, as I widened my test data I quickly ran into the same bug.

Unfortunately I'm under time pressure so I'll have to move on and try other packages, but I'm also keen to hear of any possible fix.

@sleeprock

Ran into this error today.
Any ideas on how to fix it?
