Division By Zero in def is_mispelling #39

gffde3 · 2018-01-13T13:53:12Z

Hey,

I've been using your lib on 0.0.1 and just updated recently (I had to hack some of the SQLite fts keywords and will fix that up again) but I've come across a problem:

You get a div zero error in tokencomparison.py -> def is_mispelling(self, token1, token2)

Here are the values of the vars in that function when it throws:

float division by zero
token1: 0
token2: 2
mis_t1: []
mis_t2: []
common: []

I know you're comparing distance for string tokens, but what is the logic behind numeric values? What's the logic for deciding whether two numbers are misspellings of each other (even ignoring the 0 value)?

Even if you swap max() / min() for min() / max() and take the inverse, you'll still get 0 for 0 values.

Maybe an absolute difference is better, but that stuffs you up when a digit is added by mistake (e.g. 1 mistyped as 10).

Maybe edit distance is still best used here?
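
To make the trade-off concrete, here is a minimal sketch of the three options discussed above. It is not the library's code: `ratio_close`, `edit_distance`, and the threshold of 1.2 are hypothetical names and values chosen for illustration only.

```python
def ratio_close(a, b, threshold=1.2):
    """max/min ratio test as described in the report (threshold is a made-up value).

    Raises ZeroDivisionError whenever the smaller value is 0.
    """
    return max(a, b) / min(a, b) < threshold


def edit_distance(s, t):
    """Plain Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# "1" mistyped as "10": large absolute difference, but only one edit.
print(abs(1 - 10))               # 9
print(edit_distance("1", "10"))  # 1

# The 0 vs 2 case from the report: the ratio test cannot be evaluated at all.
try:
    ratio_close(0.0, 2.0)
except ZeroDivisionError as exc:
    print("ratio test failed:", exc)
```

Edit distance on the string form treats digits like any other characters, so it never divides, but it also ignores magnitude ("1" and "9" are one edit apart), so whether it suits numeric tokens depends on the data.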

As an aside, thanks for making this library; it's saved me some time so far :)

gffde3 · 2018-01-18T11:40:41Z

So I just set the exception for div 0 to return False. Seems to work alright.

lalalandau · 2018-01-24T22:44:40Z

I had this same issue but can't seem to replicate your fix. Do you mind posting the snippet of the is_mispelling function that you changed?

And thank you to you both, for making this package and working on this issue, as it would be a huge help.

jacobod · 2018-03-12T21:11:22Z

This bit seemed to work for me, though not sure if it is the most efficient:

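        # Guard against division by zero: a zero token is never treated as a misspelling.
        if t1f == 0 or t2f == 0:
            return False

        # Two numbers count as misspellings when their max/min ratio is under the threshold.
        return max(t1f, t2f) / min(t1f, t2f) < self.number_fuzz_threshold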

junaidahmed361 · 2018-05-02T00:13:15Z

I'm also getting the ZeroDivisionError and can't figure out how to get around it while still returning the correctly linked dataframe. I saw the earlier comment about making the div 0 exception return False, and I would also like to see a snippet showing what to change and how. I've tried the snippet above, but the same issue persisted.

ghost · 2018-09-07T14:10:16Z

As pointed out by @gffde3, I added:

except ZeroDivisionError:
    pass

on line 40 of tokencomparison.py and it did the trick. 🎉

kennethzhu88 · 2018-10-03T07:54:18Z

As pointed out by @gffde3, I added:

except ZeroDivisionError:
    pass

on line 40 of tokencomparison.py and it did the trick. 🎉

This worked for me too, many thanks @gregobf

chris1610 · 2018-12-01T20:41:29Z

I think changing line 42 to this is a little cleaner than adding a whole new exception line:

except (ValueError, ZeroDivisionError):
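
For readers who want to see the shape of that change in isolation, here is a hedged sketch. It is a standalone function, not the actual method in tokencomparison.py: the name `is_number_misspelling`, the default threshold of 1.01, and returning False from the handler are all assumptions made for illustration.

```python
def is_number_misspelling(token1, token2, number_fuzz_threshold=1.01):
    """Hypothetical standalone version of the numeric comparison.

    Returns True when both tokens parse as numbers and their max/min
    ratio is below the threshold; returns False otherwise.
    """
    try:
        t1f = float(token1)  # ValueError if the token is not numeric
        t2f = float(token2)
        # ZeroDivisionError if the smaller value is 0
        return max(t1f, t2f) / min(t1f, t2f) < number_fuzz_threshold
    except (ValueError, ZeroDivisionError):
        # One handler covers both failure modes.
        return False


print(is_number_misspelling("0", "2"))        # False (zero division handled)
print(is_number_misspelling("abc", "2"))      # False (not a number)
print(is_number_misspelling("100", "100.5"))  # True  (ratio 1.005)
```

Catching both exceptions in one tuple keeps a single fallback path, which is the point of the suggestion above.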

RobinL · 2019-02-22T09:18:18Z

Closed by #43

RobinL · 2019-02-22T09:25:24Z

Thanks @chris1610, and thanks to everyone who reported this.

7cb15 · 2019-03-27T13:52:45Z

I am still getting this error despite the update to tokencomparison.py (error is a ZeroDivision error on line 40 as noted above). Note, I pip installed the package so perhaps that is the issue. Any help is much appreciated!

ghost · 2019-04-23T14:59:46Z

Same here, I used regular pip install and pulled from GitHub.

kanlancb · 2022-04-26T08:25:23Z

Same here, in colab through pip
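
One way to check whether the installed copy actually contains the merged fix is to look at the source file Python would import. The sketch below is a generic helper, not part of the package; `installed_source_contains` is a hypothetical name, and the module path `fuzzymatcher.tokencomparison` is an assumption based on the file name mentioned in this thread.

```python
import importlib.util
from pathlib import Path


def installed_source_contains(module_name, needle):
    """Return True/False if the installed module's .py source contains `needle`.

    Returns None when the module (or its parent package) cannot be found
    or has no plain .py source file.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package of a dotted name is not installed.
        return None
    if spec is None or spec.origin is None or not spec.origin.endswith(".py"):
        return None
    return needle in Path(spec.origin).read_text(encoding="utf-8")


# Assumed module path; adjust if the package layout differs.
print(installed_source_contains("fuzzymatcher.tokencomparison", "ZeroDivisionError"))
```

If this prints False, the installed release likely predates the fix, and installing from the repository's current source instead of the published release may be worth trying.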

chris1610 added a commit to chris1610/fuzzymatcher that referenced this issue Dec 1, 2018

Update tokencomparison.py

f561c63

Fix Zero Division Error as described in RobinL#39 and RobinL#42

chris1610 mentioned this issue Dec 1, 2018

Update tokencomparison.py #43

Merged

RobinL pushed a commit that referenced this issue Feb 22, 2019

Update tokencomparison.py

c5ce4a7

Fix Zero Division Error as described in #39 and #42
