
numpy.isclose vs math.isclose #10161

Open · gasparka opened this issue Dec 5, 2017 · 118 comments
Labels
57 - Close? Issues which may be closable unless discussion continued

Comments

@gasparka commented Dec 5, 2017

numpy.isclose (https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.isclose.html):

abs(a - b) <= (atol + rtol * abs(b))

math.isclose (https://docs.python.org/3/library/math.html#math.isclose):

abs(a - b) <= max(rtol * max(abs(a), abs(b)), atol)

Note that Numpy's equation is not symmetric and correlates the atol and rtol parameters, both of which are bad things (IMO).

Here is a situation where Numpy "incorrectly" flags two numbers as equal:

a = 0.142253
b = 0.142219
rtol = 1e-4
atol = 2e-5

# true because atol interferes with the rtol measurement
abs(a - b) <= (atol + rtol * abs(b))
Out[24]: True

# correct result, rtol fails
abs(a - b) <= max(rtol * max(abs(a), abs(b)), atol)
Out[29]: False

Here is another one, this time showing the symmetry problem:

a = 0.142253
b = 0.142219
rtol = 1e-4
atol = 1.9776e-05

# relative to b
abs(a - b) <= (atol + rtol * abs(b))
Out[73]: False

# relative to a
abs(a - b) <= (atol + rtol * abs(a))
Out[74]: True

# math one has no problems with this
abs(a - b) <= max(rtol * max(abs(a), abs(b)), atol)
Out[75]: False

The Python math version looks bulletproof. Should Numpy start using it? Are there any benefits to the Numpy version?

@pv (Member) commented Dec 5, 2017 via email

@bashtage (Contributor) commented Dec 5, 2017

This is one of the most widely used functions in all of numpy (through assert_allclose). The only path for this would be to choose another name.

@charris (Member) commented Dec 5, 2017

I confess that the current implementation looks like a bug. Note that using max(abs(a), abs(b)) instead of abs(b) will not break any existing tests; it merely relaxes the condition in the case that abs(a) > abs(b).
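
A minimal sketch of that relaxed variant (the function name is hypothetical; NaN/inf handling is omitted):

import numpy as np

def isclose_symmetric_rtol(a, b, rtol=1e-5, atol=1e-8):
    # Like np.isclose, but the relative term uses the larger magnitude,
    # so the result no longer depends on argument order.
    a, b = np.asarray(a), np.asarray(b)
    return np.abs(a - b) <= atol + rtol * np.maximum(np.abs(a), np.abs(b))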

@eric-wieser (Member) commented Dec 5, 2017

At the very least, this needs a warning in the docstring that it does not match the builtin.

@ghost commented Dec 5, 2017

How about adding this?

from numpy.__future__ import isclose

Incidentally, that may be an idea for the random number versioning...

@ghost commented Dec 5, 2017

For clarification, the import would not actually import isclose but would enable the new semantics. It would be scoped to a module.

The idea is that we can allow people to use the correct function and at the same time, we don't need to break any existing code.

@eric-wieser (Member) commented Dec 5, 2017

the import would not actually import isclose but would enable the new semantics

I don't think making this not import is possible. Just pick a longer name, like mathlike_ufuncs, which could cover remainder as well.

Note that from __future__ import print_function actually does produce a print_function global

@eric-wieser (Member)

Actually, the from __future__ import * style of import might not be appropriate here - in python, it affects syntax at a per-file level, and doing something similar in numpy would be tricky

@ghost commented Dec 5, 2017

The iterator exceptions were not a syntax change. We just need a small C module to inspect the namespace of the calling module. We can even cache the modules so that only import time is slowed, and only trivially.
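
A rough Python-level sketch of that mechanism (the flag name is hypothetical; a real implementation would live in C and cache per module, as described above):

import sys

def _caller_opted_in():
    # Frame 0 is this helper, frame 1 the public function, frame 2 user code.
    # Check the calling module's globals for a flag that a
    # numpy.__future__-style import would have set.
    caller_globals = sys._getframe(2).f_globals
    return caller_globals.get("__numpy_new_isclose__", False)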

@eric-wieser (Member) commented Dec 5, 2017

You're right - and true_division is obviously analogous to isclose.

This is actually a nice way to avoid the problems in #9444, as an alternative to use with statements for managing deprecated features.

@njsmith (Member) commented Dec 6, 2017

Let's CC @ChrisBarker-NOAA as the creator of math.isclose.

This is a pretty minor issue IMO. Generally atol and rtol are chosen by fiddling with them until tests pass, and the goal is to catch errors that are an order of magnitude larger than the tolerances. Maybe it would make sense to relax the rtol part like @charris suggested, but I really don't think it's worth pulling out stack introspection tricks for this tiny tweak. And we'd still have the problem that isclose is often called indirectly by things like assert_allclose or various third-party testing helpers.

@shoyer (Member) commented Dec 6, 2017

I asked on StackOverflow about using __future__ style imports: https://stackoverflow.com/questions/29905278/using-future-style-imports-for-module-specific-features-in-python

TLDR: it's possible, but not easy or clean.

@ghost commented Dec 6, 2017

The stack introspection is not just for this, but is proposed as a general policy for changing the return values of functions. The question is: in principle, should this be changed? I do think that if the answer is yes, the deprecation period needs to be at least a few years given the widespread use. Python never included a past module but that's one option to try to improve API stability, if there is demand.

@bashtage (Contributor) commented Dec 6, 2017

The other method would be to just add this as an option, isclose(..., symmetric=True) or assert_allclose(symmetric=True), where the default would be False, the current situation.
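
A sketch of how such a keyword might look, assuming the math.isclose formula for the symmetric case (illustrative only; special values not handled):

import numpy as np

def isclose(a, b, rtol=1e-5, atol=1e-8, symmetric=False):
    a, b = np.asarray(a), np.asarray(b)
    if symmetric:
        # math.isclose-style: symmetric, and atol does not inflate rtol
        return np.abs(a - b) <= np.maximum(
            rtol * np.maximum(np.abs(a), np.abs(b)), atol)
    # current numpy behavior
    return np.abs(a - b) <= atol + rtol * np.abs(b)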

@gasparka (Author) commented Dec 6, 2017

I agree that this is a minor problem when you don't attach a meaning to the rtol and atol values and just tune them to pass unit tests, as mentioned by @njsmith.

However, sometimes you would like to say that the error is within, for example, 1% (rtol=0.01).
In that case the worst-case error in the rtol measurement is 100% (when rtol * abs(b) gets close to atol, atol + rtol * abs(b) ~= 2 * rtol * abs(b)).

Meaning that some values can pass with ~2% error:

atol = 1e-8 #default
rtol = 0.01 # 1%

b = 1e-6
a = 1.0199e-6 # ~ 2% larger compared to b
abs(a - b) <= (atol + rtol * abs(b))
True

Implementing the idea of @charris would make this particular case slightly worse, as it relaxes the comparison even more, but it is still worth it because it gets rid of the symmetry problem and is indeed backward compatible.

IMO it would be better if Numpy used the math function, but I understand that the change may be too disruptive and possibly not important for most users. An option to switch between the two isclose cores would be useful.

@ChrisBarker-NOAA (Contributor)

@njsmith: thanks for bringing me in.

A bit of history: when I proposed isclose for the stdlib, we certainly looked at numpy as prior art. If it were just me, I might have used a compatible approach, for the sake of, well, compatibility :-).

But the rest of the community thought it was more important to do what was "right" for Python, so a long discussion ensued... I tried to capture most of the points in the PEP, if you want to go look.

Here is the reference to numpy:

https://www.python.org/dev/peps/pep-0485/#numpy-isclose

You can see that the same points were made as in this discussion.

In the end, three key points came out:

  1. a symmetric approach would result in the least surprise.

  2. the default absolute tolerance should probably be zero, so as not to make any assumptions about the order of magnitude of the arguments.

  3. the difference between the "weak" and "strong" tests was irrelevant when used with small tolerances, as is the expected use case.

In the end, I think we came up with the "best" solution for math.isclose.

But is it "better" enough to break backward compatibility? I don't think so.

Unfortunately, much of numpy (and python) had many features added because they were useful, but without the level of discussion these things get today, so we have a lot of sub-optimal designs. This is expected (and necessary) for a young library, and we just have to live with it now.

@njsmith is right -- I think very few uses of np.isclose() have the tolerances set by rigorous FP error analysis, but rather by trial and error, with np.isclose() itself.

However, I think the bigger problem is the default atol -- it assumes that your arguments are of order 1 -- which could be a VERY wrong assumption, and since it would often result in tests passing that shouldn't, users might not notice.

In [85]: np.isclose(9.0e-9, 1.0e-9)
Out[85]: True

ouch!

and yes, it's the atol causing this:

In [86]: np.isclose(9.0e-9, 1.0e-9, atol=0.0)
Out[86]: False
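
For comparison, math.isclose with its defaults (rel_tol=1e-9, abs_tol=0.0) already rejects this pair:

import math
print(math.isclose(9.0e-9, 1.0e-9))  # False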

So it probably is a Good Idea to have a path forward.

I like the keyword argument idea -- seems a lot more straightforward than trying to mess with future and the like.

And we could then decide if we wanted to start issuing deprecation warnings and ultimately change the default many versions downstream...

@ChrisBarker-NOAA (Contributor)

I'd like to propose @bashtage's suggestion:

"""
The other method would be to just add this as an option isclose(...,symmetric=True) or assert_allclose(symmetric=True) where the default would be False, the current situation.
"""

I think it's a better option than a new function name, and could lead the way to future deprecation and change of the default (or not).

I think in addition to the (strong) symmetric test, atol should default to zero as well.

Given that, maybe we need a better name for the parameter than symmetric, though it's not bad...

@ChrisBarker-NOAA (Contributor)

Oh -- and it might be good to make sure that isclose() isn't called elsewhere in numpy in addition to assert_allclose().

@ghost commented Dec 9, 2017 via email

@eric-wieser (Member)

Do we need to add it to assert_allclose too, or do we just make the change silently? Making people update every one of their tests to silence a warning is not really any different from making them update their tests to cope with the reduced tolerance.

@rgommers (Member) commented Dec 9, 2017

@xoviat @eric-wieser there cannot be any backwards incompatible changes to assert_allclose nor a deprecation warning from it, way too disruptive. Unless I'm missing where you want to put a deprecation warning?

@njsmith (Member) commented Dec 9, 2017 via email

@ghost commented Dec 9, 2017 via email

@ghost commented Dec 9, 2017 via email

@rgommers (Member) commented Dec 9, 2017

IMO silent changes are nasty!

Agreed. However, there should be zero changes. Deprecation warnings in widely used functions also force users (at least the ones that actually do maintenance) to make changes.

@gasparka (Author) commented Dec 9, 2017

What about fixing the non-symmetry issue like @charris suggested? This would also free up some documentation space that could be filled with a warning about the bad things that may happen when atol != 0.0.

@ghost commented Dec 9, 2017 via email

@ghost commented Dec 9, 2017 via email

@rgommers (Member) commented Dec 9, 2017

What about fixing the non-symmetry issue like @charris suggested?

That suggestion (#10161 (comment)) seems feasible.

We can't fix that issue without letting people know about the changed behavior.

We're not fixing it. This is the cost/benefit question for any change that is currently being discussed in the semantic versioning issue. This is definitely a case where we will not make any change or issue any warning from a widely used testing function. Note that the current behavior is not a bug (although arguably a suboptimal choice).

@ghost commented Dec 9, 2017 via email

@mattip (Member) commented Jun 2, 2021

Anyone have an idea for a smooth way to make a transition like this?

I think this is exactly why we have a separate namespace for the NEP 47 compatibility layer. Once we offer the alternatives there, we can slowly try to move those behaviors into the main NumPy namespace after an appropriate deprecation cycle.

@shoyer (Member) commented Jun 2, 2021

So how DO we transition numpy to newer, better methods??

Well, we have Ralf's NEP 23: https://numpy.org/neps/nep-0023-backwards-compatibility.html

I think the case of np.histogram is probably the best precedent here.

If we were going to do this, the way to do it would be with new function name or namespace. This lets users migrate on their own time without needing to change code twice.

@mhvk (Contributor) commented Jun 2, 2021

It seems there are really two issues in the revival of this thread: what to pick for the API standard, and whether to adjust numpy's implementation. To me, the former seems fairly easy: pick the better stdlib implementation (to me, actually better mostly because of a default atol=0).

Changing numpy's implementation is much trickier - I might wish to ignore the breakage myself, but really that's unfair to many users and I think NEP 23 is pretty clear that the process should be long and torturous.

But the route of a different namespace for the array API seems a very good compromise - indeed, great to have a way to slowly move to a smaller, better defined API!

@ChrisBarker-NOAA (Contributor) commented Jun 2, 2021 via email

@miccoli (Contributor) commented Jun 7, 2021

Thinking about it more, the argument for the numpy version is that it checks closeness to a wanted value, whereas the python version has no preference between the two values. One can argue for both, isclose vs areclose :) The first might be better for testing, the second for arbitrary pairs of numbers.

My two cents.

One should also not forget that math.isclose compares two scalars, so it is natural to expect that math.isclose(a,b) == math.isclose(b,a). But np.isclose(a,b) compares two arrays elementwise, so it is preferable to me that the elementwise tolerances are computed using a single array, namely b. On the contrary, it would be quite disorienting if a portion of the elementwise tolerances were computed using a and the rest using b, for the sake of symmetry alone.

Edge cases in which np.isclose(a,b) != np.isclose(b,a) are not a big issue; on the contrary, if b is a reference value, I would like to avoid the elementwise tolerances being relaxed based on the values of a.

By the way, the current docs are very well written, so I would not assume that a symmetric isclose should be considered superior. However, I agree that atol=0 is a better default and that summing the absolute and relative tolerances could be replaced by choosing the maximum of the two.

@ChrisBarker-NOAA (Contributor) commented Jun 7, 2021 via email

JoepVanlier added a commit to lumicks/pylake that referenced this issue Jun 8, 2021
`Numpy` uses a small absolute tolerance as the default for checking closeness (probably to make checks against zero possible). See: https://numpy.org/doc/stable/reference/generated/numpy.isclose.html?highlight=isclose#numpy.isclose

However, this can be problematic when values being compared are very small (as big relative differences will still fall under the tolerance). The `isclose` in `math` actually uses the safer zero absolute tolerance as default:
https://docs.python.org/3/library/math.html#math.isclose

Interesting discussion here: numpy/numpy#10161
JoepVanlier added a commit to lumicks/pylake that referenced this issue Jun 10, 2021
@OverLordGoldDragon (Contributor)

Very informative thread.

Was consensus ever reached on handling float32 and float16? i.e., their closest atol and rtol equivalents for the current defaults? I always found it strange how NumPy chose defaults independent of precision - atol=1e-8 doesn't fly below float64.
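
To make that concrete, the machine epsilons per dtype (via np.finfo) show why one fixed pair of tolerances cannot serve every precision:

import numpy as np

for dt in (np.float16, np.float32, np.float64):
    print(dt.__name__, np.finfo(dt).eps)
# float16 0.000977
# float32 1.1920929e-07
# float64 2.220446049250313e-16

# The default rtol=1e-5 and atol=1e-8 both sit far below float16's eps,
# and rtol=1e-5 is only ~100x float32's eps.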

@rgommers (Member)

Was consensus ever reached on handling float32 and float16?

The proposal from @pmeier that was the last comment on this topic seems very reasonable. This point is a bit moot unless we're actually going to be able to make a change by introducing a new function though, which is where this topic is stuck right now. The benefits don't seem large enough for that to be a compelling prospect - or at least no one is putting in the large amount of effort required.

@OverLordGoldDragon (Contributor)

I think the following point by @ChrisBarker-NOAA isn't given remotely enough attention - the assumption that all input values are on the order of unity:

np.allclose(1e-9, 2e-9) == True

I do extensive numeric testing and discrete optimizations in signal processing applications, and I trusted NumPy to get this right. Filters typically rapidly decay in time and frequency domains, and that's with them being unit-normed (L1/L2, not value-by-value) - the difference between 1e-9 and 2e-9 is huge, yet it's invisible to NumPy.

At bare minimum, this should be mentioned in the allclose docs. At high priority, I think "the standard" for NumPy float equality should move closer to math.isclose - in all contexts I work with, I couldn't care less about the added compute, and accuracy is a must. I understand the backward-compatibility concern, so the fast allclose can stay for those who want it.

However, I see numpy.isclose does have a warning concerning the unity order. Besides that, it's just a slightly different docstring, yet the underlying formulae and default tolerances are exactly the same?

@rgommers (Member)

I think the following point by @ChrisBarker-NOAA isn't given remotely enough attention

No one disagrees that this is bad. It's not about attention here, it's about doing the work needed to introduce a suitable replacement (and dealing with the high level of bikeshedding that a new name is inevitably going to attract - this is a lot more effort than you'd expect, and not much fun).

At bare minimum, this should be mentioned in allclose docs.

Definitely. Would you be willing to open a PR to improve this?

@OverLordGoldDragon (Contributor)

Would you be willing to open a PR to improve this?

Yes. Can you clarify isclose vs allclose? Seems the PR is as simple as pasting the former's note into the latter, unless the two are more different than they appear.

@rgommers (Member)

Yep, that looks like all that's needed there. The implementation of allclose is the one-liner all(isclose(...)), so they behave identically.
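
Roughly (mirroring that one-liner; broadcasting and scalar-conversion details omitted):

import numpy as np

def allclose(a, b, rtol=1e-5, atol=1e-8, equal_nan=False):
    # allclose is just isclose reduced over all elements
    return bool(np.isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan).all())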

@OverLordGoldDragon (Contributor) commented Aug 12, 2023

What's the issue with def allclose(..., dtype='float64')? It won't break any old code and will function identically.

I'm currently looking into rtol & atol for lower precisions, if we can't inject it into allclose then I'll look into making a separate method.

OverLordGoldDragon added a commit to OverLordGoldDragon/numpy that referenced this issue Aug 13, 2023
See numpy#10161 (latest comments, ~8/13/2023)

1: Add `isclose`'s warning to `allclose`

2: Revise `isclose`'s "Notes" on `atol` to emphasize its poorness and better
explain its purpose

3: Add revised `isclose` note to `allclose`
@OverLordGoldDragon (Contributor)

I'm actually not at all happy with NumPy's default atol=1e-8 and its "assuming order of 1". I've done a lot of signal processing work and have a strong sense of what qualifies as "zero" for data on the order of 1 - it's at most 1e-15 for float64. 1e-8 is good if you're trying to test whether an individual array - a set of measurements - is close to zero, rather than for comparing two arrays which are supposed to be float-equal. 1e-8 simply ignores all distances below 1e-8, and there can be a ton of such values (see below).

The current heuristic I'm thinking of is atol = median(abs(x)) / 1e15 - and just settle on one of a or b, since if they're "equal" it won't matter. Or, safer but more compute, median(abs(a) + abs(b)) / 2e15 or median(abs(a[a!=0])). Either way 1e-15 appears superior, and is much closer to atol=0.
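
A sketch of that heuristic as a helper (name hypothetical, using the safer averaging variant from above):

import numpy as np

def heuristic_atol(a, b):
    # Scale "zero" to the data: roughly median magnitude / 1e15
    return np.median(np.abs(a) + np.abs(b)) / 2e15

# usage: np.allclose(x, y, atol=heuristic_atol(x, y))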

It's too late to change atol=1e-8 but not too late to simply call it a mistake in docs and advise another value. I propose the current

The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes).

be changed to

The default atol is too large for comparing numbers that are on order of one (see Notes). 1e-15 is advised in that case.

but I'll do that in a separate PR.

1e-8 problem example

Arguably, the vast majority of allclose use cases seek to check whether two input-output pipelines produced the same output, rather than whether two sensors gathered the same data. For this, fft is an excellent prototype as it involves much addition and multiplication - but the example below also explicitly tests addition and multiplication via fractional exponentiation.

import numpy as np
from numpy.fft import fft, ifft

np.random.seed(0)
x = np.random.randn(10000)*np.sqrt(2)  # `mean(abs(x)) ~= 1`
x[:1000] = 0
y = abs(ifft(fft(x))**1.15)**(1/1.15) * np.sign(x)
y = ifft(fft(y))

print(sum(abs(x - y) < 1e-8), 
      np.allclose(x, y, rtol=1e-99),
      np.allclose(x, y, atol=1e-15))  # if 1e-15 fails, so does 0
# output: 10000 True False

So rtol is totally ignored with the default atol, but not with the proposed atol. My simpler "heuristic" in this case yields atol=8.2e-16.

Note, all(abs(x - y) < 1e-8) is much easier to satisfy than per the above - it's not about whether "allclose passes or not", it's about meeting intent. The intent is that atol is reserved for zeros, and rtol for all else. This intent completely fails for the vast majority of cases - the above computation is fairly float-intensive, and if it fails, something simpler of course also will (e.g. x + np.pi - np.pi).

@OverLordGoldDragon (Contributor) commented Aug 13, 2023

Oh. I see now that the warning is tailored to isclose, which deals with individual inputs (scalars) - no issue there. Then only allclose needs rewording, as there's room for misinterpreting "on order of 1" as meaning mean(abs(x)) ~= 1, or another aggregate measure of "size", rather than element-wise. The 1e-15 recommendation will only apply to allclose. Yet this too I need to rethink, as I may have confused what I'm pointing out as confusion - comparing array vs array and array vs scalar.

Nope, 1e-15 is right, I confused my confusion, sorry about that.

@pmeier commented Aug 14, 2023

@OverLordGoldDragon

What's the issue with def allclose(..., dtype='float64')? It won't break any old code and will function identically.

IIUC, you want to have a dtype parameter whose value determines the automatic tolerances in case nothing is specified? If so, I wouldn't do it like that:

  1. The dtype parameter means something different in most cases. This could easily lead to confusion.
  2. One can get the dtype from the input arguments.

My suggestion would be something along the lines of

def isclose(a, b, rtol=None, atol=None, ..., dtype_specific_tolerances=False):
    if dtype_specific_tolerances and (rtol is None or atol is None):
        dtype = np.promote_types(a.dtype, b.dtype)
        rtol, atol = get_auto_tolerances(dtype)
    else:
        # fall back to the current defaults for whichever was not given
        if rtol is None:
            rtol = 1e-5
        if atol is None:
            atol = 1e-8

    ...

with get_auto_tolerances(dtype) implementing something like #10161 (comment). This change would be fully BC and you could set dtype_specific_tolerances=True to get the automatic behavior.
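
One plausible shape for get_auto_tolerances (hypothetical - the linked comment may differ) derives both tolerances from the dtype's machine epsilon:

import numpy as np

def get_auto_tolerances(dtype):
    # Scale tolerances off machine epsilon so that float16/float32 get
    # proportionally looser defaults than float64. Factors are illustrative.
    eps = np.finfo(dtype).eps
    return float(np.sqrt(eps)), float(10 * eps)  # (rtol, atol)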

@OverLordGoldDragon (Contributor) commented Aug 14, 2023

@pmeier

The dtype parameter means something different in most cases. This could easily lead to confusion.

I don't follow. I use NumPy a lot and I've never seen it mean anything else. What do you have in mind?

One can get the dtype from the input arguments.

Right, that's what I had in mind for the actual PR. However, I disagree with np.promote_types(a.dtype, b.dtype) - 32 & 64 -> 64, while one of the inputs lacks the information necessary to pass float64's rtol & atol. We'd demote to the lowest instead. If isclose etc. only handles float64, first demote both inputs and then re-promote; the information is the same. I'd also prefer a shorter name for the boolean.

@pmeier commented Aug 14, 2023

I don't follow. I use NumPy a lot and I've never seen it mean anything else. What do you have in mind?

Usually dtype is used in creation and conversion ops. And there it means "give me the result with this dtype". Here you want to use dtype as a selector for a different parameter.

However, I disagree with np.promote_types(a.dtype, b.dtype) - 32 & 64 -> 64, while one of inputs lacks the information necessary to pass 64's rtol & atol. We'd demote to lowest instead.

I'm sorry, but I don't follow. Could you compile an example where you think this would be a problem?

The idea here is to go with the stricter tolerances by default in case the dtypes mismatch, to avoid silent errors. Meaning, if you put in float32 and float64, by default they will be compared as if we had two float64. If the user didn't think about it, this operation might fail, but then they have instant feedback to change it: either convert the float64 to float32 as well or set the tolerances manually. The other way around, we could get silent passes, which can be much worse in a testing setting.
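
For reference, promotion picks the wider type, so mixed inputs would be held to the stricter (float64) tolerances:

import numpy as np
print(np.promote_types(np.float32, np.float64))  # float64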

@OverLordGoldDragon (Contributor)

Ok, agreed on both, thanks for explaining.
