
Sometimes, suppress_warnings misses one of its attributes #8413

Closed · olebole opened this issue Dec 23, 2016 · 62 comments

Comments

@olebole commented Dec 23, 2016

When trying to compile skimage, I sometimes get the following error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/build/skimage-0.12.3/debian/tmp/usr/lib/python2.7/dist-packages/skimage/transform/tests/test_integral.py", line 46, in test_vectorized_integrate
    assert_equal(expected, integrate(s, r0, c0, r1, c1))  # test deprecated
  File "/build/skimage-0.12.3/debian/tmp/usr/lib/python2.7/dist-packages/skimage/transform/integral.py", line 86, in integrate
    warn("The syntax 'integrate(ii, r0, c0, r1, c1)' is "
  File "/build/skimage-0.12.3/debian/tmp/usr/lib/python2.7/dist-packages/skimage/_shared/_warnings.py", line 16, in warn
    warnings.warn(message, stacklevel=stacklevel)
  File "/usr/lib/python2.7/dist-packages/numpy/testing/utils.py", line 2199, in _showwarning
    self._orig_show(message, category, filename, lineno,
AttributeError: 'suppress_warnings' object has no attribute '_orig_show'

I assume this is a numpy problem, but I am not sure.

@seberg (Member) commented Dec 23, 2016

Hmmm, a bit odd. Is there some (strange) threading going on in the skimage tests? That would break warning testing pretty badly, since warning handling is not thread-safe in Python. I have some difficulty seeing how else this could happen, but maybe we should look at the exact code the test runs.

@seberg (Member) commented Dec 23, 2016

Since there is no assert_warns or the like here, I guess you should only have a single suppress_warnings context alive (which would be the outermost scope, created by numpy's test runner, assuming skimage ends up using it). Now, what confuses me is that _orig_show not being defined should only be possible if the context has already been exited, and at that point warnings.showwarning should already have been reset to the old value.

Of course the whole warnings stuff breaks down if you have threads. For example:

thread1: enters warning context -> replaces normal warning printing
thread2: enters warning context -> replaces thread1 warning handler
thread1: exits warning context -> resets to normal warning printing
thread2: exits warning context -> resets to thread1's warning handler -> kaboom.
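A minimal sketch of that interleaving, without real threads: entering two suppress_warnings contexts and exiting them in the wrong order (by calling __enter__/__exit__ by hand) should reproduce the same breakage, assuming the NumPy 1.12-era implementation that deletes _orig_show on exit.

import warnings
import numpy as np

ctx1 = np.testing.suppress_warnings()   # plays the role of "thread1"
ctx2 = np.testing.suppress_warnings()   # plays the role of "thread2"

ctx1.__enter__()                 # replaces the normal warning printing
ctx2.__enter__()                 # replaces ctx1's handler
ctx1.__exit__(None, None, None)  # resets to normal printing, deletes ctx1._orig_show
ctx2.__exit__(None, None, None)  # resets to ctx1's now-torn-down handler

warnings.warn("kaboom")  # -> AttributeError: 'suppress_warnings' object has no attribute '_orig_show'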

Btw., I see you have some "try to clean up after __warning_registry__" stuff in skimage; suppress_warnings is a context manager that tries to solve a similar issue (and adds some other things), which may or may not be interesting to you.

@olebole (Author) commented Dec 23, 2016

I just hit the problem when I tried to build skimage for Debian, and I have no idea here. However, I opened scikit-image/scikit-image#2412 to get them involved.

@olebole (Author) commented Dec 23, 2016

Just for completeness: sometimes I even get a stack trace without any involvement of skimage:

ERROR: test suite for <module 'skimage.transform.tests' from '/build/skimage-0.12.3/debian/tmp/usr/lib/python3/dist-packages/skimage/transform/tests/__init__.py'>
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 229, in run
    self.tearDown()
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 352, in tearDown
    self.teardownContext(ancestor)
  File "/usr/lib/python3/dist-packages/nose/suite.py", line 368, in teardownContext
    try_run(context, names)
  File "/usr/lib/python3/dist-packages/nose/util.py", line 453, in try_run
    inspect.getargspec(func)
  File "/usr/lib/python3.5/inspect.py", line 1040, in getargspec
    stacklevel=2)
  File "/usr/lib/python3/dist-packages/numpy/testing/utils.py", line 2199, in _showwarning
    self._orig_show(message, category, filename, lineno,
AttributeError: 'suppress_warnings' object has no attribute '_orig_show'

@charris (Member) commented Dec 23, 2016

What versions of python, numpy, etc., are you compiling with?

@olebole (Author) commented Dec 23, 2016

NumPy 1.12 beta 1, Python 2.7 and 3.5.
(Sorry, my mistake: I didn't check the RC yet.)

@seberg (Member) commented Dec 23, 2016

Hmm, I am confused... I can't see why there would be threading problems, but I also can't really fathom this error occurring without either a race condition or incorrect nesting of catch_warnings-like constructs (including suppress_warnings).

@olebole (Author) commented Dec 23, 2016

I would also think it is a race condition, since it does not always happen. Running the build twice under exactly the same environment makes the problem pop up in different places (or not at all).

@seberg (Member) commented Dec 23, 2016

@olebole how exactly are you running the test suite?

@olebole (Author) commented Dec 23, 2016

Copied from our test suite:

#!/bin/sh
set -efu

pys="$(pyversions -rv 2>/dev/null)"
pkgbuild=${pkgbuild:-no}

srcdir=$PWD

for py in $pys; do
    echo "=== python$py ==="
    if [ "$pkgbuild" = "yes" ]; then
        export PYTHONPATH="$srcdir/debian/tmp/usr/lib/python$py/dist-packages"
        cd "$srcdir/build/"
    else
        cd "$ADTTMP"
    fi

    xvfb-run -a python$py /usr/bin/nosetests -s -v --exclude test_tools.py skimage 2>&1
done

The xvfb-run is there since the test needs to be run in an X11 environment.

@seberg (Member) commented Dec 23, 2016

Hmmmpf, does anyone know what exactly happens under the hood when nose starts to test a module? Nose confuses me, and I can't see why it would end up also messing around with warnings... Or it is a bug in suppress_warnings after all, but I can't really see it, heh.

@sciunto commented Dec 23, 2016

In my nosetests man page, I have

       --processes=NUM
              Spread  test run among this many processes. Set a number equal to the number of processors or cores in your machine for best results. Pass a negative
              number to have the number of processes automatically set to the number of cores. Passing 0 means to disable parallel testing.  Default  is  0  unless
              NOSE_PROCESSES is set. [NOSE_PROCESSES]

It seems that there is no parallel testing by default, so at least that should not be the source of a race condition.

@seberg (Member) commented Dec 23, 2016

Is there any remote chance that nose runs this teardownContext based on garbage collection stuff?!

@seberg (Member) commented Dec 23, 2016

No, I am probably being silly. The suppress_warnings context resets warnings.showwarning before it deletes the attribute, so I doubt even the gc kicking in could cause this without something else going on. I just have no idea what else :).

@charris (Member) commented Dec 23, 2016

Well, I see 33 errors with skimage-0.9.3, most from PIL or indexing, but none for suppress_warnings. How do you deal with all the other errors?

@sciunto commented Dec 23, 2016

@charris The issue is reported for 0.12.3, not 0.9.3. :)

@charris (Member) commented Dec 23, 2016

OK, finally got that pulled down from upstream; now 39 errors and a ton of deprecations. Most of the warnings seem due to the fact that the tests think they are running in Qt instead of Wayland, which might be a configuration problem here. I also need to run the tests as

python -c'import skimage; skimage.test()'

because nosetests doesn't work at all.

@sciunto commented Dec 23, 2016

Deprecations are normal: we still check our old API, even when a deprecation is expected. The errors are not normal, though. Would you mind reporting them on our (scikit-image) bug tracker, please?

@charris (Member) commented Dec 23, 2016

@sciunto The docs could use instructions for testing locally.

@charris (Member) commented Dec 23, 2016

OK, I see it in the same place. The test is

# x and s (the integral image of x) are defined at module level in test_integral.py
def test_vectorized_integrate():
    r0 = np.array([12, 0, 0, 10, 0, 10, 30])
    c0 = np.array([10, 0, 10, 0, 0, 10, 31])
    r1 = np.array([23, 19, 19, 19, 0, 10, 49])
    c1 = np.array([19, 19, 19, 19, 0, 10, 49])

    expected = np.array([x[12:24, 10:20].sum(),
                         x[:20, :20].sum(),
                         x[:20, 10:20].sum(),
                         x[10:20, :20].sum(),
                         x[0,0],
                         x[10, 10],
                         x[30:, 31:].sum()])
    start_pts = [(r0[i], c0[i]) for i in range(len(r0))]
    end_pts = [(r1[i], c1[i]) for i in range(len(r0))]
    assert_equal(expected, integrate(s, r0, c0, r1, c1))  # test deprecated
    assert_equal(expected, integrate(s, start_pts, end_pts))

The # test deprecated comment suggests to me that perhaps the test needs some fixing, but suppress_warnings should still fail more gracefully.
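One hedged way the test could be tightened (just a sketch, not the actual skimage fix) would be to assert the deprecation instead of letting the warning escape to whatever handler happens to be installed, e.g. using skimage's own expected_warnings context manager:

from skimage._shared._warnings import expected_warnings

with expected_warnings(['deprecated']):
    assert_equal(expected, integrate(s, r0, c0, r1, c1))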

@charris (Member) commented Dec 23, 2016

A bit more information: running

$ python skimage/transform/tests/test_integral.py

which uses NumPy's run_module_suite, doesn't seem to fail, but issues the warning:

======================================================================
ERROR: skimage.transform.tests.test_integral.test_vectorized_integrate
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/charris/Workspace/scikit-image/skimage/transform/tests/test_integral.py", line 46, in test_vectorized_integrate
    assert_equal(expected, integrate(s, r0, c0, r1, c1))  # test deprecated
  File "/home/charris/Workspace/scikit-image/skimage/transform/integral.py", line 86, in integrate
    warn("The syntax 'integrate(ii, r0, c0, r1, c1)' is "
  File "/home/charris/Workspace/scikit-image/skimage/_shared/_warnings.py", line 16, in warn
    warnings.warn(message, stacklevel=stacklevel)
UserWarning: The syntax 'integrate(ii, r0, c0, r1, c1)' is deprecated, and will be phased out in release 0.14. The new syntax is 'integrate(ii, (r0, c0), (r1, c1))'.

@charris (Member) commented Dec 23, 2016

Note that scikit-image has its own context manager for warnings in skimage/_shared/_warnings.py. That may be in conflict with suppress_warnings.

Out of curiosity, I wonder if it makes a difference to explicitly specify DeprecationWarning instead of using the default UserWarning.
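For concreteness, that change would look something like the following in skimage's warn wrapper (a hypothetical tweak only, not a confirmed fix; the actual wrapper's signature may differ):

import warnings

def warn(message, category=DeprecationWarning, stacklevel=2):
    warnings.warn(message, category=category, stacklevel=stacklevel)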

@charris (Member) commented Dec 23, 2016

So I don't see errors with

$ nosetests-2.7 skimage/transform/tests/test_integral.py |& grep orig_show

Whereas

$ nosetests-2.7 skimage/transform/tests/ |& grep orig_show

shows it about 25% of the time. That suggests that the source of the error lies elsewhere and the failing test just exposes it.

@charris (Member) commented Dec 23, 2016

In particular, skimage/transform/tests/test_geometric.py contains the warnings context manager expected_warnings and that makes me suspicious.

@charris (Member) commented Dec 24, 2016

OK, now I suspect the test_parallel decorator, https://github.com/scikit-image/scikit-image/blob/master/skimage/_shared/testing.py . That defaults to two threads.
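For readers following along, this is roughly the shape of a test_parallel-style decorator (a simplified sketch, not skimage's exact code; names and defaults are approximate): the wrapped test body is executed concurrently in extra threads, so any warning context entered inside the test is entered and exited from several threads at once.

import functools
import threading

def test_parallel(num_threads=2):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # run the test body in num_threads - 1 extra threads ...
            threads = [threading.Thread(target=func, args=args, kwargs=kwargs)
                       for _ in range(num_threads - 1)]
            for t in threads:
                t.start()
            # ... and once more in the calling thread
            result = func(*args, **kwargs)
            for t in threads:
                t.join()
            return result
        return wrapper
    return decorator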

@charris (Member) commented Dec 24, 2016

If I remove the three files importing test_parallel, there is no longer a problem.

EDIT: And now I can't reproduce the problem at all. Hmm... Maybe it also depends on what else is running on the machine.

@seberg (Member) commented Dec 24, 2016

Hmmm, test_parallel could be it, I guess, though looking through the code, none of the functions obviously uses a warning context; it's not impossible, though.

@charris (Member) commented Dec 24, 2016

Something in test_hough_transform.py seems to be the initiator. With that file removed there is no error.

EDIT: Perhaps because it precedes the test_integral.py module?

@charris (Member) commented Dec 26, 2016

Hmm, putting it all in isnan is an interesting thought, but I suspect it might cause trouble at this time; maybe later? @njsmith @juliantaylor Thoughts?

@seberg (Member) commented Dec 26, 2016

@shoyer may have an opinion as well.

I just implemented isnat, but I am not sure about all the timing with respect to the release. If we have to fix this in a minimal way, the better option is probably to plug a Python isnat version into the test suite, though fixing it may turn out to be a bit tricky in any case (at the very least others will get more warnings than before, though maybe few 1. use our test-suite machinery and 2. actually test warnings carefully).
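A hedged sketch of what such a private, pure-Python isnat fallback for the test suite could look like (illustrative only; it leans on NaT's internal representation as the smallest int64, and the real np.isnat ufunc is the proper solution):

import numpy as np

def _isnat(arr):
    """Return a boolean array that is True where datetime64/timedelta64 values are NaT."""
    arr = np.asarray(arr)
    if arr.dtype.kind not in "mM":          # not a datetime/timedelta array
        return np.zeros(arr.shape, dtype=bool)
    # NaT is stored internally as the minimum int64 value, so compare the raw
    # integer view instead of the datetime values (avoids NaT comparison warnings).
    return arr.view("i8") == np.iinfo(np.int64).min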

@charris (Member) commented Dec 26, 2016

I don't consider the failure terribly severe, but it would be nice to have it fixed. A (private?) Python version of isnat would be fine with me for 1.12.

@charris (Member) commented Dec 26, 2016

The suppress_warnings docstring should also mention that it is not thread safe.

@seberg (Member) commented Dec 26, 2016

It does :)

@sandrotosi (Contributor)

Thanks a lot, guys, for jumping on the issue during the holidays and for the thorough debugging.

Is there anything I can do to implement/test a solution for this? Debian is eager to get a fix for this :)

@seberg (Member) commented Dec 29, 2016

@sandrotosi, unfortunately it is a bit tricky. The simplest fix may be to simply not use the parallel testing stuff in skimage, but that somewhat defeats the purpose. We could fairly easily do something like my isnat approach (using a private Python version of isnat). However, I am not quite certain it would not create test regressions elsewhere :/.

Adding a mutex for those two cases may actually be a plausible hack: it should remove the trouble and seems unlikely to go bad (since you would not call assert_equal or similar from inside assert_equal). I would not actually use that in numpy master, but as a minimal bugfix for 1.12 it may be a real option, and it would not interfere with the skimage tests themselves. A rough sketch of the idea follows.
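A hedged sketch of the mutex idea (the actual change is in gh-8427; the names here are illustrative only): serialize the short warning-filter manipulation that the comparison asserts perform internally, so concurrent assert_equal calls from parallel tests cannot interleave their showwarning/filter swaps.

import threading
import warnings

_comparison_lock = threading.Lock()  # hypothetical module-level lock in numpy.testing

def _filtered_comparison(a, b):
    """Stand-in for the warning-suppressed comparison inside assert_equal."""
    with _comparison_lock:                  # only one thread swaps filters at a time
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", FutureWarning)  # e.g. NaT comparison warnings
            return a == b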

@sandrotosi (Contributor)

@olebole would disabling parallel testing in skimage be an acceptable temporary solution?

@seberg yeah, I think the target might be to have a minimal fix for 1.12 and eventually address it in a more complete/comprehensive way in master - that would at least let the next Debian release ship NumPy without this issue.

@seberg (Member) commented Dec 29, 2016

@sandrotosi I tried the lock approach in gh-8427. I am not sure whether or not the idea is nuts, but if someone wants to try it....

@shoyer (Member) commented Dec 29, 2016

Is there some way we could simply modify suppress_warnings to have undefined behavior but not crash when used in a multi-threaded fashion?

(I have some ideas, but I'm still thinking through how exactly it could work.)

@seberg (Member) commented Dec 29, 2016

Sure, we could just not delete the attribute, but that might create bugs in the tests later on... Though I guess most test suites are not as nitpicky as numpy about testing warnings, so....
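A toy illustration of that "degrade gracefully instead of crashing" idea (this is not numpy's class, just a self-contained sketch): keep the saved handler around on exit, and fall back to the default printer if it is ever missing, so a stale handler left installed by another thread still has something valid to call.

import warnings

_DEFAULT_SHOW = warnings.showwarning  # captured at import time as a last-resort fallback

class tolerant_suppress_warnings:
    def __enter__(self):
        self._orig_show = warnings.showwarning
        warnings.showwarning = self._showwarning
        return self

    def __exit__(self, *exc_info):
        warnings.showwarning = self._orig_show
        # Deliberately *not* deleting self._orig_show here; behavior under
        # threading is still undefined, but it no longer raises AttributeError.

    def _showwarning(self, message, category, filename, lineno, *args, **kwargs):
        getattr(self, "_orig_show", _DEFAULT_SHOW)(
            message, category, filename, lineno, *args, **kwargs)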

seberg added a commit to seberg/numpy that referenced this issue Dec 30, 2016
This lock prevents unsafe warning filter manipulations during testing
if downstream packages do parallel testing.

Warning filtering is generally not thread-safe in Python; this is also
true for `catch_warnings` or `suppress_warnings`. In NumPy 1.12,
however, `assert_equal` and the array comparison asserts use this
to filter out some comparison warnings. Since removing this filter
may also affect downstream projects, and skimage (and possibly more)
do manual parallel testing using `assert_equal`, a quick fix
seems to be to lock the less obvious threading trap. Ideally (in
master this is the case), there should simply not be warning filter
logic in the assert functions themselves.
While probably not perfectly safe in principle, it is sufficient
in the case of skimage and probably most testing scenarios, and the
chance of deadlocks seems very unlikely.

Closes numpy gh-8413
@sandrotosi (Contributor)

I've uploaded 1.12.0rc2 (which contains #8427) to Debian and rebuilt skimage 3 times (a slightly old version of the Debian package, without the test suite completely disabled), and every time it built successfully.

Thanks a lot, guys, for working on this during the holidays!

Now, any plans for the final 1.12.0 release? :)

@charris (Member) commented Jan 6, 2017

I plan on making the final release Jan 15.

@olebole (Author) commented Jan 7, 2017

I can confirm that it works with rc2. Thank you very much for your efforts!

@charris (Member) commented Jan 7, 2017

I'm leaving this open until it is fixed in master.

@charris (Member) commented May 5, 2017

Hmm, not fixed in master. @seberg Is it correct that #8421 will close this?

@charris (Member) commented May 7, 2017

Fixed by #8421.

@gregorybchris

I'm using numpy version 1.23.5 and seeing AttributeError: 'suppress_warnings' object has no attribute '_orig_show' when suppressing warnings in a multithreaded context.

Just wanted to share some code here to reproduce this behavior since resources are sparse elsewhere. I ended up resolving my issue by not using the suppress_warnings context manager while multithreading.

import warnings
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def compute_metrics(iteration: int):
    # Every worker thread enters and exits suppress_warnings concurrently; the
    # interleaved swaps of warnings.showwarning intermittently leave a stale
    # handler installed, triggering the '_orig_show' AttributeError.
    with np.testing.suppress_warnings():
        warnings.warn("Test warning")
    return f"thread {iteration} done"

with ThreadPoolExecutor(max_workers=10) as executor:
    futures = []
    for i in range(100):
        future = executor.submit(compute_metrics, i)
        futures.append(future)
    for future in futures:
        result = future.result()
        print(result)

@seberg (Member) commented Sep 4, 2023

It's documented as not thread-safe, and neither is warnings.catch_warnings to my knowledge (Python would need to grow thread-safe warning contexts). It is also meant for testing, to work around problems that were fixed in Python years ago, and it is, or should be considered, deprecated (yes, it can do a bit more, but its main reason to exist doesn't matter much anymore).
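A hedged sketch of a workaround in the spirit of this thread (an assumed usage pattern, not an official recommendation): configure the warning filter once in the main thread before spawning workers, so no per-thread handler swapping is needed at all.

import warnings
from concurrent.futures import ThreadPoolExecutor

# Set the filter once, up front; worker threads only read it.
warnings.filterwarnings("ignore", message="Test warning")

def compute_metrics(iteration: int) -> str:
    warnings.warn("Test warning")  # ignored by the filter, no showwarning swap involved
    return f"thread {iteration} done"

with ThreadPoolExecutor(max_workers=10) as executor:
    for result in executor.map(compute_metrics, range(100)):
        print(result)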
