ERROR: test_big_arrays (test_io.TestSavezLoad) on OS X + Python 3.3 #3858

Closed
rgommers opened this issue Oct 3, 2013 · 29 comments

Comments

@rgommers
Member

rgommers commented Oct 3, 2013

Reported by Piet van Oostrum on the mailing list against 1.8.0rc1 on OS X with Python 3.3:

======================================================================
ERROR: test_big_arrays (test_io.TestSavezLoad)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/numpy/testing/decorators.py", line 146, in skipper_func
    return f(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/numpy/lib/tests/test_io.py", line 149, in test_big_arrays
    np.savez(tmp, a=a)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/numpy/lib/npyio.py", line 530, in savez
    _savez(file, args, kwds, False)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/numpy/lib/npyio.py", line 589, in _savez
    format.write_array(fid, np.asanyarray(val))
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/numpy/lib/format.py", line 417, in write_array
    fp.write(array.tostring('C'))
OSError: [Errno 22] Invalid argument
@rgommers
Member Author

rgommers commented Oct 3, 2013

I can reproduce this. Looks like a Python 3.x bug.

import os
import sys
import time


if sys.maxsize > 2**32:
    print('64-bit')
else:
    print('32-bit, exiting')
    sys.exit(0)

fname = 'write_large_bytestring.txt'
tmp = open(fname, 'wb')
try:
    L = (1 << 31) + 100000
    tmp.write(b'abc' * 2**32)
finally:
    tmp.close()
    os.remove(fname)
    print('Elapsed time: %s s' % time.clock())

The above works with Python 2.7 but not with 3.3:

$ python tmp3.py 
64-bit
Elapsed time: 7.896957 s

$ python3.3 tmp3.py 
64-bit
Elapsed time: 50.149956 s
Traceback (most recent call last):
  File "tmp3.py", line 16, in <module>
    tmp.write(b'abc' * 2**32)
OSError: [Errno 22] Invalid argument

$ ulimit
unlimited

Both Pythons were installed from the dmgs on python.org. I can't find an issue for this on bugs.python.org, but IIRC the io module was completely rewritten.

@rgommers
Member Author

rgommers commented Oct 3, 2013

Test introduced in gh-2942.

@pv
Member

pv commented Oct 3, 2013

Or maybe this is another OSX I/O bug? Remember, OSX libc is buggy and has issues in fwrite/fread when dealing with data blocks close to 2**32, which we had to work around in tofile/fromfile...

@njsmith
Member

njsmith commented Oct 3, 2013

In any case, we obviously have to work around it by splitting up the write
into smaller chunks, right?

(Even if it's ultimately libc's fault, python should probably work around
it itself - possibly 2.7 had code to do this that got lost in the
transition.)

@pv
Member

pv commented Oct 3, 2013

Re: gh-574, gh-2806, and gh-3473: the OS X version in question may be relevant. Maybe it now fails rather than writing garbage as it did previously?

@pv
Member

pv commented Oct 3, 2013

Yes, we can work around it by chunking. Maybe this issue should also be forwarded to Python devs, so that they could also implement chunking themselves...
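
For reference, a minimal sketch of the chunking workaround discussed above. The helper name `write_in_chunks` and the default `chunk_size` of 2**30 bytes are assumptions made for illustration; this is not the code NumPy or CPython ended up using, and it assumes `fp` is a binary file object whose `write` reports how many bytes it accepted.

```python
import io


def write_in_chunks(fp, data, chunk_size=2**30):
    """Write a bytes-like object to fp in pieces of at most chunk_size bytes.

    Hypothetical helper: chunk_size is an assumed value chosen to stay
    well below the 2**31 / 2**32 sizes that trigger the failure here.
    """
    view = memoryview(data).cast('B')  # byte view, no copy of the payload
    offset = 0
    while offset < len(view):
        written = fp.write(view[offset:offset + chunk_size])
        if not written:
            raise OSError('write made no progress')
        offset += written  # tolerate short writes from raw streams
    return offset


# Small-scale usage: the same loop with a tiny chunk size.
buf = io.BytesIO()
total = write_in_chunks(buf, b'abcdefghij', chunk_size=3)
```

Slicing a `memoryview` avoids copying each chunk, which matters when the payload is several gigabytes.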

@rgommers
Member Author

rgommers commented Oct 3, 2013

Ah, forgot about those issues. Tried to test on my 10.6 machine, but there the same script just hangs. Could be due to the hardware though, it's an ancient machine.

@charris
Member

charris commented Oct 6, 2013

Is this a 1.8.0 blocker? I'm going to put it there just so it isn't forgotten, it can always be removed. Does anyone know if it works for Python 3.2?

@charris
Member

charris commented Oct 6, 2013

Also, IIRC, we've only chunked reads, so it may be that any test that writes a large file is broken.
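
A chunked read can be sketched along the following lines. The name `read_in_chunks` and the default chunk size are hypothetical, and this is not NumPy's actual `fromfile` workaround; it only illustrates the idea of never asking the OS for more than a bounded number of bytes per call.

```python
import io


def read_in_chunks(fp, size, chunk_size=2**30):
    """Read exactly size bytes from fp, at most chunk_size bytes per call.

    Hypothetical helper; assumes fp is a binary file object that
    supports readinto().
    """
    buf = bytearray(size)
    view = memoryview(buf)
    offset = 0
    while offset < size:
        n = fp.readinto(view[offset:offset + chunk_size])
        if not n:
            raise EOFError('file ended after %d of %d bytes' % (offset, size))
        offset += n
    return buf


# Small-scale usage with a tiny chunk size.
data = read_in_chunks(io.BytesIO(b'abcdefghij'), 10, chunk_size=3)
```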

@rgommers
Member Author

rgommers commented Oct 6, 2013

I wouldn't hold up the release for this one; it's not a regression. Mark it knownfail in the 1.8.x branch, though, if it's not fixed before the release.

@charris
Member

charris commented Oct 6, 2013

@rgommers Does it fail only with OSX and 3.3? You say that it works for Python 2.7, is that correct?

@rgommers
Member Author

rgommers commented Oct 6, 2013

It doesn't fail for 2.7, but the test doesn't check that what's written to file is correct. It's likely not to be correct, see other issues that Pauli linked.
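
One way to check that what lands on disk is correct is a round-trip comparison. The sketch below is stdlib-only and uses a hypothetical helper name, `roundtrip_matches`; it is not part of the NumPy test suite, and the payload in the usage line is deliberately small.

```python
import hashlib
import os
import tempfile


def roundtrip_matches(data, chunk_size=2**20):
    """Write data to a temporary file, read it back, and compare digests."""
    expected = hashlib.sha256(data).hexdigest()
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as fp:
            fp.write(data)
        actual = hashlib.sha256()
        with open(path, 'rb') as fp:
            # Read back in bounded blocks so the check itself never
            # issues a single huge read.
            for block in iter(lambda: fp.read(chunk_size), b''):
                actual.update(block)
        return actual.hexdigest() == expected
    finally:
        os.remove(path)


ok = roundtrip_matches(b'abc' * 1000)
```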

@charris
Member

charris commented Oct 6, 2013

@rgommers So OSX in general. I'll leave it open in 1.9-devel to motivate a fix and open an issue.

@charris
Member

charris commented Feb 24, 2014

This is reported fixed in OS X Mavericks. This is probably a wontfix, as the proper fix is to upgrade the OS.

See also #2931.

@charris
Member

charris commented May 5, 2014

Closing, should be fixed by Mavericks. Please reopen if the problem persists.

@charris charris closed this as completed May 5, 2014
@djsutherland

This is still happening for me on Mavericks with numpy 1.8.1 and python 3.4 (also 3.3) from Anaconda; if I comment out the skipif decorator, https://github.com/numpy/numpy/blob/v1.8.1/numpy/lib/tests/test_io.py#L154 fails.

The test passes, and data appears to be loaded correctly, using python 2.7 from Anaconda.

This isn't necessarily a flaw with numpy, but others appear to be working around this or similar issues, e.g. torch/DEPRECEATED-torch7-distro@40e6593 (which is for reading rather than writing, but I'm also unable to read large files on python 3).

@AndreasMadsen

Running the testcase using Python 3.4 with numpy 1.8.1 on Mac OS X 10.9.4 (Mavericks) results in the known OSError: [Errno 22] Invalid argument error. @certik Please reopen issue!

@makmanalp

Another data point - using OSX 10.9.5 (Mavericks) and I get the same issue. I just saw this bug in the python tracker: https://bugs.python.org/issue24658

@charris charris reopened this Jul 20, 2015
@charris
Member

charris commented Jul 20, 2015

Might as well reopen this. I don't know if it will be fixed when Python solves their part, but we will find out.

@adammenges

Happens here, latest macOS, python (3.6), and numpy.

@danlou

danlou commented Sep 26, 2017

FYI Still seeing this bug on latest macOS (10.12.6) and Python 3.5.2 (Anaconda 4.2.0).
Has anyone figured out an upper limit on chunk sizes?

@divyansha

Getting this error as well, macOS Sierra, python 3.6, numpy

@charris
Member

charris commented Nov 29, 2017

There looks to be some motion on the Python issue, but perhaps we should just go ahead and chunk the writes.

@charris charris modified the milestones: 1.9 blockers, 1.15.0 release Nov 29, 2017
@charris
Member

charris commented Jun 7, 2018

Looks like this isn't going to get fixed upstream anytime soon. Anyone know the latest status?

@charris charris removed this from the 1.15.0 release milestone Jun 7, 2018
@charris charris added this to the 1.16.0 release milestone Jun 7, 2018
@matrixise

matrixise commented Oct 19, 2018

Hi all,

Could you test this with the latest versions of Python 3.6, 3.7, and 3.8a? I think I have fixed the issue on OS X with this PR (python/cpython#1705).

I have another PR for 2.7, but that one is not ready yet :/

Thank you for your feedback.

@charris
Member

charris commented Nov 17, 2018

@rgommers Any chance you can test this? Any other feedback on the current status of this would be welcome.

@charris charris modified the milestones: 1.16.0 release, 1.17.0 release Nov 17, 2018
@rgommers
Member Author

The test_big_arrays test passes, but #3858 (comment) still fails for me with the latest Python 3.6 shipped by Anaconda. That probably doesn't have the CPython fix though. No time to build Python myself right now, sorry.

@charris
Member

charris commented May 22, 2019

@rgommers can you revisit this?

@rgommers
Member Author

This is indeed fixed as far as I can tell, at least with Python 3.7 from Anaconda. No other reports either, so closing. Thanks everyone, and @matrixise in particular for fixing this.
