Saving large table FITS may lead to a truncated file #4307
The script result given above was obtained using astropy 1.0.5. With the development version of astropy on my laptop, I have the problem of truncated files when “opening and saving”, but I cannot run the script.
What platform is this on?
Never mind, I saw the thread on the mailing list about this. It occurs to me that this might be related to #1380. What happens if you open the file with
Hi Erik,
What traceback do you get with memmap=False? What about
Using memmap=False leads to the same traceback. But if I open the file in update mode and save it, the resulting file is not truncated.
io/fits/util.py _write_string doesn't make sure the full dataset has been written.
@juliantaylor - ah, does it need a
no it needs:
and technically EINTR handling if Python < 3.5
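For illustration only, here is a minimal sketch of the kind of loop that makes sure a full buffer reaches the file. It is not the code from this thread or from astropy: it assumes a raw file descriptor and `os.write`, which returns the number of bytes actually written, and `write_all` is a hypothetical helper name.

```python
import errno
import os


def write_all(fd, data):
    """Write every byte of `data` to file descriptor `fd` (hypothetical helper).

    os.write may write fewer bytes than requested, so keep calling it
    until the whole buffer has been written.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        try:
            written = os.write(fd, view[total:])
        except OSError as exc:
            # Before Python 3.5 (PEP 475) an interrupted system call
            # surfaces as EINTR and has to be retried by the caller.
            if exc.errno == errno.EINTR:
                continue
            raise
        if written == 0:
            # Guard against looping forever if nothing can be written.
            raise IOError("write returned 0 bytes; output may be truncated")
        total += written
    return total
```

The point of the loop is that a short write becomes either a retry or an explicit error, rather than a silently truncated file.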
@yannick1974 That doesn't sound right, since the last line in the traceback above is in code that is never entered if memmap=False.
@juliantaylor That seems right, though that code isn't used except in a few smaller writes--it is not used to write data arrays--those generally go through
it is used by the posted script
Oh, well it's not supposed to be. That appears to be a bug.
…file to be skipped over on a direct copy write
2870a1a to df93272
I think the attached fix should work around the main problem with truncation--how does this look for you, @yannick1974?
the write_string issue should also be fixed; it's a bug even when it's no longer triggered by the code with this fix
the fix works for me
Agreed! I might as well go ahead and add that to this PR.
Hi Erik,
@juliantaylor @yannick1974 Excellent, thanks for confirming.
@embray - can you add a changelog entry and merge?
… that the original problem is fixed this is unlikely to ever be an issue, since _write_string should mostly just be used for writing headers and other smaller strings). Added changelog entry for 1.0.7.
…use file.write does not have a return value. Will reconsider something like this later if the need arises.
While handling large catalogue FITS files, I sometimes encounter problems when saving them: the newly saved file is truncated, yet no error is raised at saving time. For instance, if I open a 37 GiB file and save it straight away, the result is a 2 GiB file.
I tried to write a script to generate a big FITS file and expose the bug. While doing this, I found that if I read a big file and count the rows before saving it, the newly saved file is OK.
Here is the script. Of course, you need a lot of memory and a lot of disk space to run it.
The output of the script is:
The main problem is that astropy does not raise an error when the saving leads to a truncated file. The second problem is that astropy should be able to handle large files.
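Until the library reports the failure itself, a caller can at least detect it after the fact. The sketch below is an assumption-laden sanity check, not part of astropy: `check_saved_size` is a hypothetical helper, and it assumes that an unmodified open-and-save round trip should not produce a smaller file than the source. It also relies on the fact that a FITS file is made of 2880-byte blocks.

```python
import os

FITS_BLOCK_SIZE = 2880  # FITS files are padded to a multiple of 2880 bytes


def check_saved_size(source_path, saved_path):
    """Raise IOError if `saved_path` looks truncated (hypothetical helper).

    Assumes the saved file is an unmodified copy of the source, so it
    should not be smaller than the source file.
    """
    source_size = os.path.getsize(source_path)
    saved_size = os.path.getsize(saved_path)
    if saved_size % FITS_BLOCK_SIZE != 0:
        raise IOError(
            "%s is %d bytes, not a multiple of %d"
            % (saved_path, saved_size, FITS_BLOCK_SIZE)
        )
    if saved_size < source_size:
        raise IOError(
            "%s is %d bytes but the source is %d bytes"
            % (saved_path, saved_size, source_size)
        )
    return saved_size
```

Calling this right after saving turns a silent truncation such as the 37 GiB to 2 GiB case into an immediate error.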
I have no idea how to debug this, but I'm willing to help.
Yannick