Merge pull request #3827 from embray/fits/issue-3827
Clarify issues with opening lots of FITS files in FAQ
embray committed Aug 11, 2015
1 parent c30b7e3 commit 239508d
Showing 2 changed files with 66 additions and 1 deletion.
63 changes: 63 additions & 0 deletions docs/io/fits/appendix/faq.rst
@@ -527,6 +527,69 @@ with FITS tables some users might find the ``fitsio`` library more to their
liking.


I'm opening many FITS files in a loop and getting OSError: Too many open files
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Say you have some code like:

.. doctest-skip::

    >>> from astropy.io import fits
    >>> for filename in filenames:
    ...     hdul = fits.open(filename)
    ...     for hdu in hdul:
    ...         hdu_data = hdu.data
    ...         # Do some stuff with the data
    ...     hdul.close()
    ...

The details may differ, but the qualitative point is that the data of many
HDUs and/or FITS files is being accessed in a loop. This may result in
an exception like::

    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    OSError: [Errno 24] Too many open files: 'my_data.fits'

As explained in the :ref:`note on working with large files <fits-large-files>`,
Astropy uses mmap by default to read the data in a FITS file. As a result, even
if you correctly close a file with
`HDUList.close <astropy.io.fits.HDUList.close>`, a handle to that file is kept
open so that the memory-mapped data array can continue to be read
transparently.
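
The underlying behavior can be seen without Astropy at all. The following is a
minimal sketch using only the standard library :mod:`mmap` module; the scratch
file and its contents are made up for illustration and stand in for a FITS
file. It shows that, in CPython, a memory map remains readable after the file
object it was created from has been closed, because the map holds its own
handle to the file:

.. code-block:: python

    import mmap
    import os
    import tempfile

    # Create a small scratch file to map (hypothetical stand-in for a FITS file).
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"SIMPLE  =                    T")

    f = open(path, "rb")
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    f.close()  # closes the Python file object only

    # The mapping keeps its own handle, so the bytes are still readable.
    first_bytes = mapped[:6]

    # The handle is released only once the mapping itself is closed.
    mapped.close()
    os.remove(path)

Until ``mapped.close()`` is called (or the mapping object is freed), the
process still counts one open handle for the file.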

The way NumPy supports mmap is such that the file mapping is not closed until
the overlying `~numpy.ndarray` object no longer has any references to it and
its memory has been freed. However, when looping over a large number of files
(or even just HDUs) rapidly, this may not happen immediately. In some cases,
if the HDU object persists, the data array attached to it may persist too. The
easiest workaround is to *manually* delete the ``.data`` attribute on the HDU
object so that the `~numpy.ndarray` reference is freed and the mmap can be
closed:

.. doctest-skip::

    >>> from astropy.io import fits
    >>> for filename in filenames:
    ...     hdul = fits.open(filename)
    ...     for hdu in hdul:
    ...         hdu_data = hdu.data
    ...         # Do some stuff with the data
    ...         # ...
    ...         # Don't need the data anymore; delete all references to it
    ...         # so that it can be garbage collected
    ...         del hdu_data
    ...         del hdu.data
    ...     hdul.close()
    ...

In some extreme cases, files are opened and closed fast enough that Python's
garbage collector does not free them (and hence free the file handles) often
enough. To mitigate this, your code can manually force a garbage collection
by calling :func:`gc.collect` at the end of the loop.
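
One way an object can outlive its last visible reference is a reference cycle,
which CPython reclaims only when the cyclic garbage collector runs. The
following sketch illustrates that general mechanism, not Astropy internals:
the ``Holder`` class is hypothetical, and automatic collection is disabled
only to make the effect deterministic.

.. code-block:: python

    import gc
    import weakref


    class Holder:
        """Hypothetical stand-in for an object that owns a resource."""

        def __init__(self):
            # A reference to itself forms a cycle, so reference counting
            # alone can never free this object.
            self.cycle = self


    gc.disable()  # make the demonstration deterministic
    try:
        holder = Holder()
        probe = weakref.ref(holder)
        del holder

        # Still alive: the cycle keeps the reference count above zero.
        alive_before = probe() is not None
        # Force a full collection, which finds and frees the cycle.
        gc.collect()
        alive_after = probe() is not None
    finally:
        gc.enable()

In the FITS loop, the corresponding change would be a single ``gc.collect()``
call as the last statement of each iteration, after ``hdul.close()``.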

In a future release it will be easier to automatically perform this sort of
cleanup when closing FITS files, where needed.


Comparison with Other FITS Readers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

4 changes: 3 additions & 1 deletion docs/io/fits/index.rst
@@ -78,6 +78,8 @@ The headers will still be accessible after the HDUList is closed. The data may
or may not be accessible depending on whether the data are touched and if they
are memory-mapped, see later chapters for detail.

.. _fits-large-files:

Working with large files
""""""""""""""""""""""""

@@ -98,7 +100,7 @@ because by that point you're likely to run out of physical memory anyways), but
is opened by mmap. This means that even after calling ``hdul.close()`` the mmap still
holds an open handle to the data so that it can still be accessed by unwary programs
that were built with the assumption that the ``.data`` attribute has all the data in memory.

In order to force the mmap to close, either wait for the containing ``HDUList`` object to go
out of scope, or manually call ``del hdul[0].data`` (this works so long as there are no other
references held to the data array).
