Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve fits primary header when writing out spectrum1d without wcs #1105

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

cshanahan1
Copy link
Contributor

(this is a suuuuper draft pr, just opening to track work done at Spectro 2023 hack day with @kelle)

The current issue (that we have been trying to wrap our heads around in #1096, #905, and has become clearer to us as we've explored this) is that tabular-fits writer, which converts Spectrum1D into a table and uses Table.write to save to file, does not support writing out to the PrimaryHDU. The alternative writer available for Spectrum1D, `wcs1d-fits', sort of does what we want (which is, a primary HDU with a header and a BinTableHDU in the 1st extension with spectrum info) but requires a valid wcs to work (since the initial HDUList is created from this WCS)

The goal of this PR is to bypass the tabular-fits writer to enable writing out meta.header info to the primary HDU when there is NO wcs.

Copy link

codecov bot commented Nov 15, 2023

Codecov Report

Attention: 47 lines in your changes are missing coverage. Please review.

Comparison is base (5098f23) 70.73% compared to head (1b9ccf5) 70.05%.
Report is 3 commits behind head on main.

Files Patch % Lines
specutils/io/default_loaders/mef_tabular_fits.py 12.96% 47 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1105      +/-   ##
==========================================
- Coverage   70.73%   70.05%   -0.69%     
==========================================
  Files          64       65       +1     
  Lines        4483     4538      +55     
==========================================
+ Hits         3171     3179       +8     
- Misses       1312     1359      +47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dhomeier
Copy link
Contributor

dhomeier commented Nov 16, 2023

The current issue (that we have been trying to wrap our heads around in #1096, #905, and has become clearer to us as we've explored this) is that tabular-fits writer, which converts Spectrum1D into a table and uses Table.write to save to file, does not support writing out to the PrimaryHDU.

You cannot write tables to the PrimaryHDU, the standard forbids anything but IMAGE HDUs there. So yes, to store the data there you need a WCS, even if it's just the simplest case of CRVAL1, CRPIX1, CDELT1 for constant sampling.

In principle you could probably get your info written to the Primary header by first creating an empty fits.PrimaryHDU() and then directly write to its header (could possibly use you primary_header_initial directly), then adopting

if update_header:
hdr_types = (str, int, float, complex, bool,
np.floating, np.integer, np.complexfloating, np.bool_)
header.update([keyword for keyword in spectrum.meta.items() if
(isinstance(keyword[1], hdr_types) and
keyword[0] not in ('NAXIS', 'NAXIS1', 'NAXIS2'))])

instead of passing it to Table(..., meta=header). Your approach of passing the table (without a meta) to BinTABLE() looks like the right approach.

@cshanahan1
Copy link
Contributor Author

Ultimately this PR is just a hack around the fact that BinTableHDU cannot be written to the 0th extension, as you mentioned @dhomeier. The only desired behavior (by both what this PR may become and by the existing wcs1d-fits) is to be able to control the placement of header data.

wcs1d-fits is similar and hacks around this limitation by creating an HDUList from a wcs object. It then allows you to choose what extension this WCS header (and other metadata on your Spectrum1D) is written to, but the table that encodes flux/wave/uncert is stored in a non-0th (in fact, i think its ALWAYS the 1st since Spectrum1D.write means writing one spectrum, which belongs in one extension...). It goes through the Table machinery to convert Spectrum1d into a BinTableHDU, but doesn't use Table.write like the simply tabular-fits does.

This PR is basically the same thing, but it is for the case where there is no 1D WCS and the HDUList needs to be created from scratch. For this reason, I don't see the case for both of these existing - we need a 'tabular' reader/writer that contains the correct logic for headers (and therefore WCS). Theres also the related issue of where header data should be expected to be stored, which is not consistant amongst the reader/writers (#1102)

I can get this working to solve the case of 'I have a Spectrum1D without a WCS, and I want to stuff it in a fits file with a primary header', but I think there are some further concerns with this that need to be discussed.

@dhomeier
Copy link
Contributor

dhomeier commented Nov 20, 2023

wcs1d-fits is similar and hacks around this limitation by creating an HDUList from a wcs object. It then allows you to choose what extension this WCS header (and other metadata on your Spectrum1D) is written to, but the table that encodes

I probably would not call the way wcs1d-fits is handling this a hack – as it is writing header and data to an ImageHDU, it simply can (and by default does) write everything together to the PrimaryHDU, so there is normally no concern about having to store anything in a different HDU to begin with.

flux/wave/uncert is stored in a non-0th (in fact, i think its ALWAYS the 1st since Spectrum1D.write means writing one spectrum, which belongs in one extension...). It goes through the Table machinery to convert Spectrum1d into a BinTableHDU, but doesn't use Table.write like the simply tabular-fits does.

I am not sure I can follow you here – are you still referring to the wcs1d writer? This does not make use of the Table interface at all (nor do any of the readers), and it also does not create any table-type HDUs. Other components like uncertainty or mask are written to additional ImageHDUs; as there is no straightforward or efficient way to encode these in a single image array, a Spectrum1D in this case is stored in multiple extensions.

This PR is basically the same thing, but it is for the case where there is no 1D WCS and the HDUList needs to be created from scratch. For this reason, I don't see the case for both of these existing - we need a 'tabular' reader/writer that contains the correct logic for headers (and therefore WCS). Theres also the related issue of where header data should be expected to be stored, which is not consistant amongst the reader/writers (#1102)

If you are talking about the new loader from this PR and the old tabular_fits loader, I agree there should be no need to have both of them, except that it may be easier to provide backwards compatibility for a transition time by keeping the old writer available. Not sure if a tabular reader/writer actually needs to be able to handle WCS logic in the header – the only case I can imagine where this could occur in a tabular Spectrum1D would be if there existed a pure spatial/celestial WCS in addition. But that could probably be simply passed through without any processing.
The issue of a consistent scheme for storing header info in Spectrum1D.meta remains, and certainly both writers should ultimately follow what is concluded in #617 (comment).
To provide some background: when I initially wrote wcs1d_fits_writer the design goal was simply to be able to write a format that the already existing wcs1d_fits_loader would expect and be able to read, and ideally be able to fully roundtrip on read/write. That loader was storing the header in spectrum.meta['header'], and there was no obvious majority for changing either this or the method used in tabular_fits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants