Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handled Spectrum1D images across classes #144

Merged
merged 12 commits into from
Nov 29, 2022

Conversation

ojustino
Copy link
Contributor

@ojustino ojustino commented Sep 30, 2022

I made changes to Background and the Trace classes in particular.

I created an __init__() a private method for Trace and put the check for Spectrum1D there so it can be propagated out to the children classes through super(). I'm not sure if I should be using __post_init__() instead.

I didn't realize that Spectrum1D is a child class of NDData, so most of the checks for the latter in Background and the extraction classes already captured them. I added code in Background to fix #129 and made the docstrings on image typing consistent across all classes.

I hope this fixes #130, but I wasn't able to reproduce the strange behavior with units reported there.

@ojustino ojustino marked this pull request as draft September 30, 2022 14:22
@ojustino ojustino force-pushed the accept-spectrum1d branch 2 times, most recently from ecce188 to 1902445 Compare September 30, 2022 14:30
@codecov
Copy link

codecov bot commented Sep 30, 2022

Codecov Report

Merging #144 (90b6460) into main (2a63e7d) will increase coverage by 2.86%.
The diff coverage is 91.37%.

@@            Coverage Diff             @@
##             main     #144      +/-   ##
==========================================
+ Coverage   72.60%   75.46%   +2.86%     
==========================================
  Files           9        9              
  Lines         646      693      +47     
==========================================
+ Hits          469      523      +54     
+ Misses        177      170       -7     
Impacted Files Coverage Δ
specreduce/background.py 88.29% <83.33%> (-0.99%) ⬇️
specreduce/extract.py 95.30% <92.15%> (+6.98%) ⬆️
specreduce/core.py 74.35% <95.65%> (+27.30%) ⬆️
specreduce/tracing.py 93.54% <100.00%> (+0.78%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ojustino ojustino marked this pull request as ready for review September 30, 2022 14:40
@ojustino
Copy link
Contributor Author

As far as I can tell, this approach works with images that are bare numpy arrays, along with Spectrum1D, NDData, and CCDData objects. It doesn't work with images that are Quantity objects due to because various places in our code compare or do arithmetic with the image and another unit-less array.

Should the Quantity scenario be fixed or should we mandate that users provide an NDData-inheriting object if they want their image to have units?

Copy link
Member

@kecnry kecnry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few questions on the implementation and then I will try to robustly test and see if I can reproduce any of the previous units issues.

All of the extra logic on how to handle the unit makes me wonder if we should only support passing the unit in with the image (as either a NDData or Spectrum1D or quantity array) and removing the unit input parameter entirely. Any thoughts on that?

specreduce/extract.py Outdated Show resolved Hide resolved
specreduce/background.py Outdated Show resolved Hide resolved
specreduce/tracing.py Outdated Show resolved Hide resolved
@ojustino ojustino force-pushed the accept-spectrum1d branch 2 times, most recently from cfe8bb5 to 945fb1c Compare October 3, 2022 21:41
@ojustino
Copy link
Contributor Author

ojustino commented Nov 8, 2022

I added a parser to all trace, extraction, and background classes so they can handle input Spectrum1D and Quantity type images in addition to their existing support for CCDData, NDData, and numpy arrays. Most parsers are the same, though HorneExtract's is more stringent. I could try to iron out the differences if we would rather have all classes inherit the parser from SpecreduceOperation.

The image attribute is now always of type Spectrum1D. I went this route because Spectrum1D has spectral axis information that would be lost with any other type. Inputs that lack spectral axes are just given an arange of values in pixel units.

Some questions:

  1. Before, obj.image was stored in our classes the way it was introduced as an argument. Is it an annoying user experience to have it converted to Spectrum1D?
  2. As a consequence of the image attribute's type change, obj.image is now a conglomerate of information instead of just the flux. This makes getting to the flux for plotting, arithmetic, etc. a bit more cumbersome for those who don't know Spectrum1D (e.g., from plt.imshow(obj.image) before to plt.imshow(obj.image.data) now). Are we OK with that?
  3. Would this be the time update _to_spectrum1d_pixels() so it takes the predefined spectral axis from the image attribute instead of creating a spectral axis of its own?
  4. Should flux units for images that lack them continue to be counts or switch to blank units (u.Unit())? Most classes currently default to counts, though HorneExtract defaults to blank units.

I will push some new tests of image input typing soon. I'll wait on adjusting the documentation until it's clear that the new approach is acceptable.

@ojustino
Copy link
Contributor Author

ojustino commented Nov 8, 2022

Looks like I have some failures to address before I introduce the new tests. Thoughts on the new approach to typing are still welcome.

Copy link
Member

@kecnry kecnry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The image attribute is now always of type Spectrum1D. I went this route because Spectrum1D has spectral axis information that would be lost with any other type. Inputs that lack spectral axes are just given an arange of values in pixel units.

I think this is a good idea and makes sense with the goal to integrate more tightly with specutils... but share your concerns below.

Before, obj.image was stored in our classes the way it was introduced as an argument. Is it an annoying user experience to have it converted to Spectrum1D?
As a consequence of the image attribute's type change, obj.image is now a conglomerate of information instead of just the flux. This makes getting to the flux for plotting, arithmetic, etc. a bit more cumbersome for those who don't know Spectrum1D (e.g., from plt.imshow(obj.image) before to plt.imshow(obj.image.data) now). Are we OK with that?

An alternative, I suppose, might be to store the user-provided object and then do the conversion (_parse_image) as part of a (cached) property. That way both formats are accessible to the user, but internally we can rely on a consistent type when calling the property.

@@ -86,17 +122,18 @@ def _to_trace(trace):
raise ValueError('trace_object.trace_pos must be >= 1')
return trace

self._parse_image()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having to do this to coerce the input is quite related to the discussion in #126 (comment) (nothing that needs to hold up this PR, but something to keep in mind in that discussion and the implementation here might need to be modified down the road depending on any decisions there).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I tried your idea of storing image as a cached_property, but I couldn't figure out a way to do it without either a) changing the name of the image argument, which I think would be unintuitive, or b) including a dummy argument with InitVar to be converted to image later on. The dummy argument either shows up in the automatic docstring (confusing to users) or isn't passed to the __init__.

Maybe there's some potential crossover with #147 on this topic?

Also, it looks like cached_property requires Python ≥3.8. We currently allow 3.7, but the specutils requirements I mention in my summary below would give us an implicit requirement of 3.8, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we haven't yet, we should officially update our python requirement to ≥3.8. especially since it's implicitly required by dependencies.

specreduce/extract.py Outdated Show resolved Hide resolved
@ojustino
Copy link
Contributor Author

I had more to do since my last messages than I thought:

  • I created the _ImageParser to serve as an ancestor class that houses the code for converting user-provided images to consistently formatted Spectrum1D objects in all the operations. I'm open to other design suggestions on inheritance if this isn't ideal.
  • I made a lot of modifications to Background and existing tests so they play better with the switch to Spectrum1D images.
  • I wrote new tests that specifically cover the image parsing process. Images with the same flux, uncertainty, etc. but in different formats should be exactly the same once they've gone through _parse_image(). Image objects that lack certain attributes (e.g., CCDData objects with mask == None or that numpy arrays always lack an uncertainty) should see them added to the parsed image object in a consistent manner.
  • I added a requirement of specutils>=1.7 to avoid having NaNs appear in the spectral axis of the result when you use Spectrum1D.subtract() on an image that's been through _parse_image(), such as in Background.sub_image(). We also need asdf_astropy>=0.2.0 for that, but it looks like the specutils requirement gives us that for free.

The notebooks will need edits, but I would like to see #128 merged first to avoid merge conflicts.

Questions:

  1. Should Background.bkg_array still be a numpy array given the rest of the transition to Spectrum1D?
  2. Should _parse_image() be called in __call__ for the *Extract classes? When it happens there, you can never actually retrieve the parsed version of the original image because hrn.image is the user's unparsed input to hrn = HorneExtract(image, trace) and the result of hrn() is the extracted image as a Spectrum1D object. Does that matter?
  3. Should flux units for images that lack them be in counts or blank units (u.Unit())? Most classes currently default to counts, though HorneExtract defaults to blank units.

External dependency:

  • We need functionality from Implent NotImplemented case in Spectrum1D subtraction specutils#988 for subtraction of a Background object to work on Spectrum1D objects. This is handled in Background.__rsub__, but Spectrum1D.__sub__ has to give it the chance to work by passing the operation. I have commented out a related test in the meantime.

specreduce/background.py Outdated Show resolved Hide resolved
Copy link
Member

@kecnry kecnry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and works in the cases I've tested. This probably counts as an API break since the output types have changed, but in the past we made similar breaking changes in minor releases since we're still considered in active development and not adhering to semver rules... but just something to keep in mind.

@kecnry
Copy link
Member

kecnry commented Nov 17, 2022

Should Background.bkg_array still be a numpy array given the rest of the transition to Spectrum1D?

I think so since it's literally called array, but I also am not opposed to removing it (maybe with a deprecation period) and replacing it with something named more appropriately to hold a Spectrum1D object (either in this PR or down the road).

Should _parse_image() be called in __call__ for the *Extract classes? When it happens there, you can never actually retrieve the parsed version of the original image because hrn.image is the user's unparsed input to hrn = HorneExtract(image, trace) and the result of hrn() is the extracted image as a Spectrum1D object. Does that matter?

I'm a bit torn on this one. I proposed the cached property idea, but that seemed to not play nicely with dataclasses. Perhaps if we decide to abandon dataclasses (see #126 (comment)), then that is a direction we could take in the future (in which case we could have hrn.orig_image be the original input and then hrn.image be a cached property where the parsing takes place and is made available to __call__).

Should flux units for images that lack them be in counts or blank units (u.Unit())? Most classes currently default to counts, though HorneExtract defaults to blank units.

I think this is out-of-scope for this PR, but maybe worth opening a follow-up issue so that we can discuss and decide. I agree that we should aim to be self-consistent.

Copy link
Contributor

@tepickering tepickering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impressive amount of work here! i advocated early on for the adoption of NDData as the standard data structure to use internally. this PR takes that even further by adopting a more appropriate subclass which i 💯 agree with. i appreciate the clean-up by putting the image parsing all in one place. i have a few minor comments to address and note that there are conflicts that need to be resolved before merging. coverage looks good and the tests pass so once that's taken care of, it's gtg to merge. thanks again for all the hard work on this!

CHANGES.rst Outdated
@@ -7,6 +7,10 @@ New Features
API Changes
^^^^^^^^^^^

- All operations now accept Spectrum1D and Quantity-type images. All accepted
image types are now processed internally as Spectrum1D objects. [#146]
-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i assume this will be filled in a bit more before merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can include that operations' image attributes will now always be Spectrum1D objects. Was there anything else you were thinking should be here? I thought many of the changes were too behind the scenes to make the changelog.

setup.cfg Outdated Show resolved Hide resolved
specreduce/core.py Outdated Show resolved Hide resolved
image with 2-D spectral image data
trace_object: Trace
image : `~astropy.nddata.NDData`-like or array-like
Image with 2-D spectral image data. Assumes cross-dispersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is confusing since crossdisp_axis can still be provided as an input...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but the extraction operations function the same way. I think it's a remnant of the desire to future-proof them for a time when we allow for more flexibility in axis arrangement. Would you agree that a solution to this is outside this PR's scope?

@@ -201,6 +210,7 @@ def one_sided(cls, image, trace_object, separation, **kwargs):
crossdisp_axis : int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this parameter being used or is the assumption hard-coded as implied in the previous docstring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cls should be of type Background, so it follows the docstring's implication once the new instance is created at the end of the method.

specreduce/extract.py Outdated Show resolved Hide resolved
@ojustino
Copy link
Contributor Author

@tepickering, I believe I addressed everything you brought up that's in scope here. The failing test is due to an upstream issue with Astropy that looks like it will be resolved soon.

I do feel like this is ready for a merge... but also feel it's bad juju to merge one's own pull request with a failing test and changes requested, so I'll defer to others' judgment or until after the holiday.

@ojustino
Copy link
Contributor Author

OK, I will merge now to keep things moving.

@ojustino ojustino merged commit 0192706 into astropy:main Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants