
JP-3584: Use rolling window median for TSO outlier detection #8473

Open · wants to merge 4 commits into master
Conversation

@emolter (Contributor) commented May 8, 2024

Resolves JP-3584

Closes #8378

This PR updates the outlier detection step to use a rolling median instead of a flat median for all TSO modes, such that real time variability in the data is less likely to be flagged as outliers. The number of integrations that comprise the window for the rolling median can be specified by an input parameter to the step.
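For readers unfamiliar with the technique, here is a minimal sketch of a rolling median along the integration axis (illustrative only; the function name and the use of scipy are assumptions, not the step's actual implementation):

    from scipy.ndimage import median_filter

    def rolling_median_cube(cube, window):
        # cube: ndarray of shape (nints, ny, nx); window: number of
        # integrations in the rolling window.
        # A size-(window, 1, 1) median filter takes the median over
        # neighboring integrations independently at each pixel; edges
        # use scipy's default "reflect" boundary handling.
        return median_filter(cube, size=(window, 1, 1))

Each integration is then compared against the rolling median at its own time index rather than against one flat median of the whole exposure, so slow real variability is less likely to be flagged.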

Checklist for PR authors (skip items if you don't have permissions or they are not applicable)

  • added entry in CHANGES.rst within the relevant release section
  • updated or added relevant tests
  • updated relevant documentation
  • added relevant milestone
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below.
    How to run regression tests on a PR
  • All comments are resolved
  • Make sure the JIRA ticket is resolved properly


codecov bot commented May 8, 2024

Codecov Report

Attention: Patch coverage is 92.66055% with 8 lines in your changes missing coverage. Please review.

Project coverage is 58.05%. Comparing base (781e0e0) to head (8899cda).
Report is 10 commits behind head on master.

Current head 8899cda differs from pull request most recent head 3070533

Please upload reports for the commit 3070533 to get more accurate results.

Files                                             Patch %   Lines
jwst/outlier_detection/outlier_detection_tso.py   91.78%    6 Missing ⚠️
jwst/outlier_detection/outlier_detection.py       93.33%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8473      +/-   ##
==========================================
+ Coverage   57.93%   58.05%   +0.11%     
==========================================
  Files         387      388       +1     
  Lines       38839    38912      +73     
==========================================
+ Hits        22502    22589      +87     
+ Misses      16337    16323      -14     


@emolter (Contributor, Author) commented May 12, 2024

Regression tests started here.

Marking this ready for review even though I still need to figure out why the readthedocs build is failing.

emolter marked this pull request as ready for review May 12, 2024 19:34
emolter requested a review from a team as a code owner May 12, 2024 19:34
@emolter (Contributor, Author) commented May 13, 2024

The only regtest failure was also present here, so I don't think this PR caused that issue.

However, I seem to have introduced a new unit test failure with the last small bugfix. I will track that down.

@emolter (Contributor, Author) commented May 14, 2024

New regtest run started here for code changes that address a comment from @braingram. The imaging mode is still refactored, but the new version should preserve the way memory is used. Let me know whether the refactor is desired - some of the refactoring is definitely extraneous to the TSO changes and could be reverted if not.

emolter added this to the Build 11.0 milestone May 14, 2024
@braingram (Collaborator) commented:
> New regtest run started here for code changes that address a comment from @braingram. The imaging mode is still refactored, but the new version should preserve the way memory is used. Let me know whether the refactor is desired - some of the refactoring is definitely extraneous to the TSO changes and could be reverted if not.

Thanks for putting this together. One general question I have is: what does the refactoring of the non-TSO modes provide? If it's not required for the TSO changes, I think it makes sense for it to be a separate PR in which the addressed issues and improvements are described.

@emolter (Contributor, Author) commented May 14, 2024

I agree with you regarding some aspects of the additional refactor, specifically the modelwise_operation decorator. I started working on it when I misunderstood exactly what was needed for the TSO modes, and then didn't revert it when I realized the TSO modes needed something different. I do think the decorators could be a useful way to ensure that memory is handled responsibly when running certain operations, and I'd be curious to hear your thoughts on that. However, it's a solution to a problem that doesn't exist yet, and may never exist.

It does feel a bit strange to revert the use of timewise_operation in the imaging mode, because it is needed for the TSO data and its use in both places avoids substantial duplication. But I am happy to do so if that is preferred.

I think splitting the imaging mode into its own class and inheriting from a base class makes sense to include in this PR because imaging, spec, and now TSO all use similar but distinct data processing workflows. I don't know much about the coronagraphic modes but it's possible that coronagraphy might need its own, very similar, workflow as well - this is something I have been meaning to ask @hbushouse.

@braingram (Collaborator) commented:
Thanks for the detailed response!

> I agree with you regarding some aspects of the additional refactor, specifically the modelwise_operation decorator. I started working on it when I misunderstood exactly what was needed for the TSO modes, and then didn't revert it when I realized the TSO modes needed something different. I do think the decorators could be a useful way to ensure that memory is handled responsibly when running certain operations, and I'd be curious to hear your thoughts on that. However, it's a solution to a problem that doesn't exist yet, and may never exist.

Thanks for the clarification. I'd be happy to discuss this.

> It does feel a bit strange to revert the use of timewise_operation in the imaging mode, because it is needed for the TSO data and its use in both places avoids substantial duplication. But I am happy to do so if that is preferred.

Perhaps we could discuss this also, as I'm still struggling to see how the get_sections api helps with the TSO data. As an example, for non-TSO data, if an association containing NIRCam images is fed into outlier detection (with the default in_memory=False and resample_data=True), the drizzled_models produced by the resampling will be a ModelContainer containing filenames. When these are passed to create_median, the get_sections api is used to limit the number of in-memory models to 1 at a time while still being able to take a median across all models. If the input to create_median is instead a container of models (not filenames), the function becomes very inefficient, as each model is cloned for each section.

Is the input TSO data a 3D cube? If so, the entirety of the input will be read on the first call to datamodels.open. calwebb_tso3 does transform the input to a container of 2D images before passing them to outlier_detection (do you know why it doesn't pass the cube? let's assume that can be changed as part of this improvement). Since the entire cube is in memory (and is not resampled), does it make sense to compute medians on slices from the cube (based on the size of the rolling window)? This should be simpler and more efficient than trying to use the get_sections api.
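As an illustration of the slicing idea above, a sketch over an in-memory (nints, ny, nx) cube (names are hypothetical, not the step's API):

    import numpy as np

    def window_medians(cube_data, window):
        # Median image for each integration, taken over a window of
        # neighboring integrations; the window is clipped at the ends
        # of the exposure rather than padded.
        nints = cube_data.shape[0]
        half = window // 2
        medians = np.empty_like(cube_data)
        for i in range(nints):
            lo = max(0, i - half)
            hi = min(nints, i + half + 1)
            medians[i] = np.median(cube_data[lo:hi], axis=0)
        return medians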

> I think splitting the imaging mode into its own class and inheriting from a base class makes sense to include in this PR because imaging, spec, and now TSO all use similar but distinct data processing workflows. I don't know much about the coronagraphic modes but it's possible that coronagraphy might need its own, very similar, workflow as well - this is something I have been meaning to ask @hbushouse.

Splitting up the modes into sub-classes sounds great to me. I'm much less familiar with the non-image paths in this step and have been mostly focused on trying to improve the performance. I certainly do not mean to derail this PR but hopefully some of my comments have been helpful.

@emolter (Contributor, Author) commented May 15, 2024

After a long discussion with @braingram about existing memory issues with calwebb_tso3 and calwebb_image3, it's clear that this PR is attempting to do too much. It's trying to use, and make more generic, some memory optimizations that are not necessarily working correctly anyway, e.g. this issue. The proposed path forward for this JIRA ticket is to make a very straightforward change to support the rolling median calculation, without worrying about memory at all. This requires undoing a lot of the changes in this PR.

Then for the future, in order to implement the above-mentioned issue in a way that will also support TSO data, I'll look into what kinds of large datasets might need to be processed through this step.

@emolter (Contributor, Author) commented May 16, 2024

Yet another regtest run started here.

In this version the only refactors should be splitting some chunks of the imaging pipeline flow into their own functions to be re-used by outlier_detection_tso. I think I fixed all the other issues @braingram brought up too and I'm going to resolve those comments now.

@braingram (Collaborator) commented:
Thanks! Please do consider all of my previous comments addressed. The code changes look good to me and feel free to ping me when the regtests finish if a review would be helpful.

emolter requested a review from hbushouse May 17, 2024 13:06
@emolter (Contributor, Author) commented May 17, 2024

All the regression test failures seem to be unrelated. This is ready for your review @hbushouse.

You may wonder why there aren't any expected regtest failures for such a change, which should affect all data processed through calwebb_tso3. The reason is that there are no regtest datasets with more than 25 integrations (the default rolling window), which means that, as written, the code reverts to using a simple median.
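A sketch of that fallback, assuming the check looks something like the following (names are hypothetical; rolling_median_cube is the earlier sketch):

    import numpy as np

    def median_for_outliers(cube_data, rolling_window=25):
        # With no more integrations than the window, the rolling median
        # degenerates to a flat median, so keep the old behavior.
        if cube_data.shape[0] <= rolling_window:
            flat = np.median(cube_data, axis=0)
            return np.broadcast_to(flat, cube_data.shape)
        return rolling_median_cube(cube_data, rolling_window)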

Another note: I did check that the unit test I wrote fails for the simple median but succeeds for the rolling median. I think the fake test data are therefore similar to the real test cases where this behavior was seen.

perrygreenfield self-requested a review May 21, 2024 20:34
@perrygreenfield (Contributor) commented:
I think that all references to the term "drizzle" in outlier_detection_tso.py should be removed, since there is no drizzling involved. I realize keeping the term makes it easier to reuse code from the original source, but it is confusing to see the odd references, and the methods and functions within this file shouldn't use the term. As for the use of the build_driz_weight function from resample_utils.py, that name is unfortunate (I don't think it needed that name in that file either), but you can change the import to rename it to simply build_weight.
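The suggested rename is just an import alias, e.g. (assuming the function's current location under jwst.resample):

    from jwst.resample.resample_utils import build_driz_weight as build_weight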

@perrygreenfield (Contributor) commented:
In looking at things, it seems that this aspect could be handled entirely by your PR: change calwebb_tso3.py to simply supply a cube model to the outlier_detection step, and make the necessary changes to the code to deal with that. It should simplify the outlier_detection_tso.py code. What I'm not entirely sure about is the outlier_detection.py code. Superficially it seems that a small change to the logic should permit a ModelCube instance as an argument, but I'm not sure whether there are implied association-related attributes that must be dealt with in the rest of the step. Looking at the code, it doesn't seem to be needed by anything, but I may be missing something. Brett can correct me if I'm wrong.

@emolter (Contributor, Author) commented May 23, 2024

> In looking at things, it seems that this aspect could be handled entirely by your PR: change calwebb_tso3.py to simply supply a cube model to the outlier_detection step, and make the necessary changes to the code to deal with that. It should simplify the outlier_detection_tso.py code. What I'm not entirely sure about is the outlier_detection.py code. Superficially it seems that a small change to the logic should permit a ModelCube instance as an argument, but I'm not sure whether there are implied association-related attributes that must be dealt with in the rest of the step. Looking at the code, it doesn't seem to be needed by anything, but I may be missing something. Brett can correct me if I'm wrong.

@perrygreenfield if I understand what you're suggesting, it's to remove all the logic in calwebb_tso3 prior to the outlier detection step that builds a ModelContainer. The following is what is currently in there:

            # FIXME OutlierDetectionStep can accept a cube input and will
            # unpack it as a container which looks almost identical to the below
            # code. There are some differences:
            # - the inputs here are SlitModels
            # - outlier detection will generate drizzle weight during the unpack
            #   the below code does not
            # - the code below takes a "bitwise or" whereas outlier detection
            #   assigns over the dq array (I think this is the same as I expect
            #   outlier detection to only add dq flags).

            # Perform regular outlier detection

            input_2dmodels = ModelContainer()
            for i in range(cube.data.shape[0]):
                # convert each plane of data cube into its own array
                image = datamodels.ImageModel(data=cube.data[i],
                                              err=cube.err[i], dq=cube.dq[i])
                image.update(cube)
                image.meta.wcs = cube.meta.wcs
                input_2dmodels.append(image)

            self.log.info("Performing outlier detection on input images ...")
            input_2dmodels = self.outlier_detection(input_2dmodels)

            # Transfer updated DQ values to original input observation
            for i in range(cube.data.shape[0]):
                # Update DQ arrays with those from outlier_detection step
                cube.dq[i] = np.bitwise_or(cube.dq[i], input_2dmodels[i].dq)

That seems like a reasonable idea, especially given the long FIXME comment. Then, you are suggesting that the _convert_inputs() method be bypassed inside outlier_detection for TSO data only. Is that right? If so, then I agree there should be no problems in outlier_detection.py, because we would only be bypassing _convert_inputs for TSO data and handling those data completely separately. I may need to modify, or make a new version of, the detect_outliers() method, which at present I have re-used.

Is everyone comfortable with the differences named here, i.e., that outlier detection still generates a weight even though drizzling is not actually done, and that the bitwise_or is removed in favor of setting the dq flags entirely within the step?
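To make the DQ difference concrete, a toy comparison of the two update styles (the flag value and array shapes are illustrative, not the pipeline's):

    import numpy as np

    OUTLIER = 16  # illustrative DQ bit, not necessarily the pipeline's value

    cube_dq = np.zeros((3, 8, 8), dtype=np.uint32)  # stand-in for cube.dq
    step_dq = np.zeros_like(cube_dq)
    step_dq[1, 4, 4] = OUTLIER                      # the step flagged a pixel

    # Style currently in calwebb_tso3: OR the step's output back into the cube.
    merged = np.bitwise_or(cube_dq, step_dq)

    # Style if the cube is handled inside the step: set bits in place.
    cube_dq |= step_dq

    # Identical as long as the step only ever adds DQ bits.
    assert np.array_equal(merged, cube_dq)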

@melanieclarke (Contributor) commented:

Just to note - there is a related ticket (JP-3441, #8009) that requests the ability to directly run the outlier detection step on a calints file, outside the pipeline. If cube handling is moved inside the step, that should incidentally solve the other issue as well.

@perrygreenfield (Contributor) commented:
@emolter, yes, I think so. If it can be made to work, it would be much more straightforward, and work better with Brett's new ModelContainer replacement. I do worry a bit about unnoticed snags, but something like that should work. Right now it seems to go through a lot of unnecessary contortions to fit the current scheme. There isn't much code in detect_outliers anyway and perhaps the blot stuff can just be ignored.

@braingram (Collaborator) commented:
I'm to blame for the long comment. I still don't fully understand what's going on, but looking at it now I'm confused by the weights :-/ For the default ivm weight type, this should use the var_noise. Does this exist for the cubes? If so, is the var_noise passed on to the models during the conversion to a ModelContainer, in calwebb_tso3 or in outlier_detection? If not, it looks like the wht is effectively just a mask for DO_NOT_USE (1 for usable pixels, 0 for others). I think I must be missing something, but will this interfere with the outlier detection?

@emolter (Contributor, Author) commented May 23, 2024

> I'm to blame for the long comment. I still don't fully understand what's going on, but looking at it now I'm confused by the weights :-/ For the default ivm weight type, this should use the var_noise. Does this exist for the cubes? If so, is the var_noise passed on to the models during the conversion to a ModelContainer, in calwebb_tso3 or in outlier_detection? If not, it looks like the wht is effectively just a mask for DO_NOT_USE (1 for usable pixels, 0 for others). I think I must be missing something, but will this interfere with the outlier detection?

I can confirm that the weights seem to just be the DQ mask, and the var_noise is not passed into the step at all, neither during calwebb_tso3 nor within _convert_inputs(). So I suppose it will indeed make a difference to bypass both, because if a CubeModel that has var_noise is passed into build_driz_weight(), then those weights will get applied. I would have to think through whether this behavior would be acceptable or correct for outlier detection - maybe someone knows off the top of their head.
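For reference, a sketch of the two weighting schemes being contrasted; this is a simplified stand-in, not build_driz_weight itself, and the DQ bit value is illustrative:

    import numpy as np

    DO_NOT_USE = 1  # illustrative DQ bit for unusable pixels

    def weight_from_dq(dq):
        # What the step effectively computes today:
        # 1 for usable pixels, 0 where DO_NOT_USE is set.
        return np.where(dq & DO_NOT_USE, 0.0, 1.0)

    def weight_ivm(var, dq):
        # What an inverse-variance ("ivm") weight could look like if the
        # variance were carried through to the step.
        good = ((dq & DO_NOT_USE) == 0) & (var > 0)
        w = np.zeros_like(var, dtype=float)
        w[good] = 1.0 / var[good]
        return w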

@perrygreenfield (Contributor) commented:
> I can confirm that the weights seem to just be the DQ mask, and the var_noise is not passed into the step at all, neither during calwebb_tso3 nor within _convert_inputs(). So I suppose it will indeed make a difference to bypass both, because if a CubeModel that has var_noise is passed into build_driz_weight(), then those weights will get applied. I would have to think through whether this behavior would be acceptable or correct for outlier detection - maybe someone knows off the top of their head.

I want to make sure I understand the situation. In the past behavior, was it just the DQ mask? And with this change, would it use var_noise instead of the DQ mask? In any case, is it possible the old behavior is not what the instrument team desired?

@emolter (Contributor, Author) commented May 24, 2024

I don't know what the instrument teams initially desired, @perrygreenfield; all I can say is that it's a bit silly to call build_driz_weight() just to return a copy of the DQ mask, but that does indeed seem to be what is happening (for the imaging modes too), because _convert_inputs() doesn't save the var_noise. My plan is to simply retain the current behavior, because this ticket has nothing to do with issues with the weights. But it might be worth following up in the future.
