
Add some more flexibility to coadd: output array specification #351

Open · wants to merge 20 commits into main
Conversation

@keflavich (Contributor) commented Mar 13, 2023

This enables memmap'd output, which will be needed for very large (greater-than-memory) output cubes.

EDIT: it also extends reproject_and_coadd to three dimensions.
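
For illustration, a minimal sketch of the intended usage, assuming the `output_array`/`output_footprint` keywords this PR introduces (the shapes, file names, and input variables here are hypothetical, not taken from the PR itself):

```python
import numpy as np
from reproject import reproject_interp
from reproject.mosaicking import reproject_and_coadd

# Disk-backed output buffers, so the mosaic never has to fit in memory.
shape_out = (500, 2048, 2048)  # hypothetical (nchan, ny, nx) target cube
output = np.memmap("mosaic.dat", dtype=np.float32, mode="w+", shape=shape_out)
footprint = np.memmap("footprint.dat", dtype=np.float32, mode="w+", shape=shape_out)

array, fp = reproject_and_coadd(
    input_data,                        # list of (array, WCS) pairs, assumed defined
    target_wcs,                        # target WCS, assumed defined
    shape_out=shape_out,
    reproject_function=reproject_interp,
    output_array=output,
    output_footprint=footprint,
)
```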

codecov bot commented Mar 13, 2023

Codecov Report

Attention: 33 lines in your changes are missing coverage. Please review.

Comparison: base (1a91216) at 93.60% vs. head (0054941) at 90.60%.

Files Patch % Lines
reproject/mosaicking/coadd.py 73.33% 20 Missing ⚠️
reproject/mosaicking/subset_array.py 63.33% 11 Missing ⚠️
reproject/utils.py 50.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #351      +/-   ##
==========================================
- Coverage   93.60%   90.60%   -3.01%     
==========================================
  Files          25       25              
  Lines         892      947      +55     
==========================================
+ Hits          835      858      +23     
- Misses         57       89      +32     


@astrofrog (Member) left a comment


This is useful, thanks! Just some comments below; also, be sure to include some tests.

Four review comments on reproject/mosaicking/coadd.py (outdated, resolved).
@keflavich (Contributor Author)

I'm a little stymied on this:
ValueError: Chunks do not add up to shape. Got chunks=((4,), (4, 100, 101), (4, 100, 101)), shape=(4, 174, 173)
It looks like block_size isn't playing nicely with the different-shaped output; I think fixing that relies on your WIP PR, @astrofrog?
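
For reference, dask requires the chunk sizes along each axis to sum to that axis's length, which these don't (along axis 1, 4 + 100 + 101 = 205 ≠ 174). A minimal reproduction of the error:

```python
import dask.array as da
import numpy as np

# The per-axis chunk sizes must sum to the axis length; along axis 1,
# 4 + 100 + 101 = 205 != 174, so this raises the ValueError quoted above.
da.from_array(
    np.empty((4, 174, 173)),
    chunks=((4,), (4, 100, 101), (4, 100, 101)),
)
```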

@keflavich (Contributor Author)

@astrofrog I have some questions now:

The loop in reproject_and_coadd is presently not lazy, and not possible to make lazy, correct? We need your daskified PR to make it lazy?

I think I'm going to move the write-to-big-array step into the parent loop, because otherwise everything still has to be held in memory.

@keflavich (Contributor Author)

Ahh, there's a good reason for that - background matching! Yikes.

@keflavich (Contributor Author)

So, as is, this does coadding on a dataset-by-dataset basis - which, for cubes, means a cube-by-cube basis. It may be more efficient to do it on a plane-by-plane basis, but that seems like a lot of additional refactoring work.

@keflavich (Contributor Author)

There is a huge efficiency boost to be gained if we can reproject directly into the output array, but that results in overwriting the individual values instead of adding to them.

@keflavich (Contributor Author)

@astrofrog I don't see any way to do this, but do you have ideas? Basically, instead of map_coordinates(..., output=array) writing directly into memory, it would do output += map_coordinates(...). That looks problematic at the lowest levels, and I don't know whether it would work at all for other reprojection methods.
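
A toy illustration of the two behaviours (the coordinates and shapes are made up; only the contrast between `output=` and `+=` matters):

```python
import numpy as np
from scipy.ndimage import map_coordinates

data = np.random.random((200, 200))            # one input image
coords = np.indices((174, 173)).astype(float)  # toy target->input pixel mapping
output = np.zeros((174, 173))

# (a) Write straight into the buffer: no temporary array, but each input
# simply overwrites whatever a previous input left there.
map_coordinates(data, coords, output=output)

# (b) Accumulate: earlier contributions survive, but each input needs a
# temporary array the size of the output (or of the current chunk).
output += map_coordinates(data, coords)
```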

This sacrifices flexibility even further in order to improve speed and robustness and reduce memory use. In the case where you don't want to match backgrounds, this approach just loads the unweighted data into one array and the footprint into another, and divides them at the end, as sketched below. The robustness benefit is that we can flush to disk granularly.
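
A sketch of that no-background-matching path, with illustrative names only (`inputs`, `reproject_function`, `target_wcs`, and `shape_out` are assumed defined; the two accumulators could equally be memmaps):

```python
import numpy as np

mosaic = np.zeros(shape_out)    # running sum of footprint-weighted data
weights = np.zeros(shape_out)   # running sum of footprints

for array_in, wcs_in in inputs:
    reprojected, fp = reproject_function(
        (array_in, wcs_in), target_wcs, shape_out=shape_out
    )
    mosaic += np.where(fp > 0, reprojected * fp, 0)
    weights += fp
    # If mosaic/weights are memmaps, this is the point where a partial
    # result can be flushed to disk after each input.

with np.errstate(invalid="ignore"):
    final = mosaic / weights    # NaN wherever no input contributed
```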

I'm not sure this is worth pursuing further, though, because of the depth of refactoring that would be needed. Maybe this is something dask already solves well enough.

@astrofrog (Member)

I'll have a think about all this! Should be able to work on it a bit this week.

@keflavich (Contributor Author)

@astrofrog I've been using this PR in practice for a while now, and I just had a look into rebasing, but the codebase has diverged a ton, making this a rather challenging rebase. I'd like to go ahead with it, but this time make sure the changes can get merged. What's your feeling - is there anything blocking this if it's rebased?

@keflavich (Contributor Author)

OK, the rebase wasn't as bad as I thought, but there were some confusing items that I'm not sure I've resolved yet.

@astrofrog (Member)

I can try and prioritise this!

@keflavich (Contributor Author)

With the new modes (first, last, min, max), we again face duplicating code, because we need one version that tries to background-match before the final combination step and another that does not. The current approach simply breaks if match_background is off and one tries first/last/min/max. Is there a more elegant approach?
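
To make the duplication concrete, here is a hedged sketch of a streaming combine step for the non-background-matched path (all names are illustrative, not the PR's actual API):

```python
import numpy as np

def combine_inplace(mosaic, seen, new, fp, mode):
    # Fold one reprojected input into the running mosaic, in place.
    valid = fp > 0
    fresh = valid & ~seen             # pixels no earlier input touched
    mosaic[fresh] = new[fresh]        # every mode keeps the first value it sees
    both = valid & seen               # pixels needing a pairwise combine
    if mode == "last":
        mosaic[both] = new[both]
    elif mode == "min":
        mosaic[both] = np.minimum(mosaic[both], new[both])
    elif mode == "max":
        mosaic[both] = np.maximum(mosaic[both], new[both])
    # mode == "first": keep the existing value, nothing to do
    seen |= valid
```

The background-matched path cannot stream like this, because the offsets are only known after every input has been reprojected and the overlaps compared; hence the two diverging code paths.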

@astrofrog (Member)

Will investigate!

@astrofrog (Member)

To try and keep things a bit simpler I've extracted out the changes related to specifying output arrays/footprints into a separate PR: #387

@astrofrog (Member)

In terms of supporting 3D arrays (and also the median combine mode), I think things are getting complicated enough that it's worth thinking about a different approach for doing the co-adding. I'm working on an experimental re-write of reproject_and_coadd which uses dask internally and will open an alternative PR soon so that we can compare and see what works best in practice.

@astrofrog (Member) commented Sep 11, 2023

See #388 for an alternative approach.

@keflavich (Contributor Author)

@astrofrog I'm still using this branch in production. It's apparently working (verification in progress). I clearly ran out of time to test #388. Any further thoughts on which direction to take? Both PRs are big, and I don't immediately remember what the reasons are for choosing one or the other.

@astrofrog (Member)

Sorry for dropping the ball on this, I'll try and see if I can wrap things up in the next week or so.

@keflavich (Contributor Author)

My last commit adds a hack to solve this issue: radio-astro-tools/spectral-cube#900 (I should perhaps have posted that here, but it isn't obvious to me where it came from).
