Add some more flexibility to coadd: output array specification #351
base: main
Conversation
Codecov Report

Attention: additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #351      +/-   ##
==========================================
- Coverage   93.60%   90.60%    -3.01%
==========================================
  Files          25       25
  Lines         892      947       +55
==========================================
+ Hits          835      858       +23
- Misses         57       89       +32
==========================================
```

View full report in Codecov by Sentry.
This is useful, thanks! Just a few comments below; be sure to include some tests too.
I'm a little stymied on this:
@astrofrog I have some questions now. The loop for … I think I'm going to move the write-to-big-array step into the parent loop, because otherwise everything still has to be held in memory.
ahhh, there's good reason for that - background matching! Yikes.
So, as is, this does coadding on a dataset-by-dataset basis, which for cubes means a cube-by-cube basis. It may be more efficient to do it on a plane-by-plane basis, but that seems like a lot of additional refactoring work.
There is a huge efficiency boost to be gained if we can reproject directly into the output array, but that results in overwriting the individual values instead of adding to them.
@astrofrog I don't see any way to do this, but do you have ideas? Basically, instead of … this sacrifices flexibility even further to improve speed and robustness and to reduce memory use. In the case where you don't want to match backgrounds, this approach just loads the unweighted data into one array and the footprint into another, then divides them at the end. The robustness benefit is that we can flush to disk granularly. I'm not sure this is worth pursuing further, though, because of the depth of refactor that would be needed. Maybe this is something dask already solves well enough.
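To illustrate the accumulate-then-divide idea discussed above, here is a minimal sketch (not the actual `reproject` internals; `coadd_mean` and `reprojected_pairs` are hypothetical names). Each reprojected array is weighted by its footprint, summed into shared accumulators, and the weighted sum is divided by the total footprint at the end, which is what allows the accumulators to live on disk:

```python
import numpy as np

def coadd_mean(reprojected_pairs, shape_out):
    """Sketch of footprint-weighted mean coaddition.

    reprojected_pairs: iterable of (array, footprint) pairs, already
    reprojected onto the common output grid. Arrays are assumed finite
    wherever their footprint is positive. The two accumulators could be
    np.memmap instances to support greater-than-memory outputs.
    """
    data_sum = np.zeros(shape_out)
    weight_sum = np.zeros(shape_out)
    for array, footprint in reprojected_pairs:
        # Only accumulate where this input actually covers the output grid.
        data_sum += np.where(footprint > 0, array * footprint, 0.0)
        weight_sum += footprint
    # Pixels never covered have weight 0 and come out as NaN.
    with np.errstate(invalid="ignore"):
        result = data_sum / weight_sum
    return result, weight_sum
```

Note this only works when no background matching is requested, since matching needs all the individual reprojected arrays simultaneously.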
I'll have a think about all this! Should be able to work on it a bit this week. |
@astrofrog I've been using this PR in practice for a while now, and I just had a look into rebasing, but the codebase has diverged a ton, making this a rather challenging rebase. I'd like to go ahead with it, but this time make sure the changes can get merged. What's your feeling - is there anything blocking this if it's rebased? |
(force-pushed from a0f3111 to bde9d46)
ok the rebase wasn't as bad as I thought, but there were some confusing items that I am not sure I've resolved yet.
I can try and prioritise this! |
With the new modes (`first`, `last`, `min`, `max`) we again face duplicating code, because we need one version that tries to background-match before the final combination step and another that does not. The current approach simply breaks if `match_background` is off and one tries `first`/`last`/`min`/`max`. Is there a more elegant approach?
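For the no-background-matching path, these modes can at least be expressed as a single streaming reduction over the inputs, which avoids holding everything in memory. A hedged sketch (the function and its arguments are illustrative, not the `reproject` API):

```python
import numpy as np

def coadd_reduce(reprojected_pairs, shape_out, combine="min"):
    """Sketch of streaming first/last/min/max combination.

    Processes one (array, footprint) pair at a time, so only the
    running output and footprint need to be held (or memmapped).
    """
    output = np.full(shape_out, np.nan)
    footprint = np.zeros(shape_out)
    for array, fp in reprojected_pairs:
        valid = (fp > 0) & np.isfinite(array)
        if combine == "min":
            # fmin/fmax ignore the NaN "not yet set" placeholder.
            output = np.where(valid, np.fmin(output, array), output)
        elif combine == "max":
            output = np.where(valid, np.fmax(output, array), output)
        elif combine == "first":
            # Only fill pixels no earlier dataset has set.
            output = np.where(valid & np.isnan(output), array, output)
        elif combine == "last":
            output = np.where(valid, array, output)
        footprint += fp
    return output, footprint
```

This does not help the background-matching path, which inherently needs all reprojected arrays before combining, so some duplication may be unavoidable without a deeper refactor.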
Will investigate! |
To try and keep things a bit simpler I've extracted out the changes related to specifying output arrays/footprints into a separate PR: #387 |
In terms of supporting 3D arrays (and also the median combine mode), I think things are getting complicated enough that it's worth thinking about a different approach for doing the co-adding. I'm working on an experimental re-write of …
See #388 for an alternative approach. |
(force-pushed from 6f1950b to d1319d1)
@astrofrog I'm still using this branch in production. It's apparently working (verification in progress). I clearly ran out of time to test #388. Any further thoughts on which direction to take? Both PRs are big, and I don't immediately remember the reasons for choosing one over the other.
Sorry for dropping the ball on this, I'll try and see if I can wrap things up in the next week or so. |
…nse with background matching?
(force-pushed from d1319d1 to ba28206)
my last commit adds a hack to solve this issue: radio-astro-tools/spectral-cube#900 (should've posted that here, perhaps, but it isn't obvious to me where it came from) |
This enables memmap'd output, which will be needed for very large (greater-than-memory) output cubes.

EDIT: it also extends `reproject_and_coadd` into three dimensions.
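As a usage sketch of the memmap'd-output idea, the output and footprint can be pre-allocated on disk with `np.memmap` and then handed to the coadd call. The keyword names in the comment follow this PR's proposal and may differ from whatever API is finally merged:

```python
import os
import tempfile

import numpy as np

# Pre-allocate a small demo cube on disk; for a real greater-than-memory
# cube the shape would be far larger.
shape_out = (10, 50, 50)  # (spectral, y, x)
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "coadd.dat")
fp_path = os.path.join(out_dir, "coadd_footprint.dat")

output = np.memmap(out_path, dtype=np.float64, mode="w+", shape=shape_out)
footprint = np.memmap(fp_path, dtype=np.float64, mode="w+", shape=shape_out)

# These would then be passed to the coadd call, e.g. (hypothetical
# keywords, per this PR's proposal):
#   reproject_and_coadd(inputs, wcs_out, shape_out=shape_out,
#                       output_array=output, output_footprint=footprint)

output[:] = 0.0
output.flush()  # flush granularly so partial results survive a crash
```

Flushing after each plane or dataset is what gives the robustness benefit mentioned earlier: an interrupted run leaves a partially filled, on-disk result rather than nothing.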