
Memory efficient data input options #104

Open
iancze opened this issue Nov 18, 2022 · 0 comments
Labels
enhancement New feature or request

Comments

@iancze
Collaborator

iancze commented Nov 18, 2022

Is your feature request related to a problem or opportunity? Please describe.
We were strict about the inputs to the Gridder object, expecting that uu and vv are measured in kilolambda and have shape (nchan, nvis), and that weights also have shape (nchan, nvis).

Describe the solution you'd like
This strictness can be cumbersome, especially when working with ALMA spectral line datasets with a large number of channels. For example, the measurement set stores the baselines more efficiently in meters (so they are the same for every channel), and when the weights are the same for each channel, only one weight is stored per baseline. This means that on disk, uu, vv, and weights have shape (nvis) instead of (nchan, nvis), which is a considerable memory saving for large visibility datasets with hundreds or even thousands of channels.
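For reference, expanding baselines stored in meters into the per-channel kilolambda arrays the Gridder expects is a single broadcast, u [klambda] = baseline [m] × ν / c / 10³. A minimal sketch (the function name is hypothetical, not an existing MPoL or visread routine):

```python
import numpy as np

c = 2.99792458e8  # speed of light [m/s]

def broadcast_baselines(uu_m, vv_m, freqs_hz):
    """Convert baselines in meters, shape (nvis,), to kilolambda
    per channel, shape (nchan, nvis), via numpy broadcasting.

    Hypothetical helper for illustration only.
    """
    # (nchan, 1) * (nvis,) broadcasts to (nchan, nvis)
    scale = freqs_hz[:, np.newaxis] / c / 1e3  # meters -> kilolambda
    return scale * uu_m, scale * vv_m
```

Note this is exactly the memory blow-up at issue: the broadcast materializes the full (nchan, nvis) arrays, trading disk compactness for RAM.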

Describe alternatives you've considered
At minimum, we could port convenience routines that convert these quantities from visread to MPoL, or simply document that they exist in visread. This would make life easier for the user by keeping the file size on disk small, but the expanded arrays may still pose memory problems when doing the inference.

A more advanced option would be to adjust the Gridder, or add an alternative Gridder class, to take in the measurement set-like data products and perform the gridding operation in a memory-efficient manner. This could be helpful, but should only be worked on after we've done a proper memory profiling of a whole image synthesis procedure. It could be that the actual image optimization (and associated derivatives) is the largest bottleneck anyway.
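One way such a memory-efficient Gridder could work is to loop over channels, scaling the shared meter-baselines to kilolambda one channel at a time so the full (nchan, nvis) arrays are never materialized. A rough sketch under those assumptions (`grid_fn` stands in for the per-channel gridding operation; none of these names are existing MPoL API):

```python
import numpy as np

c = 2.99792458e8  # speed of light [m/s]

def grid_channel_by_channel(uu_m, vv_m, data, weights, freqs_hz, grid_fn):
    """Grid one channel at a time from measurement set-like inputs.

    uu_m, vv_m, weights have shape (nvis,); data has shape (nchan, nvis).
    Only one channel's kilolambda baselines exist in memory at a time.
    Hypothetical sketch, not an existing Gridder implementation.
    """
    cubes = []
    for i, nu in enumerate(freqs_hz):
        scale = nu / c / 1e3  # meters -> kilolambda for this channel
        cubes.append(grid_fn(uu_m * scale, vv_m * scale, data[i], weights))
    return np.stack(cubes)
```

Whether this loop is worth the added complexity is exactly the profiling question above: if the optimization loop dominates memory use, a streaming Gridder buys little.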

@iancze iancze added the enhancement New feature or request label Nov 18, 2022
@jeffjennings jeffjennings added this to the UML redesign milestone Apr 4, 2023
2 participants