
Memory-efficient NuFFT routine to predict model visibilities #229

Open
iancze opened this issue Dec 6, 2023 · 0 comments
iancze (Collaborator) commented Dec 6, 2023

Splits off NuFFT-relevant points from #224.

To address this, one should implement a NuFFT.predict(uu, vv) method that will return model visibilities for all uu and vv points, even if the uu and vv arrays would cause memory issues if tackled all at once.

A potential solution: use the lengths of uu and vv, together with some heuristic, to estimate how large an individual batch can be while still fitting under reasonable memory limits (what counts as reasonable is an open question). The routine would then split the original uu and vv arrays into these batches and predict their visibility values batch by batch. Finally, it concatenates everything and returns it, with the caller none the wiser.

  • Should use appropriate PyTorch directives (e.g. torch.no_grad()) to exclude these quantities from the autodiff calculation.
  • The documentation should make clear that this routine can't be used for backpropagation.
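A minimal sketch of this batched predict method (the max_batch heuristic, its default value, and the dummy forward body are assumptions for illustration, not the actual MPoL API):

```python
import torch

class NuFFT:
    """Stand-in for MPoL's NuFFT layer; the real forward() evaluates a
    non-uniform FFT of the model image at the (uu, vv) baselines."""

    def forward(self, uu, vv):
        # dummy evaluation so the batching logic below is runnable
        return uu + 1j * vv

    @torch.no_grad()  # detach predictions from the autodiff graph
    def predict(self, uu, vv, max_batch=100_000):
        """Return model visibilities for all (uu, vv) points, evaluated
        in batches of at most max_batch points (a hypothetical heuristic;
        the issue leaves the real memory limit open)."""
        out = [self.forward(uu[i:i + max_batch], vv[i:i + max_batch])
               for i in range(0, len(uu), max_batch)]
        return torch.cat(out)
```

Because the whole method runs under torch.no_grad(), the concatenated result carries no computation graph, so the caller gets one seamless array but cannot backpropagate through it.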

More info from #224:

If the user is doing any kind of optimization loop, then I think they will want to use something like NuFFT.forward in its current form so that autodiff can propagate derivatives efficiently. If their dataset is too large to fit in one call, then they will need to batch externally. Stochastic gradient descent handles this naturally, taking mini-batches of the dataset over several iterations within one training epoch.
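The external mini-batching described above can be sketched with a standard PyTorch training loop. Everything here is a stand-in for illustration: the linear model plays the role of the differentiable image-plus-NuFFT.forward path, and the tensors are random placeholders, not real visibility data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins: baselines and observed visibilities
# (split into real/imaginary columns).
uu = torch.randn(10_000)
vv = torch.randn(10_000)
data = torch.randn(10_000, 2)  # columns: Re(V), Im(V)

loader = DataLoader(TensorDataset(uu, vv, data),
                    batch_size=1_000, shuffle=True)

model = torch.nn.Linear(2, 2)  # dummy differentiable "forward" path
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for uu_b, vv_b, data_b in loader:            # one epoch of mini-batches
    optimizer.zero_grad()
    pred = model(torch.stack([uu_b, vv_b], dim=1))
    loss = torch.mean((pred - data_b) ** 2)  # chi^2-like data residual
    loss.backward()                          # autodiff on one small batch only
    optimizer.step()
```

Because each backward pass only sees one mini-batch, the autodiff graph stays small regardless of total dataset size, which is exactly why this path stays memory-safe without any special predict routine.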

If the user is trying to get all model visibilities at once, then they can use NuFFT.forward_mem. But they won't be able to do any kind of optimization or autodiff with these, since I think that would bring the memory issue right back into play. So the only use cases I can think of for these model visibilities are

  • DirtyImager model and residuals
  • visualizing residual visibilities directly, e.g. with 1D plots

Since we can't use forward_mem for forward (in the sense of a back-propagating neural-net layer), maybe it's better to call this method NuFFT.predict or something like that, and be clear in the docs that autodiff is disabled on this quantity.