hxu296 changed the title from "broadcast_inputs triggers tensor storange copy, peaks CUDA memory consumption" to "broadcast_inputs triggers tensor storage copy, peaks CUDA memory consumption" on Jun 26, 2023.
@hxu296 This is a historical issue: the functions were originally implemented in CUDA, and the tensor copy behavior was expected there. It no longer seems necessary. Similar functionality is provided by torch.broadcast_tensors and torch.broadcast_shapes. You may re-check the functions called after broadcast_inputs; if they operate on batched inputs, then broadcast_inputs can be safely removed.
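For reference, a minimal sketch of the suggested alternative (the shapes below are illustrative, not taken from the dataset): torch.broadcast_shapes computes the common shape without touching any storage, and torch.broadcast_tensors returns expanded views rather than copies.

```python
import torch

# Illustrative shapes only (not taken from the issue's dataset).
x = torch.randn(31843, 7)   # e.g. one row per pixel observation
y = torch.randn(7)

# broadcast_shapes works purely on shapes; no tensor storage is touched.
print(torch.broadcast_shapes(x.shape, y.shape))  # torch.Size([31843, 7])

# broadcast_tensors returns expanded *views* (stride 0 along broadcast
# dimensions), so no storage copy happens at this step.
bx, by = torch.broadcast_tensors(x, y)
print(by.data_ptr() == y.data_ptr())  # True: by still aliases y's storage
print(by.stride())                    # (0, 1): a stride-0 view
```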
Summary

It seems that the following line in def broadcast_inputs(x, y) triggered a tensor storage copy that caused a CUDA out-of-memory error when I tried to run a small bundle adjustment dataset with 31843 pixel observations. Both reshape and contiguous can trigger a memory copy. If we can avoid the memory copy in broadcast_inputs, we can avoid overflowing CUDA memory at this step.

pypose/pypose/lietensor/operation.py, line 914 at commit 6598a84
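To illustrate the failure mode described above, a small sketch (shapes are illustrative): expanding a tensor is free because it produces a stride-0 view, but reshape or contiguous on that view must materialize the full broadcast tensor, allocating new storage.

```python
import torch

# expand() is free: it returns a stride-0 view sharing y's storage.
y = torch.randn(7)
view = y.expand(31843, 7)
print(view.data_ptr() == y.data_ptr())   # True: no allocation yet

# contiguous() on that view must materialize the full 31843 x 7 tensor.
dense = view.contiguous()
print(dense.data_ptr() == y.data_ptr())  # False: new storage allocated

# reshape() on a non-contiguous view also falls back to a copy.
flat = view.reshape(-1)
print(flat.data_ptr() == y.data_ptr())   # False: another full copy
```

On a GPU, this materialization at 31843 broadcast rows is exactly the kind of allocation that can exhaust CUDA memory.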
Improvements

Refactor broadcast_inputs so that it does not use reshape and contiguous. A sketch of one possible approach follows.
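A minimal sketch of such a refactor, assuming broadcast_inputs only needs to return two tensors expanded to their common shape. The name broadcast_inputs_noalloc and its contract are assumptions for illustration; the real function in pypose/pypose/lietensor/operation.py may differ, e.g. in how it treats the trailing Lie-algebra dimension.

```python
import torch

def broadcast_inputs_noalloc(x, y):
    """Hypothetical copy-free variant of broadcast_inputs.

    Expands both inputs to their common broadcast shape and returns
    views; nothing here allocates new tensor storage.
    """
    shape = torch.broadcast_shapes(x.shape, y.shape)
    return x.expand(shape), y.expand(shape)
```

The caveat from the comment above still applies: this only helps if the batched kernels called after broadcast_inputs accept non-contiguous inputs; any later call to contiguous() or a copying reshape would reintroduce the allocation.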
Risks

TBD
Involved components
Optional: Intended side effects
TBD
Optional: Missing test coverage
TBD