
Memory leak when jit compiling #66434

Open
andremfreitas opened this issue Apr 25, 2024 · 0 comments
Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

2.16

Custom code

Yes

OS platform and distribution

Linux Ubuntu 22.04.3 LTS

Mobile device

No response

Python version

3.10

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I have two functions that are jit compiled. I call them in my training step function (which is not jit compiled, just running in graph mode). With these two functions jit compiled, I get a register-spill warning as follows:

2024-04-24 12:58:52.550355: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function '__cuda_sm20_div_rn_f64_full', 8 bytes spill stores, 8 bytes spill loads ptxas warning : Registers are spilled to local memory in function '__cuda_sm20_div_rn_f64_full', 8 bytes spill stores, 8 bytes spill loads

This makes training much slower. Without jit compiling those two functions, I don't get this warning.

My problem is similar to the one in this forum thread, where a user pointed to the open issue #56423. However, that issue attributes the problem to the distribution strategy, whereas in my case I am using just one GPU. In that issue, someone mentioned that performing the optimizer step outside the jit-compiled function fixed it for them. In my case, as noted above, the training step function is not jit compiled, so the optimizer step is already outside any jit-compiled function.
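Since no MWE is available, here is a hypothetical sketch of the structure described above, to clarify what "two jit-compiled functions called from a non-jit-compiled training step" means. The helper bodies and names (`helper_a`, `helper_b`, `train_step`) are placeholders, not the reporter's actual code; the float64 division mirrors the `__cuda_sm20_div_rn_f64_full` kernel named in the warning.

```python
import tensorflow as tf

# Hypothetical reconstruction of the reported setup: two functions
# compiled with XLA (jit_compile=True), invoked from a training step
# that runs in graph mode but is NOT itself jit compiled.

@tf.function(jit_compile=True)
def helper_a(x):
    # Placeholder for the first jit-compiled function; float64 division
    # is the kind of op behind the ptxas spill warning in the report.
    return tf.math.divide(x, tf.constant(3.0, dtype=tf.float64))

@tf.function(jit_compile=True)
def helper_b(x):
    # Placeholder for the second jit-compiled function.
    return tf.square(x)

@tf.function  # graph mode only, no XLA
def train_step(x):
    # Optimizer update would happen here, outside any jit-compiled scope.
    return helper_b(helper_a(x))

out = train_step(tf.constant(9.0, dtype=tf.float64))
print(float(out))  # (9 / 3) ** 2 = 9.0
```

This mirrors the layout in the report only structurally; the actual functions that trigger the register spilling are not shown in the issue.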

Standalone code to reproduce the issue

Unfortunately, I cannot build an MWE at the moment.

Relevant log output

No response
