
100% RAM usage for large batch simulation (simulation ends up CRASHING) #349

Open
pablosreyero opened this issue Mar 7, 2024 · 6 comments

Comments

@pablosreyero
Contributor

Dear Sionna community,

I've been playing around with this tutorial: https://nvlabs.github.io/sionna/examples/Sionna_Ray_Tracing_Introduction.html for quite a while.

The issue is that after computing y = channel([x, h_freq, no]), b_hat = pusch_receiver([y, no]), and BER = compute_ber(b, b_hat).numpy() in a for loop for many iterations, I run out of memory. RAM usage slowly creeps up as the number of iterations increases, which eventually forces the kernel to die. As a result, if I set a batch of 100,000 iterations, the simulation never finishes; we usually only reach about 56,000 iterations.

I have noticed this behaviour not only for the functions mentioned above inside a for loop, but also for the ray-tracing part as a whole (also inside a for loop), i.e., loading the scene, deploying the Tx and Rx, and computing paths and CIRs.

I am running Sionna in a docker container, which is currently running in a machine with the following specifications:

  • Ubuntu 20.04 LTS
  • Intel® Xeon(R) Silver 4309Y CPU @ 2.80GHz × 32
  • 64 GB of RAM

Here's the code to reproduce this issue:

LongBatchCode.zip

Any feedback will be appreciated, thanks in advance!

Pablo.-

@merlinND
Collaborator

merlinND commented Mar 13, 2024

Hello @pablosreyero,

Could you try structuring your code in the following way and let us know if the problem still occurs:

import gc

def iteration(...):
    y = channel([x, h_freq, no])
    b_hat = pusch_receiver([y, no]) 
    BER = compute_ber(b, b_hat)
    return BER.numpy()

def main():
    for it_i in range(...):
        BER_np = iteration(...)
        # Use BER_np as needed

        del BER_np
        gc.collect()

The key thing is that all per-iteration variables must go out of scope before calling the garbage collector.

If this works, you can call the garbage collector less often to reduce the overhead (e.g. once every 500 iterations).
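As a concrete sketch of this pattern (the helper name run_simulation and the iteration callback are illustrative, not part of Sionna or the original code):

```python
import gc

def run_simulation(iteration, num_iterations, gc_every=500):
    """Run `iteration` repeatedly, invoking the garbage collector
    only every `gc_every` steps to amortize its overhead."""
    results = []
    for it_i in range(num_iterations):
        # Per-iteration tensors go out of scope inside `iteration`,
        # so a periodic collect can actually reclaim them.
        results.append(iteration(it_i))
        if (it_i + 1) % gc_every == 0:
            gc.collect()
    return results
```

For example, run_simulation(my_iteration, 100_000, gc_every=500) would collect garbage 200 times over the whole run instead of once per iteration.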

@jhoydis
Collaborator

jhoydis commented Mar 14, 2024

Hi,

I had a quick look at your code.

First of all, it seems that you want to simulate a single transmitter sending the same stream to 5 receivers. However, you only configure a single PUSCHReceiver, so something is wrong in your setup. I also have some doubts that this is a typical PUSCH scenario.

Could it be that you actually want to simulate a distributed MIMO receiver? If this is the case, you would simply need to reshape the tensor of the channel frequency response from [batch_size, num_rx, num_rx_ant, ...] to [batch_size, 1, num_rx * num_rx_ant, ...].

This will probably not solve the memory issue. However, I would recommend that you run your simulations in graph mode. This might resolve it. In any case, it should substantially speed up your simulations, even on CPU.
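To illustrate the suggested shape change (using NumPy and made-up dimension values purely for demonstration; in the actual simulation you would apply tf.reshape to the channel frequency response tensor):

```python
import numpy as np

# Illustrative shapes only (hypothetical values): batch_size=4, num_rx=5,
# num_rx_ant=2; the trailing axes stand in for the "..." dimensions.
batch_size, num_rx, num_rx_ant = 4, 5, 2
h_freq = np.zeros((batch_size, num_rx, num_rx_ant, 1, 16, 14, 76))

# Merge the receiver and receive-antenna axes into one distributed-MIMO
# receiver: [batch_size, num_rx, num_rx_ant, ...]
#        -> [batch_size, 1, num_rx * num_rx_ant, ...]
h_freq_dmimo = h_freq.reshape(
    (batch_size, 1, num_rx * num_rx_ant) + h_freq.shape[3:])

print(h_freq_dmimo.shape)  # (4, 1, 10, 1, 16, 14, 76)
```

The element count and ordering are unchanged; only the grouping of the receive antennas differs, so the downstream receiver sees one receiver with num_rx * num_rx_ant antennas.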

@pablosreyero
Contributor Author

> Could you try structuring your code in the following way and let us know if the problem still occurs: […]

Hello @merlinND,

Thank you very much for your reply. We have already tested both the garbage collector and the del statement with multiple variables in the past, but it did not help. Nevertheless, I reproduced the exact code structure you provided and obtained the same results. I attach the new code and a log file showing the evolution of RAM usage in the zip file.

Thanks again for your help.

Pablo.-

https://github.com/NVlabs/sionna/files/14603871/files.zip

@pablosreyero
Contributor Author

> I had a quick look at your code. […] I would recommend that you run your simulations in graph mode. […]

Hello @jhoydis,

Thanks for pointing out the error regarding my scenario, you are totally right, I rushed and copied an old version of my code to just reproduce the RAM issue in a smaller code. If I'm not mistaken you already mentioned this tensor reshape in another discussion (#269) and that's how I noticed, so thanks again for the reminder.

Now coming back to the RAM issue, I have tried everything: garbage collectors, del statements (at the end of every iteration), limiting memory usage, converting the .ipynb to a .py and running the code from the terminal (without Jupyter Notebook), and analyzing variables and objects with a Python profiler, but nothing seems to reveal the problem. This memory consumption occurs when running simulations without a Keras model and on CPU, and even though Sionna is meant to be run in a Keras layer and on GPU, it is really strange to watch RAM slowly drain away, as if something were accumulating in memory.

Thanks for your help and for bringing the worlds of AI and Wireless Communications even closer together with Sionna!

Pablo.-

@jhoydis
Collaborator

jhoydis commented Mar 14, 2024

Have you tried running your simulations in graph mode?
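For context, "graph mode" here means wrapping the per-iteration computation in tf.function so TensorFlow traces it once and reuses the compiled graph, instead of rebuilding ops eagerly on every pass. A minimal sketch with a stand-in computation (the actual Sionna calls, e.g. channel and pusch_receiver, appear only in comments):

```python
import tensorflow as tf

@tf.function  # traced once per input signature, then reused on later calls
def run_iteration(x, no):
    # In the real simulation these would be the Sionna calls, e.g.
    #   y = channel([x, h_freq, no])
    #   b_hat = pusch_receiver([y, no])
    # Here a trivial computation stands in for them.
    return tf.reduce_mean(x) + no

result = run_iteration(tf.constant([1.0, 3.0]), tf.constant(0.5))
```

Keeping the inputs' shapes and dtypes fixed across iterations matters: if they change, tf.function retraces and builds a new graph each time, which itself accumulates memory.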

@pablosreyero
Contributor Author

> Have you tried running your simulations in graph mode?

Not yet, but I'm going to do so. I'll let you know if we encounter any other errors.
