sionnaRT: slow GPU computation times #383

Open
bchateli opened this issue Apr 4, 2024 · 8 comments

bchateli commented Apr 4, 2024

Hi,

This is a follow-up to #283.

While 0.16.1 sped up computations on the CPU, there is still a large difference between CPU and GPU execution.

For the code below, execution takes around 1 s on the CPU but around 80 s on the GPU (RTX 3080). These results are from 0.16.1, but I have observed similar behaviour with 0.16.2.

#%%
import numpy as np

import sionna
from sionna.rt import load_scene, Transmitter, Receiver, PlanarArray


import time
#%%
scene = load_scene(sionna.rt.scene.etoile)
scene.frequency = 3.5e9

scene.tx_array = PlanarArray(num_rows=1,
                             num_cols=1,
                             vertical_spacing=0.5,
                             horizontal_spacing=0.5,
                             pattern="iso",
                             polarization="V")

scene.rx_array = PlanarArray(num_rows=1,
                             num_cols=1,
                             vertical_spacing=0.5,
                             horizontal_spacing=0.5,
                             pattern="iso",
                             polarization="V")

# Generate BS/UE locations
BS_pos = [-35, 21.5, 10]
tx = Transmitter(name="tx", position=BS_pos)
scene.add(tx)

# Draw 100 random UE positions in an L x L square around the BS, at 1.5 m height
L = 10
ue_locs = np.random.uniform(low=(-L/2 + BS_pos[0], -L/2 + BS_pos[1]),
                            high=(L/2 + BS_pos[0], L/2 + BS_pos[1]),
                            size=(100, 2))

ue_locs = np.stack([ue_locs[:, 0], ue_locs[:, 1],
                    1.5*np.ones(ue_locs.shape[0])], axis=1)

# Add UEs to scene
for idx in range(ue_locs.shape[0]):
    rx = Receiver(name=f'rx_{idx}',
                  position=ue_locs[idx, :],
                  orientation=[0, 0, 0])
    scene.add(rx)

# Path computation timing
start_time = time.time()
paths = scene.compute_paths(max_depth=3,
                            num_samples=int(1e6),
                            los=True,
                            reflection=True,
                            diffraction=False,
                            scattering=False,
                            check_scene=False)
print(f'Time elapsed: {time.time()-start_time} s')

Any idea what is causing this? Thanks in advance!

@SebastianCa (Collaborator)

Hi @bchateli,
I tried to reproduce the observed behavior with Sionna 0.16.2 on my machine, but I could not see the same results.
Have you tried running the code snippet in Google Colab? Could you please tell us about your setup?

bchateli (Author) commented Apr 4, 2024

Hi @SebastianCa,

I just tried the snippet in Google Colab and, like you, I was not able to reproduce the behaviour.

It must be related to my setup. I am running Sionna 0.16.1/2 on Windows 10 with Python 3.9.16 and TensorFlow 2.10.1. I tried more recent TF versions, but I haven't been able to get them working on the GPU.
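
For what it's worth, a quick sanity check along these lines (plain TensorFlow, nothing Sionna-specific) shows what the installed build reports about the GPU:

import tensorflow as tf

# Which devices does this TF build actually see?
print("TF version:", tf.__version__)
print("GPUs visible to TF:", tf.config.list_physical_devices('GPU'))
print("Built with CUDA support:", tf.test.is_built_with_cuda())

# CUDA/cuDNN versions the wheel was compiled against (GPU builds only)
build_info = tf.sysconfig.get_build_info()
print("CUDA:", build_info.get("cuda_version"),
      "cuDNN:", build_info.get("cudnn_version"))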

merlinND (Collaborator) commented Apr 4, 2024

Hello @bchateli,

One idea to identify the source of the slowdown: if you fix all of the inputs (e.g. by fully controlling the randomness, or by setting some hardcoded values) and run the simulation multiple times, do the subsequent runs get faster?
We expect the very first run to be slower because there is a just-in-time compilation step, but after that the compiled kernel should be reused.
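
For example, a minimal sketch along these lines (reusing the scene and parameters from your snippet; the seed values and the number of repetitions are arbitrary, and the seeds would need to be set before the UE positions are drawn):

import time
import numpy as np
import tensorflow as tf

# Fix all sources of randomness so that every run uses identical inputs
np.random.seed(42)
tf.random.set_seed(42)

# ... build `scene`, add the transmitter and receivers as in the snippet above ...

# Time several identical calls; only the first one should pay the
# just-in-time compilation cost
for run in range(3):
    start = time.time()
    paths = scene.compute_paths(max_depth=3,
                                num_samples=int(1e6),
                                los=True,
                                reflection=True,
                                diffraction=False,
                                scattering=False,
                                check_scene=False)
    print(f"Run {run}: {time.time() - start:.2f} s")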

bchateli (Author) commented Apr 4, 2024

Hi @merlinND,

Thanks for the idea, I will try it and get back to you.

I compared my setup with the one in Colab and suspect it might be related to the old TF version I am using (2.10.1 vs. 2.15.0 for Colab). However, TF dropped GPU support on native Windows after 2.10, so I'd have to test on WSL.

bchateli (Author) commented Apr 4, 2024

To get back to your suggestion, @merlinND: I ran the simulation (on the GPU) 3 times in the same console and got almost the same time for each run (±1 s).

@merlinND (Collaborator)

Hello @bchateli,

I am unable to reproduce this on my machine either. The GPU runtime is roughly 0.3-0.5 s, and the CPU runtime is ~0.5 s.
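
One more thing that might help narrow it down on your side: TensorFlow's device-placement logging shows whether the heavy ops actually end up on your GPU (a generic TF diagnostic, to be enabled before the path computation):

import tensorflow as tf

# Print the device on which each eagerly executed op is placed;
# enable this before calling scene.compute_paths()
tf.debugging.set_log_device_placement(True)

The output is verbose, but it makes it obvious whether the kernels run on /GPU:0 or fall back to the CPU.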

bchateli (Author) commented May 3, 2024

Hello @merlinND,

Thanks for getting back to me. Did you run it on Linux or Windows?

My hypothesis is that it is related to the old GPU-compatible TF version I have to use on Windows. I tried to reproduce it on WSL, but so far I haven't been able to get TF to use the GPU there.

merlinND (Collaborator) commented May 3, 2024

Hello @bchateli,

I ran my test on Linux (Ubuntu 22.04).
