No stable connection between GNB and UE #412

Open
dhiaboujebha opened this issue Jan 18, 2024 · 26 comments

@dhiaboujebha

dhiaboujebha commented Jan 18, 2024

Issue Description

When running the GNB and UE, an RRC connection is established only sporadically. No PDU session is ever established, so no IP address is assigned to the UE. I get the same result over the air and over a wired connection (using a 30 dB cable attenuator).

With previous versions of srsRAN_Project and srsRAN_4G on the same setup, we got a stable RRC connection and a PDU session (#269).
Following the new instructions in the srsRAN_Project tutorial (srate: 23.04 and channel_bandwidth_MHz: 20), the UE cannot even find the GNB and establish a connection.

  • We ran the performance script.
  • uhd_usrp_probe works on all used devices as it should
  • We also checked that the core is working by using UERANSIM.
  • Changing the Tx and Rx gains didn't fix the issue.

All configuration files and logs are attached below.

Setup Details

srsRAN_Project Commit: 0b2702c
srsRAN_4G Commit: eea87b1d8

UE:
B200mini
Ubuntu 20.04
UHD 3.15.0.0
Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1

GNB:
X310
Ubuntu 20.04
UHD 3.15.0.0
Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1

Core:
Open5GS v2.6.6
Ubuntu 20.04
Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1

Expected Behavior

PDU session establishment and IP address assignment, as in the tutorial.

Actual Behaviour

The core starts and connects to the AMF, and the UE starts and connects to the GNB. The connection lasts less than 1 second and is then lost immediately; no PDU session is established. No registration of the UE can be seen in the core.
[Screenshot: ue_V23 11_rrc_connected]

Steps to reproduce the problem

As stated above, the UE, GNB and core run on separate machines; the config files can be found below.
We are using the srsRAN_Project GNB and the srsRAN_4G UE.

ue.zip
core(2).zip
gnb.zip

@pgawlowicz
Collaborator

PDSCH seems to be ok, but there is a lot of CRC=KO reported in gnb log for PUSCH.

Please try tuning the time_adv_nsamples parameter in the srsUE config, for example:
time_adv_nsamples = 300

@dhiaboujebha
Author

@pgawlowicz we have tried changing time_adv_nsamples several times, to values between 20 and 300, but it didn't solve the problem.

@pgawlowicz
Collaborator

pgawlowicz commented Jan 18, 2024

hmm, so if you go back to BW=10MHz it works and with 20MHz it does not work?

@dhiaboujebha
Author

@pgawlowicz it doesn't work in either case, but with BW=10MHz we get an RRC connection for a few seconds. With BW=20MHz we don't get an RRC connection at all.

@pgawlowicz
Collaborator

Could you connect both the gnb and ue USRPs to the same clock source?

@dhiaboujebha
Author

@pgawlowicz We are sadly in a room that is almost shielded by metal. We do not yet have the means to sync them in this room. However, we might be able to try connecting them to the Leo Bodnar clock this afternoon.

What confuses us is that the RRC connection was really stable in the earlier case mentioned above (#269), and now it is not.

@pgawlowicz
Collaborator

pgawlowicz commented Jan 18, 2024

Could you revert to the previous srsUE release (commit: fa56836) and check?
It would also be good to test with the previous gNB release. Do you remember which one was used back then?

@Rinelli96

Hi @dhiaboujebha, we had the same error yesterday. Did you update the open5gs core network? If so, check the config files, because their structure has changed. In particular, the /etc/open5gs/nrf.yaml config has been added.

We still have some connection issues, but fine-tuning the time_adv_nsamples parameter in the srsue_conf file allowed us to connect the two radios. The connection does not seem to be stable; we establish the PDU session, obtaining the UE IP address, but after a few seconds, the connection appears to be lost.

@dhiaboujebha
Author

@Rinelli96 Thanks for your comment.
No, we have not updated the Open5GS Core.
Can you please attach the config files that you changed? Can you also tell me which releases of Open5GS, srsRAN_Project and srsRAN_4G you are using?

@Rinelli96

open5gs version: 2.7.0
srsRAN_4G version: commit eea87b1d8
srsRAN_Project version: commit 0b2702c
Open5GS Configs.zip

BR

@dhiaboujebha
Author

          Could you revert to the previous srsUE release (commit: fa56836) and check? also would be good to test with the previous gNB release, do you remember which one was used back then?

@pgawlowicz sorry for responding a bit late.
We actually tried the srsUE release (commit: fa56836) with srsRAN_Project commit 0b2702, but we still face the same issue: RRC Connected for only 1 second, and then Scheduling request failed.
Next, we will try to connect both SDRs to the same clock source.
If you have any further advice, it would be very helpful.

@dhiaboujebha
Author

The problem still persists, so we switched to another setup in which the hardware devices use an external clock. We are also facing some problems in the new one (#442).
I'll close this issue, as we are not working on this setup anymore.
Thanks for your help :)

@dhiaboujebha
Author

dhiaboujebha commented Feb 21, 2024

I analyzed the gnb and ue logs and found that the Scheduling Request failure we get directly after the RRC Connected message is caused by this:

2024-02-21T09:32:45.194542 [RRC ] [W] ue=0 "RRC Setup Procedure" timed out after 720ms
2024-02-21T09:32:45.314596 [DU-MNG ] [I] ue=0 proc="UE Delete": Procedure started....
2024-02-21T09:32:45.314831 [DU-F1 ] [I] ue=0 c-rnti=0x4608 GNB-DU-UE-F1AP-ID=0 GNB-CU-UE-F1AP-ID=0: F1 UE context removed.
2024-02-21T09:32:45.316110 [DU-MNG ] [I] ue=0 proc="UE Delete": Procedure finished successfully.
2024-02-21T09:32:45.316385 [CU-UEMNG] [I] ue=0 removed
2024-02-21T09:32:45.319649 [UL-PHY1 ] [I] [ 551.9] PUCCH: rnti=0x4608 format=1 prb1=105 prb2=na symb=[0, 14) cs=6 occ=4 sr=no t=159.0us
2024-02-21T09:32:45.319786 [MAC ] [I] [ 552.4] rnti=17928: Discarding UCI PDU. Cause: No UE with provided RNTI exists.
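
One detail that helps when reading the excerpt above: the PHY line reports the RNTI in hexadecimal (rnti=0x4608), while the MAC line reports it in decimal (rnti=17928). The following is a minimal Python sketch, assuming only the two notations seen in these lines, which checks that both refer to the same UE, i.e. the UCI PDU is discarded for the UE that was just deleted. The helper name `parse_rnti` is made up for illustration and is not part of srsRAN.

```python
import re

# Matches "rnti=0x4608" (hex) as well as "rnti=17928" (decimal).
RNTI_RE = re.compile(r"rnti=(0x[0-9a-fA-F]+|\d+)")


def parse_rnti(line):
    """Return the RNTI found in a log line as an int, or None if absent."""
    match = RNTI_RE.search(line)
    if match is None:
        return None
    # Base 0 lets int() accept both the "0x" prefixed and the plain decimal form.
    return int(match.group(1), 0)


# Both lines are copied from the gnb log excerpt in this comment.
phy_line = "[UL-PHY1 ] [I] [ 551.9] PUCCH: rnti=0x4608 format=1 prb1=105 prb2=na symb=[0, 14) cs=6 occ=4 sr=no t=159.0us"
mac_line = "[MAC ] [I] [ 552.4] rnti=17928: Discarding UCI PDU. Cause: No UE with provided RNTI exists."

print(hex(parse_rnti(mac_line)))                     # 0x4608
print(parse_rnti(phy_line) == parse_rnti(mac_line))  # True
```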

Here are the gnb.log and the ue.log.

ue_b200mini.conf.txt
gnb_x310_v2_15_02.yml.txt

Any explanation?
Thank you in advance

@dhiaboujebha dhiaboujebha reopened this Feb 21, 2024
@frankist
Contributor

I see the rrc Setup message not being ACKed:

2024-02-21T09:32:44.526527 [SCHED   ] [I] [   473.1] DL HARQ rnti=0x4608 cell=0 h_id=0: Discarding HARQ process tb=0 with tbs=325. Cause: Maximum number of reTxs 4 exceeded

And no rrcSetupComplete is ever sent back by the UE. It is strange, because I can see the UE sending back positive CSI reports:

2024-02-21T09:32:44.546661 [UL-PHY1 ] [I] [   474.6] PUCCH: rnti=0x4608 format=2 prb=[104, 105) prb2=na symb=[10, 12) csi1=1111 t=168.9us

So, it seems that the UE is struggling with PUCCH Format 1.
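
To see how these events line up in time, here is a rough Python sketch that puts gnb log lines on a common timeline. It assumes every line starts with an ISO timestamp followed by a bracketed layer tag and a log level, as in the lines quoted in this thread; the function name `timeline` and the list `GNB_LINES` are hypothetical helpers, not srsRAN tooling.

```python
import re
from datetime import datetime

# ISO timestamp, "[LAYER ]", "[LEVEL]", then the message text.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6}) \[([^\]]+)\] \[(\w)\] (.*)$"
)


def timeline(lines):
    """Return (offset_ms, layer, text) tuples relative to the earliest log line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that does not look like a log line
        stamp = datetime.fromisoformat(match.group(1))
        events.append((stamp, match.group(2).strip(), match.group(4)))
    if not events:
        return []
    events.sort(key=lambda event: event[0])
    start = events[0][0]
    return [
        ((stamp - start).total_seconds() * 1000.0, layer, text)
        for stamp, layer, text in events
    ]


# Lines copied from the gnb log excerpts quoted in this thread.
GNB_LINES = [
    '2024-02-21T09:32:45.194542 [RRC ] [W] ue=0 "RRC Setup Procedure" timed out after 720ms',
    '2024-02-21T09:32:45.314596 [DU-MNG ] [I] ue=0 proc="UE Delete": Procedure started....',
    "2024-02-21T09:32:44.526527 [SCHED   ] [I] [   473.1] DL HARQ rnti=0x4608 cell=0 h_id=0: Discarding HARQ process tb=0 with tbs=325. Cause: Maximum number of reTxs 4 exceeded",
    "2024-02-21T09:32:44.546661 [UL-PHY1 ] [I] [   474.6] PUCCH: rnti=0x4608 format=2 prb=[104, 105) prb2=na symb=[10, 12) csi1=1111 t=168.9us",
]

for offset_ms, layer, text in timeline(GNB_LINES):
    print(f"{offset_ms:8.1f} ms  {layer:<8} {text[:50]}")
```

With the four lines quoted here, the HARQ discard comes roughly 668 ms before the RRC Setup timeout, and the UE deletion follows about 120 ms after that, which fits the picture of a Setup message that is never acknowledged.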

@dhiaboujebha
Author

@frankist Thanks for your comment!
After some changes to the setup, it finally worked, but the behavior is still sporadic! I will list the changes made and the configs used in our setup. Both hardware devices are connected with 2 antennas, with no external clock and no cable attenuator.
Here are the working setup configurations:

working_setup_X310_B200mini.zip

Open5GS Core: v2.7.0-86-g41d8934
srsRAN_Project: commit 0b2702c
srsRAN-4G: 23.11 commit ec29b0c1f
UHD: 3.15.0.0-2build5
Ubuntu: 20.04

Steps to get this result:

  • I tried different combinations of software versions:
  1. (Open5GS : v2.6.6-26-ge12b1be / srsRAN_4G : release 23_04_1 fa5683 / srsRAN_Project : 0b2702)
  2. (Open5GS : v2.6.6-26-ge12b1be / srsRAN_4G : release 23_04_1 eea87b1d8 / srsRAN_Project : 0b2702)
  3. (Open5GS : v2.6.6-26-ge12b1be / srsRAN_4G : release 23_04_1 eea87b1d8 / srsRAN_Project : 374200d)
    All of these attempts gave the same result: RRC Connected for less than 1 second, and then scheduling request failed.

=> Then I updated the core to v2.7.0 and swapped the USRP X310 for another one (I think it's an older device), and it worked.

I tried 3 X310s with the same configuration (adjusting the addr in device_args), but only one of them worked (the one with the address 192.168.11.2). I also tried 2 B200minis, and only one of them worked. Do you have any explanation for that?

The connection now lasts between 10 seconds and ~1 minute. The next challenge is to make it stable and reliable. What can I change to get a more stable connection?
✌️

@frankist
Contributor

Hey, your latest logs show a slightly different story. I see many underflows in the gnb log (e.g. [W] Real-time failure in RF: underflow or real-time failure in RF: underflow) that are making the UE lose sync. Can you assign more cores to your gnb application (4 is a bit too low) and increase the number of PHY threads?
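
A quick way to check whether a change (more cores, lower bandwidth) actually reduces these warnings is to count them per second in the gnb log. The Python sketch below assumes each log line starts with an ISO timestamp, as in the other gnb excerpts in this thread. The sample lines are made up for illustration: only the warning text comes from the comment above, while the timestamps and line layout are assumptions. The function name `underflows_per_second` is hypothetical.

```python
from collections import Counter

# Warning text quoted in this thread; matched case-insensitively because
# both capitalizations are mentioned.
UNDERFLOW_TEXT = "real-time failure in rf: underflow"


def underflows_per_second(lines):
    """Count underflow warnings, keyed by the timestamp truncated to seconds."""
    counts = Counter()
    for line in lines:
        if UNDERFLOW_TEXT in line.lower():
            # The first 19 characters are "YYYY-MM-DDTHH:MM:SS".
            counts[line[:19]] += 1
    return counts


# Hypothetical sample lines (layout assumed, not copied from a real log).
SAMPLE_LINES = [
    "2024-03-01T10:00:05.120000 [W] Real-time failure in RF: underflow",
    "2024-03-01T10:00:05.870000 [W] Real-time failure in RF: underflow",
    "2024-03-01T10:00:06.010000 [W] real-time failure in RF: underflow",
    "2024-03-01T10:00:06.020000 [I] some unrelated message",
]

for second, count in sorted(underflows_per_second(SAMPLE_LINES).items()):
    print(second, count)
```

In practice the lines would come from the gnb log file, for example by passing an open file object to the function.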

@dhiaboujebha
Author

We only have 4 Cores available, as you can see here in the specs:
UE:
B200mini
Ubuntu 20.04
UHD 3.15.0.0
Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1

GNB:
X310
Ubuntu 20.04
UHD 3.15.0.0
Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1

In the beginning we also thought our problems might be related to the limited physical resources. But we were told it is most likely not a problem.

          @tilldroemmer, double-check that your core and UE configurations match. Normally you see this type of issue when there is a problem with the APN configuration, or how the UE is registered in the core. I would be surprised if it has to do with the available resources.

Originally posted by @brendan-mcauliffe in #269 (comment)

We also could not find any minimum resource requirements in the specs. Can you help us with the resource requirements?

@pgawlowicz
Collaborator

We are working on HW recommendations. Something similar to what we have for srsRAN-4G project: https://docs.srsran.com/projects/4g/en/latest/app_notes/source/hw_packs/source/index.html

Regarding your setup, 4 cores do not seem to be enough. By default, we use 8 threads; please see the expert_execution section in the gnb config reference.

You might try to reduce BW and then add the following section to your gnb config:

expert_execution:
  threads:
    non_rt:
      nof_non_rt_threads: 1                   # Optional UINT (4). Sets the number of non real time threads for processing of CP and UP data in upper layers.
    upper_phy:
      pdsch_processor_type: auto              # Optional TEXT (auto). Sets the PDSCH processor type. Supported: [auto, generic, concurrent, lite].
      nof_pusch_decoder_threads: 1            # Optional UINT (1). Sets the number of threads used to decode PUSCH.
      nof_ul_threads: 1                       # Optional UINT (1). Sets the number of upper PHY threads to process uplink.
      nof_dl_threads: 1                       # Optional UINT (1). Sets the number of upper PHY threads to process downlink.

But still, I am not sure whether this setup will work correctly.

@dhiaboujebha
Author

@pgawlowicz we upgraded the gNB computer to a more powerful one with 8 cores. After that the setup behaved better, but it is still not reliable: we can always get an RRC connection, but it takes several tries to establish a PDU session. Yesterday we installed a new network card in the gNB computer to connect the X310 over a 10 Gbit link, so we now use the full MTU size of 8000. After some tests, this doesn't seem to make a big difference.

@pgawlowicz
Collaborator

@dhiaboujebha could you provide the output of the console trace from both the gnb and srsUE? You need to press t in both consoles to activate trace logging.

You might also be interested in this discussion, which covered a similar issue: #489 (comment)

@dhiaboujebha
Author

@pgawlowicz here are the traces of the gnb and the ue.
trace_ue_21_03.txt
trace_gnb_21_03.txt
The PDU session now lasts for a long period. The only problem we are still facing is the first session, which is hard to establish.

@pgawlowicz
Collaborator

hmm, could you try to reduce the TX gain of the gnb and try again? It seems that the signal received at the UE is very good (SNR > 30 dB), but the signal from the UE at the gnb is bad. After that, you might try increasing the srsUE tx_gain.

@dhiaboujebha
Author

@pgawlowicz we tried to reduce the Tx gains of the gNB and we kept the Tx gains of the UE at the maximum. Here you can find the traces of the tests done.
tx_10.zip
tx_15.zip
tx_20.zip

@pgawlowicz
Collaborator

PDSCH SNR around 20dB looks good. Now you need to also improve uplink. I have seen that the rx_gain in gnb is already 30 and tx_gain in srsUE is 80. Maybe it is already too much and the link is saturated. Could you try to reduce both a bit and see if PUSCH SNR increases?

@pgawlowicz
Collaborator

@dhiaboujebha any update on this issue?

@dhiaboujebha
Author

dhiaboujebha commented Apr 16, 2024

@pgawlowicz sorry for the delay. We made some changes in the tx gains of ue and rx gains of gnb as you said. Here are the traces:
traces_tx_70_rx_25.zip
traces_tx_70_rx_20.zip
traces_tx_60_rx_20.zip
