Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2 interface doesn't decode CONTROL request message correctly #468

Closed
HaoxinSEU opened this issue Feb 12, 2024 · 16 comments
Closed

E2 interface doesn't decode CONTROL request message correctly #468

HaoxinSEU opened this issue Feb 12, 2024 · 16 comments
Assignees

Comments

@HaoxinSEU
Copy link

HaoxinSEU commented Feb 12, 2024

Issue Description

Using the latest version of srsRAN and FlexRIC, I'm able to run the KPM xApp as described in the tutorial. However, when we tried to run the RC xApp xapp_oran_ctrl in FlexRIC, the gNB doesn't seem to decode the CONTROL request message correctly.

Setup Details

As written in the tutorial, we are using the branch br-flexric of FlexRIC, and also the latest version of srsRAN. We don't change the code on gNB side.

We change the slice_level_PRB_quota_param_id_e in xapp_oran_ctrl to skip ID 4, so it's aligned with the gNB and O-RAN spec.

Expected Behavior

We expect that the gNB can correctly decode the CONTROL request message from xApp / RIC.

Actual Behaviour

After digging into the code, we found that the SETUP request and response are correct.

But in lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp, when the gNB decodes the CONTROL request message from RIC in the function e2sm_rc_control_action_2_6_du_executor::convert_to_du_config_request() , it only finds RAN parameter ID 1, all other RAN parameter IDs are missing (including 11 and 12). Then, ctrl_cfg would be empty.

Therefore, in the function: e2sm_rc_control_action_2_6_du_executor::execute_ric_control_action(), it will return return_ctrl_failure(req). And finally the gNB sends the CONTROL failure message to RIC by the function: send_e2_ric_control_failure(e2_request, e2_response) in the file lib/e2/procedures/e2_ric_control_procedure.cpp.

Steps to reproduce the problem

The gNB config files is like this:

gnb_id: 147

slicing:
 - sst: 1
   sd: 0
   
amf:
  addr: 10.43.178.231
  bind_addr: 10.42.1.6

ru_sdr:
  device_driver: uhd
  device_args: 
  # clock:
  # sync:
  srate: 23.040
  otw_format: sc12
  tx_gain: 80
  rx_gain: 40

cell_cfg:
  dl_arfcn: 632628
  band: 78
  channel_bandwidth_MHz: 20
  common_scs: 30
  plmn: "00103"
  tac: 1
  pci: 1
  nof_antennas_ul: 1
  nof_antennas_dl: 1
  pdsch:
    mcs_table: qam256

log:
  filename: /tmp/gnb.log
  all_level: debug

pcap:
  e2ap_enable: true
  e2ap_filename: /tmp/gnb_e2ap.pcap
  
e2:
  enable_du_e2: true                # Enable DU E2 agent (one for each DU instance)
  e2sm_kpm_enabled: false            # Enable KPM service module
  e2sm_rc_enabled: true
  addr: 10.42.1.7                 # RIC IP address
  bind_addr: 10.42.1.6           # A local IP that the E2 agent binds to for traffic from the RIC. ONLY required if running the RIC on a separate machine. 
  port: 36421                       # RIC port

metrics: 
  rlc_json_enable: 1                # Enable RLC metrics reporting (need to deliver measurements to E2SM_KPM service model)
  rlc_report_period: 1000           # Set reporting period to 1s

FlexRIC config:

SM_DIR = "/usr/local/lib/flexric/"
Name = "NearRT_RIC"
NearRT_RIC_IP = "10.42.1.7"
E2_Port = 36421
E42_Port = 36422

Additional Information

As described in the discussion, @RobinWie also met this problem.

The captured E2 pcap is here:
e2ap.zip

@yagoda
Copy link
Contributor

yagoda commented Feb 12, 2024 via email

@HaoxinSEU
Copy link
Author

Here is the gNB log.

Inside it, we add one line in /lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp to print which RAN parameter ID is parsed:

for (auto& ran_p : ctrl_msg.ran_p_list) {
    logger.debug("[E2] CONVERT TO DU: Ran paramter ID: {}", ran_p.ran_param_id);
    if (action_params.find(ran_p.ran_param_id) != action_params.end()) {
      if (ran_p.ran_param_id == 11) {
        ctrl_config.min_prb_alloc = ran_p.ran_param_value_type.ran_p_choice_elem_true().ran_param_value.value_int();
        ctrl_config.ue_id         = ctrl_hdr.ue_id.gnb_du_ue_id().gnb_cu_ue_f1ap_id;
      } else if (ran_p.ran_param_id == 12) {
        ctrl_config.max_prb_alloc = ran_p.ran_param_value_type.ran_p_choice_elem_true().ran_param_value.value_int();
        ctrl_config.ue_id         = ctrl_hdr.ue_id.gnb_du_ue_id().gnb_cu_ue_f1ap_id;
      }
    } else {
      logger.error("Parameter not supported");
      return ctrl_config;
    }
  }

And we only get one line in the log:

2024-02-12T10:56:55.826686 [E2SM-RC ] [D] [E2] CONVERT TO DU: Ran paramter ID: 1

So it seems that only ID 1 is parsed.

gnb-e2.log

@yagoda
Copy link
Contributor

yagoda commented Feb 12, 2024

only parameters 11 & 12 are supported, what parameter are you trying to use?

@HaoxinSEU
Copy link
Author

I think the xApp uses all RAN parameters (as the gen_rrm_policy_ratio_group() function in flexric/examples/xApp/c/control/xapp_oran_ctrl.c), including 11 and 12.

From the gNB code, it seems that using more than 11 and 12 doesn't hurt, as in the for loop mentioned above they are just skipped. But now only ID 1 is parsed in the for loop, which is unexpected...

@HaoxinSEU
Copy link
Author

Hi,

I tried more things with this xApp. If I only keep parameter 12 in the control message in xApp and remove all other fields, the gNB can decode this ID.

But the type is decoded wrong, and thus the value of max_prb is wrong. Here is the log showing the decoded type ran_p.ran_param_value_type.ran_p_choice_elem_true() when using only this field:

2024-02-12T16:06:50.544768 [ASN1    ] [E] Invalid field access for choice type "RANParameter-Value" ("valueInt"!="valueOctS")

Here is how I build the control message with only one ID in FlexRIC's xapp_oran_ctrl.cpp:

e2sm_rc_ctrl_msg_frmt_1_t gen_rc_ctrl_msg_frmt_1_slice_level_PRB_quota()
{
  e2sm_rc_ctrl_msg_frmt_1_t dst = {0};
  dst.sz_ran_param = 1;
  dst.ran_param = calloc(1, sizeof(seq_ran_param_t));
  assert(dst.ran_param != NULL && "Memory exhausted");
  // gen_rrm_policy_ratio_list(&dst.ran_param[0]);
  dst.ran_param->ran_param_id = Max_PRB_Policy_Ratio_8_4_3_6;
  dst.ran_param->ran_param_val.type = ELEMENT_KEY_FLAG_FALSE_RAN_PARAMETER_VAL_TYPE;
  dst.ran_param->ran_param_val.flag_false = calloc(1, sizeof(ran_parameter_value_t));
  assert(dst.ran_param->ran_param_val.flag_false != NULL && "Memory exhausted");
  dst.ran_param->ran_param_val.flag_false->type = INTEGER_RAN_PARAMETER_VALUE;

  int max_prb = 25;
  dst.ran_param[0].ran_param_val.flag_false->int_ran = max_prb;

  return dst;
}

@yagoda
Copy link
Contributor

yagoda commented Feb 12, 2024

Hi Haoxin!

Thanks for your analysis, let us reproduce it on our end and get back to you!

Regards,
Justin

@HaoxinSEU
Copy link
Author

Another thing I notice is that in the O-RAN spec, parameter 11 and 12 should be of type "Key Flag False", and FlexRIC follows this rule.

But in the code: du_mac_sched_control_config e2sm_rc_control_action_2_6_du_executor::convert_to_du_config_request(const e2sm_ric_control_request& e2sm_req_), we are using ran_p_choice_elem_true() instead of its counterpart ran_p_choice_elem_false(), whose name sounds more reasonable as "false".

Will this be a potential problem?

image

Therefore, I tried to change the type in the xApp from ELEMENT_KEY_FLAG_FALSE_RAN_PARAMETER_VAL_TYPE to ELEMENT_KEY_FLAG_TRUE_RAN_PARAMETER_VAL_TYPE . And this time the value field is still not correctly decoded (integer -> boolean), while before the decoding is integer -> oct string

2024-02-13T13:26:01.595012 [ASN1    ] [E] Invalid field access for choice type "RANParameter-Value" ("valueInt"!="valueBoolean")

@pgawlowicz
Copy link
Collaborator

Hi, indeed, the asn1 encoding of e2sm_rc parameters is not working properly. We investigate this issue.

@yagoda yagoda self-assigned this Mar 18, 2024
@wdgarey
Copy link

wdgarey commented Mar 22, 2024

Hi, I also have this same issue and was able to repeat the steps. I see someone was assigned to work on this but is there a work around for the moment? Thanks!

@yc541
Copy link

yc541 commented Mar 28, 2024

We can repeat the same results as well. Something else I noticed while reading the code and trying to debug this, in e2sm_rc.h, within the definition of struct ran_param_value_c, when defining its private member c with choice_buffer_t, int64_t is not included. Similarly in e2sm_rc.cpp, in the definition of void ran_param_value_c::set, within case types::value_int, the code directly break without something like c.init<int64_t>(), I'm not sure if this is intentional or if it will cause any subsequent issues once the type is decoded correctly (to int of course).

@pgawlowicz
Copy link
Collaborator

pgawlowicz commented Apr 11, 2024

@HaoxinSEU, could you check again with the test branch? The issue is hopefully fixed with this commit:
24a5645

@HaoxinSEU
Copy link
Author

Hi @pgawlowicz,

Thanks for the information! Sorry but currently I'm working on another topic and may not have enough time to setup the test... Maybe other participants under this issue can give a hand on this.

But still much thanks for the kind help and the new code!

@wdgj
Copy link

wdgj commented Apr 24, 2024

I can confirm that with the latest commit (1483bda) which includes the changes from 24a5645, that the flexric and xapp no longer crash when decoding the E2 control message with the following patch applied to the flexric xapp from the br-flexric branch.

+++ b/examples/xApp/c/control/xapp_oran_ctrl.c
@@ -72,15 +72,15 @@ typedef enum {
     RRM_Policy_Ratio_List_8_4_3_6 = 1,
     RRM_Policy_Ratio_Group_8_4_3_6 = 2,
     RRM_Policy_8_4_3_6 = 3,
-    RRM_Policy_Member_List_8_4_3_6 = 4,
-    RRM_Policy_Member_8_4_3_6 = 5,
-    PLMN_Identity_8_4_3_6 = 6,
-    S_NSSAI_8_4_3_6 = 7,
-    SST_8_4_3_6 = 8,
-    SD_8_4_3_6 = 9,
-    Min_PRB_Policy_Ratio_8_4_3_6 = 10,
-    Max_PRB_Policy_Ratio_8_4_3_6 = 11,
-    Dedicated_PRB_Policy_Ratio_8_4_3_6 = 12,
+    RRM_Policy_Member_List_8_4_3_6 = 5,
+    RRM_Policy_Member_8_4_3_6 = 6,
+    PLMN_Identity_8_4_3_6 = 7,
+    S_NSSAI_8_4_3_6 = 8,
+    SST_8_4_3_6 = 9,
+    SD_8_4_3_6 = 10,
+    Min_PRB_Policy_Ratio_8_4_3_6 = 11,
+    Max_PRB_Policy_Ratio_8_4_3_6 = 12,
+    Dedicated_PRB_Policy_Ratio_8_4_3_6 = 13,
 } slice_level_PRB_quota_param_id_e;

static

However, if I also include the following changes to the xapp which should limit the PDSCH PRBs for UE 0 (rnti=x4601) to 5, it does not appear to take effect as shown in the attached gnb.log.gz.

+++ b/examples/xApp/c/control/xapp_oran_ctrl.c
@@ -167,7 +167,7 @@ void gen_rrm_policy_ratio_group(lst_ran_param_t* RRM_Policy_Ratio_Group,
   Min_PRB_Policy_Ratio->ran_param_val.flag_false = calloc(1, sizeof(ran_parameter_value_t));
   assert(Min_PRB_Policy_Ratio->ran_param_val.flag_false != NULL && "Memory exhausted");
   Min_PRB_Policy_Ratio->ran_param_val.flag_false->type = INTEGER_RAN_PARAMETER_VALUE;
-  int min_prb = 5; //TODO
+  int min_prb = 0; //TODO
   Min_PRB_Policy_Ratio->ran_param_val.flag_false->int_ran = min_prb;
   // Max PRB Policy Ratio, ELEMENT (RRM Policy Ratio Group -> Max PRB Policy Ratio)
   seq_ran_param_t* Max_PRB_Policy_Ratio = &RRM_Policy_Ratio_Group->ran_param_struct.ran_param_struct[2];
@@ -176,7 +176,7 @@ void gen_rrm_policy_ratio_group(lst_ran_param_t* RRM_Policy_Ratio_Group,
   Max_PRB_Policy_Ratio->ran_param_val.flag_false = calloc(1, sizeof(ran_parameter_value_t));
   assert(Max_PRB_Policy_Ratio->ran_param_val.flag_false != NULL && "Memory exhausted");
   Max_PRB_Policy_Ratio->ran_param_val.flag_false->type = INTEGER_RAN_PARAMETER_VALUE;

-  int max_prb = 51; //TODO
+  int max_prb = 5; //TODO
   Max_PRB_Policy_Ratio->ran_param_val.flag_false->int_ran = max_prb;
   // Dedicated PRB Policy Ratio, ELEMENT (RRM Policy Ratio Group -> Dedicated PRB Policy Ratio)
   seq_ran_param_t* Dedicated_PRB_Policy_Ratio = &RRM_Policy_Ratio_Group->ran_param_struct.ran_param_struct[3];

On this line req.max_prb_alloc.has_value() is evaluating to false when it should be true.

req.max_prb_alloc.has_value() ? req.max_prb_alloc.value() : MAX_NOF_PRBS};

I would also suggest the following changes to the srsRAN_Project code for the gnb log file information.

diff --git a/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp b/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
index bac860f7e..b792837c4 100644
--- a/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
+++ b/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
@@ -74,10 +74,12 @@ void e2sm_rc_control_action_du_executor_base::parse_ran_parameter_value_false(
   control_config_params cfg;
   if (ran_param_id == 11) {
     cfg.min_prb_alloc = ran_p.ran_param_value.value_int();
+    logger.info("RAN parameter ID {} (Min PRB Allocation) with value {}", ran_param_id, ran_p.ran_param_value.value_int());
   } else if (ran_param_id == 12) {
     cfg.max_prb_alloc = ran_p.ran_param_value.value_int();
+    logger.info("RAN parameter ID {} (Max PRB Allocation) with value {}", ran_param_id, ran_p.ran_param_value.value_int());
   } else {
-    logger.error("Unknown RAN parameter ID %d", ran_param_id);
+    logger.warning("Unknown RAN parameter ID {}", ran_param_id);
     return;
   }
   ctrl_cfg.param_list.push_back(cfg);

@gckopper
Copy link

gckopper commented May 9, 2024

Hello, everyone! I've managed to spot the bug here and solved it using the following patch

diff --git a/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp b/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
index bac860f7e..df0055662 100644
--- a/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
+++ b/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
@@ -71,16 +71,20 @@ void e2sm_rc_control_action_du_executor_base::parse_ran_parameter_value_false(
     uint64_t                                        ue_id,
     du_mac_sched_control_config&                    ctrl_cfg)
 {
-  control_config_params cfg;
+  if (ctrl_cfg.param_list.empty()) {
+    ctrl_cfg.param_list.push_back(control_config_params{});
+  }
+  control_config_params &cfg = ctrl_cfg.param_list[0];
   if (ran_param_id == 11) {
     cfg.min_prb_alloc = ran_p.ran_param_value.value_int();
+    logger.info("RAN parameter ID {} (Min PRB Allocation) with value {}", ran_param_id, ran_p.ran_param_value.value_int());
   } else if (ran_param_id == 12) {
     cfg.max_prb_alloc = ran_p.ran_param_value.value_int();
+    logger.info("RAN parameter ID {} (Max PRB Allocation) with value {}", ran_param_id, ran_p.ran_param_value.value_int());
   } else {
     logger.error("Unknown RAN parameter ID %d", ran_param_id);
     return;
   }
-  ctrl_cfg.param_list.push_back(cfg);
 }
 
 void e2sm_rc_control_action_du_executor_base::parse_ran_parameter_value(const ran_param_value_type_c& ran_param,

The issue is that in the current implementation each value parsed is placed in a new control_config_params object that is then pushed into a vector, however, the implementation in the class du_ue_ric_configuration_procedure requires both min and max PRB values to be present in the first object of the vector.

Keep in mind that the enum in the flexric xApp still needs to be fixed using the patch provided by wdgj in the comment above

Here is a screenshot of my test using srsue. That sharp drop in bitrate is the moment the xApp changed the PRB quote. The logs are attached in case someone is interested.
Screenshot from 2024-05-08 19-00-34
gnb.log

@wdgj
Copy link

wdgj commented May 10, 2024

@pgawlowicz @HaoxinSEU I can confirm that this is working with @gckopper changes. Thank you to everyone who helped with this!

Hello, everyone! I've managed to spot the bug here and solved it using the following patch

diff --git a/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp b/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
index bac860f7e..df0055662 100644
--- a/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
+++ b/lib/e2/e2sm/e2sm_rc/e2sm_rc_control_action_du_executor.cpp
@@ -71,16 +71,20 @@ void e2sm_rc_control_action_du_executor_base::parse_ran_parameter_value_false(
     uint64_t                                        ue_id,
     du_mac_sched_control_config&                    ctrl_cfg)
 {
-  control_config_params cfg;
+  if (ctrl_cfg.param_list.empty()) {
+    ctrl_cfg.param_list.push_back(control_config_params{});
+  }
+  control_config_params &cfg = ctrl_cfg.param_list[0];
   if (ran_param_id == 11) {
     cfg.min_prb_alloc = ran_p.ran_param_value.value_int();
+    logger.info("RAN parameter ID {} (Min PRB Allocation) with value {}", ran_param_id, ran_p.ran_param_value.value_int());
   } else if (ran_param_id == 12) {
     cfg.max_prb_alloc = ran_p.ran_param_value.value_int();
+    logger.info("RAN parameter ID {} (Max PRB Allocation) with value {}", ran_param_id, ran_p.ran_param_value.value_int());
   } else {
     logger.error("Unknown RAN parameter ID %d", ran_param_id);
     return;
   }
-  ctrl_cfg.param_list.push_back(cfg);
 }
 
 void e2sm_rc_control_action_du_executor_base::parse_ran_parameter_value(const ran_param_value_type_c& ran_param,

The issue is that in the current implementation each value parsed is placed in a new control_config_params object that is then pushed into a vector, however, the implementation in the class du_ue_ric_configuration_procedure requires both min and max PRB values to be present in the first object of the vector.

Keep in mind that the enum in the flexric xApp still needs to be fixed using the patch provided by wdgj in the comment above

Here is a screenshot of my test using srsue. That sharp drop in bitrate is the moment the xApp changed the PRB quote. The logs are attached in case someone is interested. Screenshot from 2024-05-08 19-00-34 gnb.log

@pgawlowicz
Copy link
Collaborator

Hi @gckopper, thanks for your interest in the project and for the patch. Indeed, this was an issue here. We are currently refactoring the code and will be releasing the updated version soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants