[Bugs] SMF panic on external (3rd party) CHF integration #562

Open
losdrugi opened this issue Apr 29, 2024 · 31 comments
Assignees
Labels
bug Something isn't working enhancement New feature or request

Comments

@losdrugi

Describe the bug

I'm trying to configure Free5gc to use a CHF from an external provider. I successfully registered this CHF in the NRF, but when I try to connect a UE, the following errors occur (full logs are also attached):

SMF:
2024-04-29T16:59:34.312541475+02:00 [INFO][SMF][PduSess] CHF Selection for SMContext SUPI[imsi-208930000000001] PDUSessionID[2]
2024-04-29T16:59:34.315972245+02:00 [INFO][SMF][Charging] Handle SendConvergedChargingRequest
2024-04-29T16:59:34.316362270+02:00 [ERRO][SMF][GIN] [Debugging] panic:
POST /nsmf-pdusession/v1/sm-contexts HTTP/2.0
(...callstack...)
2024-04-29T16:59:34.316934563+02:00 [INFO][SMF][GIN] | 500 | 127.0.0.1 | POST | /nsmf-pdusession/v1/sm-contexts |

AMF:
2024-04-29T16:59:34.192410890+02:00 [DEBU][AMF][Gmm][amf_ue_ngap_id:RU:1,AU:1(3GPP)][supi:SUPI:imsi-208930000000001] Search SMF from NRF[http://192.168.16.130:8000]
2024-04-29T16:59:34.256748241+02:00 [ERRO][AMF][Gmm][amf_ue_ngap_id:RU:1,AU:1(3GPP)][supi:SUPI:imsi-208930000000001] CreateSmContextRequest Error: undefined response type

This only happens with an external CHF; when the Free5gc CHF is used, everything works fine.

Apart from the NRF registration request and the watchdog requests, I see no other communication between the external CHF and Free5gc, so I think the issue is not related to the external CHF's response format or behaviour.

To Reproduce

  1. start the Free5gc modules, except for the Free5gc CHF (these are started on VM1: 192.168.16.130)
  2. start the external CHF (started on VM2: 192.168.16.129). The CHF is successfully registered in the NRF
  3. start the gNB (UERANSIM on VM2)
  4. start the UE (UERANSIM on VM2) - this step causes the described internal errors on the F5gc side.

Environment (please complete the following information):

  • free5GC Version: v3.4.1
  • OS: Ubuntu 22.04.4 LTS (VM)
  • Kernel version: 6.5.0-28-generic
  • go version: go1.18.1 linux/amd64
  • c compiler version (Option): gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

Trace File

Configuration File

config_e500.zip

PCAP File

trace.zip

Log File

amf.log
smf.log

@andy89923
Collaborator

Could you provide which CHF you are using?

@losdrugi
Author

It is a commercial CHF from Comarch (https://www.comarch.com/telecommunications/bss-solutions/convergent-billing-system/), so it is not easily available from outside (but if really needed, I could try to expose it to an external network).

@losdrugi
Author

losdrugi commented Apr 30, 2024

I just found a valid test case that does not require any 3rd party software :-) You just need to run the Free5gc CHF on a separate host (VM).

Steps to reproduce:

  1. Start the Free5gc modules on VM1, except for CHF and Webconsole
  2. Configure Webconsole and CHF on VM2 - the only configuration change is to point the NRF address at NRF@VM1
  3. Start Webconsole and set the same SIM configuration as on VM1
  4. Start CHF@VM2 => this step throws an error on SMF@VM1:

2024-04-30T11:08:59.151342404+02:00 [INFO][SMF][PduSess] CHF Selection for SMContext SUPI[imsi-208930000000001] PDUSessionID[2]
2024-04-30T11:08:59.155172041+02:00 [INFO][SMF][Charging] Handle SendConvergedChargingRequest
2024-04-30T11:08:59.155463327+02:00 [ERRO][SMF][GIN] [Debugging] panic:
POST /nsmf-pdusession/v1/sm-contexts HTTP/2.0
Host: 127.0.0.2:8000
Accept: application/json
Accept-Encoding: gzip
Content-Length: 804
Content-Type: multipart/related; boundary="ed717142afaec39c3c12a321413b55e3fb820a2f64b850724f833917efb1"
User-Agent: OpenAPI-Generator/1.0.0/go

runtime error: invalid memory address or nil pointer dereference
goroutine 60 [running]:
runtime/debug.Stack()
/usr/lib/go-1.18/src/runtime/debug/stack.go:24 +0x65
github.com/free5gc/util/logger.ginRecover.func1.1()
/home/f5gc/go/pkg/mod/github.com/free5gc/util@v1.0.6/logger/logger.go:284 +0x1a5
panic({0xb4de20, 0x11ac2f0})
/usr/lib/go-1.18/src/runtime/panic.go:838 +0x207
github.com/free5gc/smf/internal/sbi/consumer.SendConvergedChargingRequest(0xc00053b600, 0x0, {0x0, 0x0, 0x0})
/home/f5gc/free5gc/NFs/smf/internal/sbi/consumer/converged_charging.go:94 +0x4e6
github.com/free5gc/smf/internal/sbi/producer.CreateChargingSession(0xc00053b600?)
/home/f5gc/free5gc/NFs/smf/internal/sbi/producer/charging_trigger.go:16 +0x26
github.com/free5gc/smf/internal/sbi/producer.HandlePDUSessionSMContextCreate(0x0, {0xc00053fb00, {0xc000129800, 0x15, 0x3e8}})
/home/f5gc/free5gc/NFs/smf/internal/sbi/producer/pdu_session.go:197 +0x1427

@andy89923
Collaborator

andy89923 commented Apr 30, 2024

  1. Configure Webconsole and CHF on VM2 - the only configuration change is to set NRF address pointing NRF@VM1

Another thing that has to be configured is MongoDB. You have to change the MongoDB URL in chfcfg.yaml so that it points to the MongoDB server on your VM1.

configuration:
  chfName: CHF # the name of this CHF
  sbi: # Service-based interface information
  nrfUri: http://192.168.16.130:8000 # a valid URI of NRF
  mongodb: # the mongodb connected by this CHF
    name: free5gc # name of the mongodb
    url: mongodb://192.168.16.130:27017 # a valid URL of the mongodb

I am not yet sure whether this will solve the problem, but I will get back to you ASAP.

@losdrugi
Author

losdrugi commented Apr 30, 2024

On VM2 (with the CHF) I used a separate, local MongoDB (that's why there is a local IP for MongoDB in my CHF config). I assumed that the CHF does not need to share any data through the DB (SBI should be the only point of communication between the CHF and the SMF).

I started webconsole on VM2 to prepare the same set of data (SIM information) as I have at VM1.

This simulates my real-life scenario quite well, where a CHF from a different vendor uses only SBI to talk to the NRF and the SMF.

@andy89923
Collaborator

In our charging design, the ratingGroup is allocated to each PDU Session in the PCF and written back to MongoDB.
The CHF then uses the ratingGroup from the DB to calculate the usage and do some further work.

I am checking the spec to see whether this procedure conforms to it.

@losdrugi
Author

losdrugi commented May 2, 2024

In my CHF, rating groups must be configured directly. When we integrate the CHF with a customer network, we must agree on the rating group IDs that will be provided by the network and prepare the configuration on our side (e.g. different charging rules for each RG). In the same way, we need to provision the SIM data for each SIM card registered in the customer network directly in the CHF. If the Free5gc CHF shares a database with other network components, that is a problem in real-life scenarios, where the network uses components from different vendors.

So for the target solution I think there should be a place in the Free5GC configuration (GUI) where you can directly define the rating group configuration (or at least display the RGs automatically configured by F5GC). In that case I'd configure the same RG IDs in the CHF, with a dedicated charging configuration for each group.

For now - is there any workaround possible? E.g. can I somehow directly insert those RGs to MongoDB?

@andy89923
Collaborator

The rating group is allocated by the PCF during the PDU Session Establishment procedure, and the PCF writes the rating group to MongoDB. So there is no way to use a rating group of your own from MongoDB unless you change the code in the PCF.

However, your point about "there should be a place to define rating group directly" is correct.
We would consider and discuss how to implement or refactor the mechanism for allocating rating group.

@losdrugi
Author

losdrugi commented May 3, 2024

Can you please describe in more detail how it currently works?

I assumed that it works this way:

  1. When you configure a new SIM in Webconsole, you define network slices and flow rules for this SIM. I think this maps directly to the RG configuration. This data is important for the PCF, to know which RG should be assigned (based on which network slice & IP is used by the SIM), and for the CHF, to know which charging data (online/offline, quota, unit cost) should be used for a given RG. This data is stored in MongoDB
  2. When the SIM connects to the network, the PCF allocates the RG (it uses the MongoDB data for this)
  3. After the RG is allocated, the SMF should ask the CHF for charging (directly through SBI/N40)
  4. The CHF should receive a charging request (SBI/N40) with the information about which RG ID is used by the SIM
  5. The CHF should perform charging (based on data from the CHF-side MongoDB) and reply to the SMF through the same interface

At this moment I don't understand, why SMF crashes before sending any data to external CHF (if only SBI should be used between those components). So can you please clarify how it is implemented currently?

@andy89923
Collaborator

andy89923 commented May 6, 2024

Sorry for the late reply 🙇

I looked closely at the logs you provided in the first message and didn't see the CHF register with the NRF.
So the SMF can't find the CHF in the NRF, which causes the panic. (That is an SMF bug: it doesn't check whether the CHF is nil.)
Could you provide the external CHF config or double-check that the CHF from the external is working correctly?
Any logs/configs from our core (not only AMF/SMF) and external CHF would be very helpful.
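
A minimal sketch of the kind of guard that is missing, assuming the panic comes from dereferencing a nil CHF-related pointer as described above. `SMContext`, `ChargingClient`, `ErrNoChargingClient` and `sendChargingRequest` are hypothetical stand-ins for illustration, not the real free5GC SMF types or functions:

```go
package main

import (
	"errors"
	"fmt"
)

// ChargingClient and SMContext are hypothetical stand-ins for the SMF types;
// they are not the real free5GC definitions.
type ChargingClient struct{ BasePath string }

type SMContext struct {
	Supi           string
	ChargingClient *ChargingClient // nil when no usable CHF was selected
}

// ErrNoChargingClient is returned when no CHF client is available.
var ErrNoChargingClient = errors.New("no charging client: CHF not selected or profile incomplete")

// sendChargingRequest returns an error instead of dereferencing a nil client.
func sendChargingRequest(sm *SMContext) error {
	if sm == nil || sm.ChargingClient == nil {
		return ErrNoChargingClient
	}
	// A real implementation would issue the converged charging request here.
	fmt.Printf("charging request for %s via %s\n", sm.Supi, sm.ChargingClient.BasePath)
	return nil
}

func main() {
	// No CHF was selected for this context, so the call fails cleanly.
	err := sendChargingRequest(&SMContext{Supi: "imsi-208930000000001"})
	fmt.Println("error:", err)
}
```

With a check like this the caller could reject the PDU session with a proper error response rather than letting the HTTP handler recover from a panic.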

@losdrugi
Author

losdrugi commented May 6, 2024

Please check the trace I attached, filtering for HTTP on port 8000. It is a misconfiguration on our side that our CHF uses HTTP instead of HTTP/2 for NRF registration. Apart from that, the communication looks correct. The CHF issues a bootstrap message, which is not recognized by your NRF because it comes from Rel 17 (not supported on your side yet), but it then starts the normal registration, which completes successfully. We also set an annoying 1s watchdog, which the NRF also handles correctly, so afterwards there are PATCH watchdog requests every second ;-). See the attached screenshot:

image

But as I wrote in my post (Apr 30), the same issue can be reproduced with the Free5gc CHF installed on a separate VM (along with a second instance of Webconsole, to create the same SIM config in MongoDB).

@andy89923
Collaborator

Could you provide the CHF config?
The one you use in VM2 and our CHF.

@losdrugi
Author

losdrugi commented May 6, 2024

chfcfg.zip

Nothing interesting there - only SBI and NRF IP addresses changed.

@andy89923
Collaborator

I can't find this packet. Is it from another pcap file?

image

@losdrugi
Author

losdrugi commented May 6, 2024

Those packets are filtered directly from the trace.zip that was attached to my first post here. Maybe you need to change the decoding manually in Wireshark:

image

@andy89923
Collaborator

Thanks for the reminder. I forgot to adjust the decode setting in Wireshark.

I discovered that the NFProfile from the external CHF doesn't contain apiPrefix in nfServices.
chf
And this is the sample NfProfile (PCF):
pcf

The SMF uses apiPrefix to construct the client:
https://github.com/free5gc/smf/blob/bd173ab08579b3a060f619b0b300114e2f46dce8/internal/context/sm_context.go#L516C1-L516C2

	// Create Converged Charging Client for this SM Context
	for _, service := range *smContext.SelectedCHFProfile.NfServices {
		if service.ServiceName == models.ServiceName_NCHF_CONVERGEDCHARGING {
			ConvergedChargingConf := Nchf_ConvergedCharging.NewConfiguration()
			ConvergedChargingConf.SetBasePath(service.ApiPrefix)
			smContext.ChargingClient = Nchf_ConvergedCharging.NewAPIClient(ConvergedChargingConf)
		}
	}

This causes the SMF to panic when it connects to the external CHF.
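
One possible way to tolerate a profile without apiPrefix is sketched below. It assumes that apiPrefix carries the full base URL (the quoted snippet passes it straight to SetBasePath) and that a base URL can otherwise be built from the service's scheme and first IP endpoint. `NFService`, `IPEndPoint` and `chargingBasePath` are simplified, hypothetical types and names, not the free5GC openapi models, and the address and port in `main` are example values only:

```go
package main

import (
	"errors"
	"fmt"
)

// IPEndPoint and NFService are simplified, hypothetical mirrors of the
// service fields in an NF profile; they are not the free5GC openapi models.
type IPEndPoint struct {
	IPv4Address string
	Port        int
}

type NFService struct {
	ServiceName string
	Scheme      string
	ApiPrefix   string
	IPEndPoints []IPEndPoint
}

// chargingBasePath picks the base URL for the converged charging service.
// It prefers apiPrefix and falls back to scheme plus the first IP endpoint.
func chargingBasePath(services []NFService) (string, error) {
	for _, s := range services {
		if s.ServiceName != "nchf-convergedcharging" {
			continue
		}
		if s.ApiPrefix != "" {
			return s.ApiPrefix, nil
		}
		if s.Scheme != "" && len(s.IPEndPoints) > 0 && s.IPEndPoints[0].IPv4Address != "" {
			ep := s.IPEndPoints[0]
			return fmt.Sprintf("%s://%s:%d", s.Scheme, ep.IPv4Address, ep.Port), nil
		}
		return "", errors.New("converged charging service has no usable address")
	}
	return "", errors.New("no converged charging service in CHF profile")
}

func main() {
	// Profile without apiPrefix, as registered by the external CHF.
	services := []NFService{{
		ServiceName: "nchf-convergedcharging",
		Scheme:      "http",
		IPEndPoints: []IPEndPoint{{IPv4Address: "192.168.16.129", Port: 8000}},
	}}
	base, err := chargingBasePath(services)
	fmt.Println(base, err)
}
```

Returning an error when no address can be derived lets the caller fail the session setup explicitly instead of continuing with an unusable client.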

@andy89923
Collaborator

I haven't yet tried to reproduce this with our CHF on another VM to see whether it leads to the same problem you mentioned.
Do you have the PCAP file for this scenario?

@losdrugi
Author

losdrugi commented May 6, 2024

Ok, I'll check this apiPrefix. But this does not explain crash with Free5gc CHF. I'll try to reproduce it once more with your CHF and attach all logs/traces.

@losdrugi
Author

losdrugi commented May 6, 2024

Here are the logs and traces from the test with the Free5gc CHF running on the second VM:

20240506.zip

I see that apiPrefix is set correctly by CHF:

image

But there is still the same crash on SMF side (see attached logs).

Tomorrow we'll try to add apiPrefix to our CHF, but it seems like this does not solve the issue...

@andy89923
Collaborator

I tried running our CHF on another VM, and everything works normally without any panic.
Here is my CHF config:

info:
  version: 1.0.3
  description: CHF initial local configuration

configuration:
  chfName: CHF # the name of this CHF
  sbi: # Service-based interface information
    scheme: http # the protocol for sbi (http or https)
    registerIPv4: <VM2> # IP used to register to NRF
    bindingIPv4: <VM2> # IP used to bind the service
    port: 8000 # port used to bind the service
    tls: # the local path of TLS key
      pem: cert/chf.pem # CHF TLS Certificate
      key: cert/chf.key # CHF TLS Private key
  nrfUri: http://<VM1>:8000 # a valid URI of NRF
  nrfCertPem: cert/nrf.pem # NRF Certificate
  serviceNameList: # the SBI services provided by this CHF, refer to TS 32.291
    - nchf-convergedcharging # Nchf_ConvergedCharging service
  mongodb: # the mongodb connected by this CHF
    name: free5gc # name of the mongodb
    url: mongodb://<VM1>:27017 # a valid URL of the mongodb
  quotaValidityTime: 10000
  volumeLimit: 50000
  volumeLimitPDU: 10000
  volumeThresholdRate: 0.8
  cgf:
    hostIPv4: <VM1>
    port: 2122
    listenPort: 2121
    tls:
      pem: cert/chf.pem
      key: cert/chf.key
    cdrFilePath: /tmp
  abmfDiameter:
    protocol: tcp
    hostIPv4: 127.0.0.113
    port: 3868
    tls:
      pem: cert/chf.pem
      key: cert/chf.key
  rfDiameter:
    protocol: tcp
    hostIPv4: 127.0.0.113
    port: 3869
    tls:
      pem: cert/chf.pem # CHF TLS Certificate
      key: cert/chf.key # CHF TLS Private key
logger: # log output setting
  enable: true # true or false
  level: info # how detailed to output, value: trace, debug, info, warn, error, fatal, panic
  reportCaller: false # enable the caller report or not, value: true or false

VM1 runs all NFs (without CHF) and MongoDB.
VM2 runs CHF only.
Screenshot 2024-05-07 11:39:24 AM

@andy89923
Collaborator

Could you provide all the config files of the free5GC core?
Also, I'm interested in the external CHF. Could you share anything, such as a config file, either here or via the email on my profile? Thanks. 🙏

@andy89923 added the bug label (Something isn't working) on May 7, 2024
@losdrugi
Author

losdrugi commented May 7, 2024

Looks like there is some progress - after we added apiPrefix to the NRF registration, we now get a charging request from F5gc. Our CHF currently responds with Bad Request, but I don't know why yet (maybe some misconfiguration on our side). I'll notify you as soon as I find something.

Regarding your CHF running on a different VM - at first I had some crashes (as previously described), but then I dropped MongoDB and cleared/restarted everything, and that helped. So I can confirm that the CHF can run in this configuration without any issues.

@andy89923
Collaborator

We've published the PCF (charging-related) document here.
I believe this could be useful for you and your team as a reference.

@losdrugi
Author

losdrugi commented May 8, 2024

Thank you, I'll check. But my latest tests look very promising. I exchanged first CHF message between F5gc and Comarch CHF. There are still some errors, but it looks like configuration issues on our side, so I hope it'll be working soon. Definitely I'll share the final results with you :-)

@losdrugi
Author

losdrugi commented May 9, 2024

Ok, I'm glad to tell you that we almost have success. I managed to configure our CHF to exchange messages with Free5GC. At first I did not add the test IMSI on the CHF side, so it correctly reported USER_UNKNOWN to F5gc. Then I added the IMSI to the CHF, and it replied with an error saying that the Rating Groups are unknown. So it looks like everything is ok at the communication level, and the real issue is with the Rating Groups.

So I need to configure the Rating Groups directly - either on the F5gc side or on my CHF side. I know that at the moment I cannot do this directly on the F5gc side, but is there any way to check which RG IDs will be used in communication with the CHF (and what the meaning of each RG is)? I see that RGs 13 and 15 were used for online charging and 11 and 12 for offline charging - but are they generated dynamically somehow, or can I stick to those IDs (and configure them on my side)?

And a second (less important) question. Currently I have registered only the Converged Charging Service in the NRF, and all communication goes through this endpoint (for both online and offline traffic). But according to the specification, there is also an Offline Only Service for the CHF. Is it also supported by Free5gc? What would happen if I also registered it in the NRF?

@andy89923
Collaborator

Is there any way to check, which RG IDs will be used in communication with CHF

  • The rating group would be allocated in PCF, as you can see in this article. So, the rating group could be retrieved from MongoDB if you want.
  • If you need to configure Rating Groups directly, I think you have to adjust the PCF so that it does not allocate a rating group but just uses the rating group from MongoDB.

there is also Offline Only Service

  • free5GC currently doesn't use any service from the Offline Only Service in the CHF, so nothing would change if you put this service in the NFProfile.

@losdrugi
Author

losdrugi commented May 16, 2024

I checked how Rating Groups are allocated in the PCF, and I see a big issue with it when thinking in terms of an external rating/charging configuration. By definition, a Rating Group should gather a set of services that have the same cost/rating type/rating rules. So I'd expect that whenever I add a new IMSI to Free5gc with the same configuration as others already defined, it should share the same Rating Group ID. But instead, I see that each new SIM gets a new set of Rating Groups.

E.g. I defined three SIMs, two with exactly the same configuration and one with an additional Flow Rule. Those SIMs do not use the same RG IDs for the same configs, but new RG IDs were created for each SIM.

With this approach it is not possible to configure any rating/charging rules per RG in the external Convergent Charging System (in target system I expect to have a lot of SIMs, but only a few RGs to be configured).

Would it be possible to have reusable/shared/configurable Rating Groups in upcoming Free5gc releases? For now this is a blocker for me to move forward with any integration... :-(

image

@andy89923 added the enhancement label (New feature or request) on May 17, 2024
@andy89923
Collaborator

Thanks for providing the information.
We plan to have configurable Rating Groups in a few months.

The next release will have some bug fixes and a minor enhancement, but will not contain the configurable Rating Groups.

--

By definition, Rating Group should gather a set of services that have the same cost/rating type/rating rules.

Could you provide more details about what the rating group should share in your cases?

@losdrugi
Author

On my side I only need to know the Rating Group ID.

In real-life scenarios each customer (network provider) has only a few Rating Groups configured (e.g. 2 - 10), depending on the customer's specific business needs. Each RG corresponds to a traffic classification that needs to be processed separately in rating/charging.

Some Rating Group examples used by our customers:

  • Free traffic from the Selfcare Portal (the end customer is not charged for using the Selfcare Portal when they want to top up a balance or buy add-ons)
  • Social or multimedia traffic (e.g. Facebook, Youtube, Spotify - this traffic can be rated with some special price, or be free of charge)
  • VoIP traffic
  • M2M traffic
  • test data
  • etc.

So in Free5gc I think a good place for Rating Group classification would be the SIM configuration panel, in the network slice / flow rule context. E.g. I could configure 1000 SIMs, each with 3 Flow Rules (the same flow definition for each SIM) and, for example, 4 Rating Groups (1-3 related to traffic from each flow rule, and 4 for any other traffic). In this scenario I could prepare 4 charging configurations in the CHF: e.g. traffic from RG 1 means that the UE connected to the 8.8.8.8/32 address range (flow rule), and this traffic should be free of charge, whilst another RG could be extra expensive or have limits on the amount of data that can be used.
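
The idea of sharing Rating Groups across SIMs with identical flow rules could be sketched roughly as follows. This is only an illustration under stated assumptions: `FlowRule`, `RatingGroupAllocator` and the example slice, DNN and filter values are hypothetical and do not reflect how the free5GC PCF actually allocates rating groups.

```go
package main

import "fmt"

// FlowRule is a hypothetical, simplified flow rule holding only the fields
// that define how traffic is classified.
type FlowRule struct {
	Snssai string
	Dnn    string
	Filter string // e.g. "8.8.8.8/32"
}

// RatingGroupAllocator hands out one rating group ID per distinct flow rule,
// so SIMs with identical rules share the same IDs.
// It is not safe for concurrent use; a real allocator would need locking
// and persistent storage.
type RatingGroupAllocator struct {
	next   int32
	byRule map[FlowRule]int32
}

func NewRatingGroupAllocator() *RatingGroupAllocator {
	return &RatingGroupAllocator{next: 1, byRule: make(map[FlowRule]int32)}
}

// RatingGroupFor returns the existing ID for the rule or allocates a new one.
func (a *RatingGroupAllocator) RatingGroupFor(r FlowRule) int32 {
	if id, ok := a.byRule[r]; ok {
		return id
	}
	id := a.next
	a.next++
	a.byRule[r] = id
	return id
}

func main() {
	alloc := NewRatingGroupAllocator()
	ruleA := FlowRule{Snssai: "01010203", Dnn: "internet", Filter: "8.8.8.8/32"}
	ruleB := FlowRule{Snssai: "01010203", Dnn: "internet", Filter: "1.1.1.1/32"}

	fmt.Println(alloc.RatingGroupFor(ruleA)) // first SIM, rule A
	fmt.Println(alloc.RatingGroupFor(ruleB)) // first SIM, rule B
	fmt.Println(alloc.RatingGroupFor(ruleA)) // second SIM, rule A again: same ID as the first SIM
}
```

Keying the allocation on the rule definition rather than on the SIM keeps the number of Rating Groups proportional to the number of distinct traffic classes, which is what an external CHF needs in order to configure charging per RG.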

@andy89923
Collaborator

We will discuss how to adjust the charging function in free5GC for future releases.

If we encounter any issues or have questions, we will post them here.
Your assistance would be greatly appreciated if you have the time to help us.🙏

Additionally, if you have any more questions, we are happy to discuss them with you.

@losdrugi
Author

Sure, if you have any charging-related questions, you can post them here, write to me on LinkedIn, or we can even arrange a conference call. I'm still interested in integrating Comarch OCS/CHF with Free5gc, as I'd like to use Free5gc in our official 5G Lab (https://www.comarch.com/telecommunications/5g-lab). For now, the Rating Group configuration is a blocker, but it looks like we can solve it together.

BTW, I tried a simple workaround and manually updated the RG IDs in policyData.ues.chargingData after they were created by the PCF, but I see that with each connection attempt those IDs are dynamically overwritten with new values (I had hoped that the PCF creates those IDs only once, so that I could then update them and use my expected values). So for now I see no valid workaround for this issue, and I must wait for changes on your side...
