[defect]: Freeradius service crashes when use dhcp module #5316

Open
elchin9610 opened this issue Apr 22, 2024 · 4 comments
Labels
defect category: a defect or misbehaviour

Comments

@elchin9610

What type of defect/bug is this?

Crash or memory corruption (segv, abort, etc...)

How can the issue be reproduced?

I use the FreeRADIUS DHCP module for my clients, and the service crashes several times a day.

OS=Debian10
freeradius=3.0.26
database=oracle
clients=80k

At first, I received no error logs, so I thought the cause was either the database or the driver. Changing from Oracle to PostgreSQL didn't solve the problem.

I then upgraded from Debian 10 to Debian 12, which also didn't solve the problem.

When I run radiusd -X > /tmp/test.log there are no crashes. As I understand it, this is because debug mode runs the server single-threaded.

Finally, I rebuilt with the options --with-experimental-modules --enable-developer and got this log message: Error: ASSERT FAILED src/main/threads.c[679]: request->magic == REQUEST_MAGIC

My Config

cat radiusd.conf

thread pool {
    start_servers = 50
    max_servers = 200
    min_spare_servers = 5
    max_spare_servers = 15
}

cat mods-enabled/sql

sql {

    radius_db = "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1518))(CONNECT_DATA=(SERVICE_NAME=RADIUS)))"

    pool {
        start = ${thread[pool].start_servers}
        min = ${thread[pool].min_spare_servers}
        max = ${thread[pool].max_servers}
        spare = ${thread[pool].max_spare_servers}
        uses = 0
        retry_delay = 30
        lifetime = 0
        idle_timeout = 60
    }

}

cat sites-enabled/dhcp

dhcp DHCP-Discover {

    update reply {
        &DHCP-Message-Type = DHCP-Offer
        &DHCP-Domain-Name-Server = 8.8.8.8
        &DHCP-Domain-Name-Server += 8.8.4.4
        &DHCP-Subnet-Mask = 255.255.255.0
        &DHCP-IP-Address-Lease-Time = 1800
        &DHCP-DHCP-Server-Identifier = "192.168.29.10"

        DHCP-Server-Host-Name = "DHCP"
        &DHCP-Domain-Name = "DHCP"
        &NAS-IP-Address = "%{DHCP-Gateway-IP-Address}"

        &DHCP-Your-IP-Address = "%{sql:select dhcp.get_available_ip('%{string:DHCP-Relay-Circuit-Id}', '%{DHCP-Client-Hardware-Address}') from dual}"
        &DHCP-Router-Address = `${raddbdir}/bin/get_dhcp_client_gw %{reply:DHCP-Your-IP-Address}`

    }
}

dhcp DHCP-Request {

    update reply {
        &DHCP-Message-Type = DHCP-Ack
        &DHCP-Domain-Name-Server = 8.8.8.8
        &DHCP-Domain-Name-Server += 8.8.4.4
        &DHCP-IP-Address-Lease-Time = 1800
        &DHCP-Subnet-Mask = 255.255.255.0
        &DHCP-DHCP-Server-Identifier = "192.168.29.10"
        DHCP-Server-Host-Name = "DHCP"
        &DHCP-Domain-Name = "DHCP"
        &Calling-Station-Id = "%{DHCP-Client-Hardware-Address}"

        &DHCP-Router-Address = "%{sql:select NCNRADIUS.DHCP_REQUEST('%{string:DHCP-Relay-Circuit-Id}') from dual}"
        &DHCP-Router-Address = `${raddbdir}/bin/get_dhcp_client_gw %{reply:DHCP-Your-IP-Address}`

    }

}

Log output from the FreeRADIUS daemon

Mon Apr 22 13:21:21 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:21:53 2024 : Error: Received conflicting packet from client dhcp port 54289 - ID: 0 due to unfinished request in module .  Giving up on old request.
Mon Apr 22 13:22:31 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:22:32 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:23:03 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:23:05 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:23:09 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:23:09 2024 : Error: ASSERT FAILED src/main/threads.c[679]: request->magic == REQUEST_MAGIC
CAUGHT SIGNAL: Aborted
Backtrace of last 6 frames:
/opt/radius_3_0_26_multi_sql/lib/libfreeradius-radius.so(fr_fault+0x11d)[0x7fa74b7e1d53]
/opt/radius_3_0_26_multi_sql/lib/libfreeradius-server.so(rad_assert_fail+0x49)[0x7fa74b845429]
/opt/radius_3_0_26_multi_sql/sbin/radiusd(+0x43606)[0x5595ca54b606]
/opt/radius_3_0_26_multi_sql/sbin/radiusd(+0x438b6)[0x5595ca54b8b6]
/lib/x86_64-linux-gnu/libc.so.6(+0x89134)[0x7fa74b0a8134]
/lib/x86_64-linux-gnu/libc.so.6(+0x1097dc)[0x7fa74b1287dc]
No panic action set

Relevant log output from client utilities

No response

Backtrace from LLDB or GDB

No response

@elchin9610 elchin9610 added the defect category: a defect or misbehaviour label Apr 22, 2024
@alandekok
Member

Mon Apr 22 13:21:21 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.
Mon Apr 22 13:21:53 2024 : Error: Received conflicting packet from client dhcp port 54289 - ID: 0 due to unfinished request in module .  Giving up on old request.
Mon Apr 22 13:22:31 2024 : Error: Received conflicting packet from client dhcp port 67 - ID: 0 due to unfinished request in module <queue>.  Giving up on old request.

This isn't a DHCP issue. The back-end is slow, and isn't responding to FreeRADIUS. But the client keeps sending DHCP packets, and eventually the server gets into a bad state, and crashes.

The simple fix is to ensure that the back-end is actually replying to the server.

A longer fix is to read the guidelines in the GitHub issue template, and send over a full gdb backtrace. We can then try to see what's going on.

But no amount of bug fixing in FreeRADIUS will make your SQL server faster. The first thing you should do is to find out why the SQL server is slow, and fix that.
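
For readers following along: the "No panic action set" line in the log above means the server had no command configured to run when it aborted. One way to capture a backtrace automatically is the panic_action setting at the top level of radiusd.conf. The fragment below is a sketch modelled on the commented-out example that stock 3.0.x configurations usually ship. It assumes gdb is installed and that a panic.gdb script exists in the raddb directory, so check your own radiusd.conf and paths before relying on it.

```
#  radiusd.conf (top level, outside any section)
#
#  Run gdb against the dying process and save its output.
#  %e expands to the radiusd binary, %p to its PID.
#  Assumption: ${raddbdir}/panic.gdb exists, as in stock installs.
panic_action = "gdb -silent -x ${raddbdir}/panic.gdb %e %p 2>&1 | tee ${logdir}/gdb-${name}-%p.log"
```

With this in place, the next assertion failure should leave a gdb log in the log directory that can be attached to the issue, provided the binary was built with debugging symbols.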

@elchin9610
Author

Thanks for your reply. To reduce the load on SQL, I cache results with memcached, but the result is the same.
I'll try to get a gdb backtrace.

@alandekok
Member

If memcache doesn't help, then the issue is elsewhere.

The default configuration has no issue replying to 10K packets/s. If your local configuration can't handle that, then something in your local configuration is breaking the server.

You will have to go through the configuration to see what has been changed from the defaults, and then find out which piece is taking a long time.

We can take a look at fixing the crash, but it won't have any meaningful change for your system. FreeRADIUS will still be blocked, and will be unable to respond to packets.

i.e. in practice, the difference between "down due to crash" and "down due to something blocking the server" is essentially zero.

Fix the underlying problem. Which is not the crash.

@elchin9610
Author

Hello, this problem arose because I assigned the address with an inline SQL expansion:
&DHCP-Your-IP-Address = "%{sql:select dhcp.get_available_ip('%{string:DHCP-Relay-Circuit-Id}', '%{DHCP-Client-Hardware-Address}') from dual}"
Tonight I'll switch to dhcp_sqlippool. Many thanks for the help!
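
For anyone landing here with the same symptom, this is a minimal sketch of what the switch might look like in sites-enabled/dhcp. It is not the reporter's final configuration. It assumes the dhcp_sqlippool module has been enabled in mods-enabled and that its pool table has been populated; the pool name "local" is a placeholder.

```
dhcp DHCP-Discover {

    update reply {
        &DHCP-Message-Type = DHCP-Offer
    }

    #  Assumption: a pool named "local" exists in the SQL IP pool
    #  table. Substitute your own pool name.
    update control {
        &Pool-Name := "local"
    }

    #  Let the module allocate DHCP-Your-IP-Address from the pool,
    #  in place of the inline %{sql:...} expansion shown above.
    dhcp_sqlippool

    ok
}
```

The DHCP-Request section would call the same module so that the lease is confirmed or renewed rather than looked up again with a separate query.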
