List of Bug Reports for the mt7921au chipset / mt7921u driver... #107

Open
morrownr opened this issue Sep 6, 2022 · 247 comments


@morrownr
Owner

morrownr commented Sep 6, 2022

This issue is for maintaining a list of problematic issues that need work. This list will be maintained and updated in this first post by @morrownr . Please add posts to this issue as you have updated information for the existing BUGs in the list or if you have information about a new BUG. Thank you.

Dear Mediatek devs... help is appreciated.


Bug: (2024-04-18) See: #392. WDS/4addr is not supported in AP mode. First reported with an Alfa AXML adapter (mt7921au chipset, mt7921u driver). The OP is unable to use WDS/4addr in AP mode.

Status: Open

Info: It was reported that this capability does work with an adapter that uses the mt7612u chipset/driver.


Bug: (2024-03-26) See: #378. WiFi adapter not showing up. First reported with an Alfa AXML adapter (mt7921au chipset, mt7921u driver). The adapter is non-functional until the workaround below is applied.

Status: Open

Workaround: run modprobe -r btusb first, then plug in the USB WiFi adapter.

More input is needed. Is this a problem with btusb?
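
The workaround above can be scripted. This is a minimal sketch, not a fix: `module_loaded` and `unload_btusb_before_plugging` are hypothetical helper names, and the check simply looks for btusb in /proc/modules-style text so the unload step only runs when it is needed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if module "$1" is listed in
# /proc/modules-style text read from stdin (module name is field 1).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Workaround from this report: unload btusb, then plug in the adapter.
# Wrapped in a function so nothing runs until it is called.
unload_btusb_before_plugging() {
    if module_loaded btusb < /proc/modules; then
        sudo modprobe -r btusb || return 1
    fi
    echo "btusb is not loaded; plug in the USB WiFi adapter now."
}
```

Call `unload_btusb_before_plugging` before connecting the adapter. If the adapter then shows up reliably, that is one more data point suggesting btusb is involved.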


Bug: (2023-12-22) Many Linux distros are detecting Bluetooth capability in mt7921au based adapters but none of the adapters on the market have Bluetooth turned on so it won't work. Linux should not be detecting Bluetooth capability when it is actually not available.

Status: Open and ongoing

Here is a link to a location where you can get a copy of the Intel White Paper that explains the details of why USB3 capable WiFi adapters should not have Bluetooth capability turned on:

https://www.usb.org/document-library/usb-30-radio-frequency-interference-impact-24-ghz-wireless-devices

USB3 WiFi adapters should not have Bluetooth turned on as the USB3 will cause interference with Bluetooth. If makers decide they really want Bluetooth capability in an adapter then they need to limit wifi to USB2 capability. All adapters with the mt7921au chipset that I am aware of have Bluetooth turned off so WiFi can operate in USB3 mode. However, there is a bug in that Bluetooth capability is still being detected by Linux distros and the driver/firmware is loading. Systems act like Bluetooth is available but when you try to use the Bluetooth, it won't work. It is not clear to me how this can be fixed but it really does need to be fixed.

This is not a problem with PCIe cards. I have a mt7922 based PCIe card. WiFi and Bluetooth work well together because WiFi uses the PCIe bus and not USB. Please understand that the issue in this bug is not exclusive to this chipset. This is an issue with all USB WiFi adapters. Over the years, adapters that have both USB WiFi and BT capability have limited USB to USB2 to avoid the problem of interference.


Bug: (2023-12-07) Active monitor mode breaks driver.

Status: open

Reporter: @ZerBea
Link: openwrt/mt76#839
Problem: Using Active Monitor mode breaks the driver

Driver reports that active monitor mode is possible:

$ iw list | grep active
Device supports active monitor (which will ACK incoming frames)

But if hcxdumptool sets active monitor mode, it stops working.

If active monitor mode is disabled, everything is fine:

0 ERROR(s) during runtime
638 Packet(s) captured by kernel
0 Packet(s) dropped by kernel
1 SHB written to pcapng dumpfile
1 IDB written to pcapng dumpfile
1 ECB written to pcapng dumpfile
83 EPB written to pcapng dumpfile

exit on sigterm
I don't think the problem is related to hcxdumptool, because it can be reproduced with iw, ip link and tshark, too:

$ sudo ip link set wlp22s0f0u4i3 down
$ sudo iw dev wlp22s0f0u4i3 set type monitor
$ sudo ip link set wlp22s0f0u4i3 up
$ tshark -i wlp22s0f0u4i3
22 packets captured

$ sudo ip link set wlp22s0f0u4i3 down
$ sudo iw dev wlp22s0f0u4i3 set monitor active
$ sudo ip link set wlp22s0f0u4i3 up
$ tshark -i wlp22s0f0u4i3
Capturing on 'wlp22s0f0u4i3'
^C
0 packets captured

Background:
In active monitor mode, the device ACKs incoming frames addressed to the virtual MAC of the device.
This feature is really useful to perform PMKID attacks.
At the moment, active monitor mode is working on:

mt76x0u
mt76x2u

It is not working on:

mt7601u
mt7921u

I see two options:
active monitor mode should be fixed, or
active monitor mode capability should not be reported by the driver

mt7601u
$ iw list | grep active
Device supports active monitor (which will ACK incoming frames)

mt7921u
$ iw list | grep active
Device supports active monitor (which will ACK incoming frames)
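
To compare what the driver advertises with what a capture actually sees, the two checks in this report can be wrapped in small helpers. This is only a sketch: `advertises_active_monitor` and `captured_count` are hypothetical names, and both simply parse the `iw list` and tshark text shown above.

```shell
#!/bin/sh
# Hypothetical helpers; both read text from stdin.

# Succeeds if `iw list` output advertises active monitor support.
advertises_active_monitor() {
    grep -q 'Device supports active monitor'
}

# Prints N from a tshark summary line such as "22 packets captured".
captured_count() {
    sed -n 's/^\([0-9][0-9]*\) packets\{0,1\} captured.*/\1/p'
}
```

For example, `iw list | advertises_active_monitor && echo advertised` checks the capability, and piping a saved tshark summary into `captured_count` gives the packet count. A driver that advertises the capability but yields a count of 0 once active monitor is set matches the behaviour reported here.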


Bug: LED does not function in several of the usb wifi adapters that use the mt7921au chipset.

Status: open, it is unclear what the problem is.

Reported by @morrownr
Confirmed by numerous users.


Bug: AP Mode DFS (5 GHz) support is non-functional
Status: open

Reported by @morrownr
Confirmed by numerous users.

This is really a serious omission in that in many places in the world there are limited non-DFS channels available leading to high levels of congestion.

Dear Mediatek, does your usb chipset competitor support DFS channels in AP Mode? Yes they do. See: out-of-kernel drivers for rtl8812au, rtl8811au, rtl8812bu and rtl8811cu. You need to think about this. Sincerely.


Bug: the txpower reading reported by iw is unusually low (for example, 3 dBm).
Status: open

Reported by several individuals.

This reading must be wrong because actual usage suggests the reading should be much higher.


Bug: (feature request) mt7921u driver does not support 2 interfaces of AP mode on one adapter
Status: open

Reported by @whitslack

mt7921u driver does not support 2 instances of AP mode whereas this was common on some drivers for older adapters.

Now:

valid interface combinations:

	 * #{ managed, P2P-client } <= 2, #{ AP, P2P-GO } <= 1,
	   total <= 2, #channels <= 2

What we want:

valid interface combinations:

	 * #{ managed, P2P-client } <= 2, #{ AP, P2P-GO } <= 2,
	   total <= 2, #channels <= 2
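
A quick way to see which limit a given driver reports is to pull the AP count out of the `iw list` interface-combination line. This is a sketch under the assumption that the line looks like the ones quoted above; `ap_limit` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the limit N from the "#{ ... AP ... } <= N"
# group of an `iw list` interface-combination line read from stdin.
ap_limit() {
    sed -n 's/.*#{[^}]*AP[^}]*} <= \([0-9][0-9]*\).*/\1/p'
}
```

Running `iw list | ap_limit` on the current driver should print 1 for the combination shown under "Now"; the requested behaviour would print 2.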

Bug: connection is dropped and the only way to correct the situation is to reboot (AP mode)
Status: open

Testing to see if SG helps performance:

scatter-gather test with mt7921au based adapter

Issue: connection drops and the only resolution is to reboot the system.

Raspberry Pi 4B
RasPiOS 2023-05-03

I changed the module parameter and rebooted between each test so as to alternate on and off.

iperf3 -c 192.168.1.1 -t 300

scatter-gather off (disable_usb_sg=1)

1:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-300.00 sec  19.9 GBytes   569 Mbits/sec    4             sender
[  5]   0.00-300.01 sec  19.9 GBytes   569 Mbits/sec                  receiver

2: 
[  5]   0.00-300.00 sec  19.9 GBytes   570 Mbits/sec    5             sender
[  5]   0.00-300.01 sec  19.9 GBytes   570 Mbits/sec                  receiver

3:
[  5]   0.00-300.00 sec  20.0 GBytes   573 Mbits/sec    2             sender
[  5]   0.00-300.01 sec  20.0 GBytes   573 Mbits/sec                  receiver

scatter-gather on (disable_usb_sg=0)

1:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-300.00 sec  19.9 GBytes   570 Mbits/sec    1             sender
[  5]   0.00-300.01 sec  19.9 GBytes   570 Mbits/sec                  receiver

2:
[  5]   0.00-300.00 sec  20.0 GBytes   572 Mbits/sec   48             sender
[  5]   0.00-300.01 sec  20.0 GBytes   572 Mbits/sec                  receiver


3:
[  5]   0.00-300.00 sec  19.9 GBytes   571 Mbits/sec    0             sender
[  5]   0.00-300.02 sec  19.9 GBytes   571 Mbits/sec                  receiver

Observation: So much for needing to average the results. I was careful
to check that sg was on or off. I have no explanation for how the results
could be so close. I see no evidence that sg is providing any performance
increase.

Previous to this testing session, I have been able to see the issue of
the connection being dropped where only a reboot will correct the
situation. It happened twice a few days ago while testing with sg on.
There is a history of this with mt7612u adapters. I have yet to duplicate
the issue with sg off.

Conclusion: Further testing on different platforms is needed. I will test
x86_64 next. Given the history of sg causing problems such as connections
dropping that can only be corrected with a reboot, it may be better for the
default to be disable_usb_sg=1 with a follow up to determine what the
problem is.
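
For anyone repeating this comparison, the sender bitrates can be averaged with a short helper. This is a sketch that only parses iperf3 summary lines in the format shown above; `avg_sender_mbits` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: average the Mbits/sec value of every iperf3
# "sender" summary line read from stdin. Prints one decimal place.
avg_sender_mbits() {
    awk '/sender/ {
             for (i = 2; i <= NF; i++)
                 if ($i == "Mbits/sec") { sum += $(i - 1); n++ }
         }
         END { if (n) printf "%.1f\n", sum / n }'
}
```

Applied to the three sender lines in each group above, it gives about 570.7 Mbits/sec with scatter-gather off and 571.0 Mbits/sec with it on, which is consistent with the observation that sg makes no measurable difference here. One common way to make the setting persistent is an `options mt76_usb disable_usb_sg=1` line in a file under /etc/modprobe.d/ (the file name is an arbitrary choice).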


@deren

deren commented Sep 15, 2022

Hi @morrownr

I cannot reproduce the messy system log in 6.0-rc3. Can you please show me the full log and the steps to reproduce (if any are special)?

Thanks,
Deren

@morrownr
Owner Author

Hi @deren

I will retest on the original system plus two additional systems and report as soon as I can.

Regards,

Nick

@morrownr
Owner Author

Hi @deren

I have numbered the bugs so as to make it easier to reference them. The bug in question is now called Bug 2.

I have withdrawn Bug 2. After additional testing on different hardware, it appears this bug may be unique to that system so I need to back up and see if I can isolate the cause.

Have you been able to duplicate Bug 1?

Regards,

Nick

@deren

deren commented Sep 16, 2022

Hi @morrownr

I cannot reproduce Bug 2 either. The BT function always works properly, even after hot-plugging or reloading the driver several times.

But the log is weird to me; it looks like a firmware file is missing from the filesystem. Can you please check whether this problem is related to any specific device?
[ 149.874493] bluetooth hci0: Direct firmware load for mediatek/BT_RAM_CODE_MT7961_1_2_hdr.bin failed with error -2

Regards,
Deren

@morrownr
Owner Author

Hi @deren

The BT function always works properly, even after hot-plugging or reloading the driver several times.

I have a laptop computer that uses a wifi card based on the mt7921 chipset. Bluetooth works well. However, the subject here is about a usb wifi adapter that uses the mt7921au chipset. I have found no evidence that this adapter supports bluetooth. Is the capability turned off in hardware? I don't know but am trying to find out.

But the log is weird to me

What I posted is not information from a single log. The first section shows the log entries when the most up to date BT firmware is installed. The second section shows the log entries when I delete the BT firmware and reboot.

The first section of the log with the firmware installed makes me think the driver is trying to bring BT up but it is unable due to the lack of hardware. Bluetooth support is rare on usb wifi adapters. Could it be that the driver, mt7921u, is making an incorrect assumption that BT hardware is there to use?

My mt7921au based USB WiFi adapter is showing no sign of support for BT whether the firmware is installed or not. This adapter is a Comfast CF-951AX.

Regards,

Nick

@deren

deren commented Sep 17, 2022

Hi @morrownr

What I posted is not information from a single log. The first section shows the log entries when the most up to date BT firmware is installed. The second section shows the log entries when I delete the BT firmware and reboot.

Got it. I had missed that point.

I have found no evidence that this adapter supports bluetooth. Is the capability turned off in hardware?

What I have is a CF-953AX, and I can verify that the BT function works well on that card. Regarding the CF-951AX, there is some information about a BT function (I hope it is real :) ):
https://www.sunsky-online.com/p/EDA003280201A/COMFAST-CF-951AX-1800Mbps-USB-3.0-WiFi6-Wireless-Network-Card-Black-.htm

[ 72.869871] Bluetooth: hci1: Opcode 0x c03 failed: -110
[ 74.882933] Bluetooth: hci1: Failed to read MSFT supported features (-110)
[ 76.896658] Bluetooth: hci1: AOSP get vendor capabilities (-110)

I guess you consider these three lines to mean the BT function is not working properly, right? The log does not show up in my test environment (Ubuntu 20.04 + kernel 6.0-rc5 + CF-953AX), and I think this is related to BT protocol usage on your host system. Can you see BT still alive in your system, such as in the desktop UI? For example, I can see the device running after the CF-953AX is plugged in.

# hciconfig
hci1: Type: Primary Bus: USB
BD Address: XX:XX:XX:XX:XX:XX ACL MTU: 1021:4 SCO MTU: 96:6
UP RUNNING
RX bytes:17035 acl:0 sco:0 events:2758 errors:0
TX bytes:678011 acl:0 sco:0 commands:2756 errors:0

Regards,
Deren

@morrownr morrownr changed the title (Hot) Post Bug Reports for the new mt7921au chipset and mt7921u driver here... (Hot) Post your Bug Reports for the new mt7921au chipset and mt7921u driver here... Sep 19, 2022
@morrownr
Owner Author

Hi @deren

I guess you consider these three lines to mean the BT function is not working properly, right?

I consider those 3 lines a possible hint as to why BT is not working. And it is not working. I wish I were more familiar with BT, but I am not, so I am having to go slow with my troubleshooting. I also have a little nano BT adapter with a Broadcom chipset. I plugged it in and BT works with it. My test distro is Mint 21 (updated to kernel 6.0-rc3), and a short look at the forums tells me Mint is having problems with BT, so I am going to suspend my report pending further investigation. I appreciate your help.

Can you see BT still alive in your system,

Yes. It shows in the gui applet that supports BT. In fact, with the nano adapter plugged in, both show. The difference is that I can pair with other BT devices with the nano adapter but I can't get anything with the Mediatek based adapter. I'll continue trying to determine the problem.

Nick

@morrownr
Owner Author

@deren

Question: Since you have a Comfast CF-953AX adapter, have you tested AP mode 5 GHz band DFS channel support?

I bring this up due to the MANY users that make use of AP mode. Many products also include USB adapters, and AP mode is sometimes an important part of the product. I am contacted at times to serve in a consulting role, so I see many use cases that would benefit greatly from this support... and the 5 Realtek drivers I have up here all support this and it works well. There is a gap in capability that needs to be closed, and DFS in AP mode doesn't seem to be working for me.

Thanks,

Nick

@leezu

leezu commented Sep 28, 2022

On RasPi4B with Raspbian and kernel 6.0.0-rc7-v8+ (as well as earlier kernels), CF-953AX in AP mode and the latest September 2022 Firmware, I can relatively consistently reproduce firmware hangs by running a speedtest and apt upgrade in parallel on the SC7180 Snapdragon 7c. (Having other devices connected and other traffic patterns sometimes also causes the hang).

After the hang, the AP will recover and the following (starting from 4440.067187) is printed in the kernel log.

[Wed Sep 28 12:48:10 2022] mt7921u 2-1:1.3: WM Firmware Version: ____010000, Build Time: 20220908211021
[Wed Sep 28 12:48:11 2022] mt7921u 2-1:1.3 wlxe0e1a934a6a9: renamed from wlan1
[Wed Sep 28 12:48:12 2022] Bluetooth: hci0: Opcode 0x c03 failed: -110
[Wed Sep 28 12:48:13 2022] IPv6: ADDRCONF(NETDEV_CHANGE): ap0: link becomes ready
[Wed Sep 28 12:48:13 2022] IPv6: ADDRCONF(NETDEV_CHANGE): wlxe0e1a934a6a9: link becomes ready
[Wed Sep 28 14:02:00 2022] mt7921u 2-1:1.3: Message 00020003 (seq 12) timeout
[Wed Sep 28 14:02:00 2022] wlxe0e1a934a6a9: failed to remove key (0, 8c:fd:f0:42:20:cf) from hardware (-110)
[Wed Sep 28 14:02:00 2022] mt7921u 2-1:1.3: timed out waiting for pending tx
[Wed Sep 28 14:02:00 2022] ------------[ cut here ]------------
[Wed Sep 28 14:02:00 2022] WARNING: CPU: 2 PID: 1158 at kernel/kthread.c:659 kthread_park+0xb4/0xd0
[Wed Sep 28 14:02:00 2022] Modules linked in: xt_mark nft_chain_nat ctr aes_arm64 aes_generic ccm xt_MASQUERADE iptable_nat ip6table_nat nf_nat tun mt7921u mt7921_common mt76_connac_lib mt76_usb mt76 mac80211 btusb btrtl btintel btbcm bluetooth ecdh_generic ecc libaes libarc4 sg vc4 snd_soc_hdmi_codec bcm2835_codec(C) drm_display_helper brcmfmac rpivid_hevc(C) cec drm_cma_helper bcm2835_v4l2(C) bcm2835_isp(C) brcmutil v3d drm_kms_helper bcm2835_mmal_vchiq(C) v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig gpu_sched videobuf2_memops cfg80211 snd_soc_core drm_shmem_helper videobuf2_v4l2 snd_compress videobuf2_common snd_bcm2835(C) snd_pcm_dmaengine videodev rfkill raspberrypi_hwmon snd_pcm vc_sm_cma(C) mc snd_timer snd syscopyarea sysfillrect sysimgblt fb_sys_fops uio_pdrv_genirq nvmem_rmem uio ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_comment xt_multiport nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
[Wed Sep 28 14:02:00 2022]  nft_compat nf_tables nfnetlink drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
[Wed Sep 28 14:02:00 2022] CPU: 2 PID: 1158 Comm: kworker/u8:1 Tainted: G         C         6.0.0-rc7-v8+ #4
[Wed Sep 28 14:02:00 2022] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[Wed Sep 28 14:02:00 2022] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
[Wed Sep 28 14:02:00 2022] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Wed Sep 28 14:02:00 2022] pc : kthread_park+0xb4/0xd0
[Wed Sep 28 14:02:00 2022] lr : mt76u_stop_tx+0x278/0x330 [mt76_usb]
[Wed Sep 28 14:02:00 2022] sp : ffffffc00938bc50
[Wed Sep 28 14:02:00 2022] x29: ffffffc00938bc50 x28: 0000000000000000 x27: ffffff8049ee8848
[Wed Sep 28 14:02:00 2022] x26: 0000000000000000 x25: ffffff8049e42280 x24: ffffff8049ee2068
[Wed Sep 28 14:02:00 2022] x23: ffffff8049ee4820 x22: ffffff8049ee6020 x21: ffffff8049ee2048
[Wed Sep 28 14:02:00 2022] x20: ffffff8049368c00 x19: ffffff804c8f0000 x18: 0000000000000000
[Wed Sep 28 14:02:00 2022] x17: 0000000000000001 x16: ffffffdcef6b4e40 x15: 0018a9ea46410578
[Wed Sep 28 14:02:00 2022] x14: 000dd98ec5e42494 x13: 0000000000000213 x12: 00000000fa83b2da
[Wed Sep 28 14:02:00 2022] x11: 0000000000000213 x10: 0000000000001a90 x9 : ffffffdccfd9e8a8
[Wed Sep 28 14:02:00 2022] x8 : ffffff80401ed8f0 x7 : 0000000000000001 x6 : ffffffdcf0b0b0c0
[Wed Sep 28 14:02:00 2022] x5 : ffffffdcf09a9000 x4 : ffffffdcf09a90b0 x3 : 0000000000002800
[Wed Sep 28 14:02:00 2022] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000004
[Wed Sep 28 14:02:00 2022] Call trace:
[Wed Sep 28 14:02:00 2022]  kthread_park+0xb4/0xd0
[Wed Sep 28 14:02:00 2022]  mt76u_stop_tx+0x278/0x330 [mt76_usb]
[Wed Sep 28 14:02:00 2022]  mt7921u_mac_reset+0x88/0x2d8 [mt7921u]
[Wed Sep 28 14:02:00 2022]  mt7921_mac_reset_work+0xac/0x1a0 [mt7921_common]
[Wed Sep 28 14:02:00 2022]  process_one_work+0x1dc/0x450
[Wed Sep 28 14:02:00 2022]  worker_thread+0x154/0x450
[Wed Sep 28 14:02:00 2022]  kthread+0x104/0x110
[Wed Sep 28 14:02:00 2022]  ret_from_fork+0x10/0x20
[Wed Sep 28 14:02:00 2022] ---[ end trace 0000000000000000 ]---
[Wed Sep 28 14:02:00 2022] mt7921u 2-1:1.3: HW/SW Version: 0x8a108a10, Build Time: 20220908210919a

[Wed Sep 28 14:02:00 2022] mt7921u 2-1:1.3: WM Firmware Version: ____010000, Build Time: 20220908211021
[Wed Sep 28 14:02:05 2022] mt7921u 2-1:1.3: Message 00020003 (seq 6) timeout
[Wed Sep 28 14:02:06 2022] mt7921u 2-1:1.3: timed out waiting for pending tx
[Wed Sep 28 14:02:06 2022] mt7921u 2-1:1.3: HW/SW Version: 0x8a108a10, Build Time: 20220908210919a

[Wed Sep 28 14:02:06 2022] mt7921u 2-1:1.3: WM Firmware Version: ____010000, Build Time: 20220908211021

@ghost

ghost commented Sep 28, 2022

maybe some issues again with usb scatter gather

@leezu

leezu commented Oct 1, 2022

@ayyyuki1 possible. Based on above call trace, mt7921u driver does call a mt76_usb function. I'll try reproducing the issue with /sys/module/mt76_usb/parameters/disable_usb_sg set to Y

@morrownr
Owner Author

morrownr commented Oct 2, 2022

Based on above call trace, mt7921u driver does call a mt76_usb function. I'll try reproducing the issue with /sys/module/mt76_usb/parameters/disable_usb_sg set to Y

My opinion is that I would like to see either the default setting for scatter-gather changed or the code simply cleaned out of mt76. I have seen the problems that it causes with mt7612u based adapters, and I have taken the time to do extensive tests to see if it increases speed to any worthwhile degree. My conclusion is that I cannot see any increase in speed. I say dump it from the code.

@bjlockie

bjlockie commented Oct 2, 2022

I think the problem with scatter gather was suspected to be a bug in the USB hardware of a Raspberry Pi.

@leezu

leezu commented Oct 2, 2022

With /sys/module/mt76_usb/parameters/disable_usb_sg=Y I'm also able to reproduce the hang, though it does appear a little harder to reproduce.

The Raspberry Pi team has recently made progress on a related USB issue and established that it was due to a VL805 firmware bug. See raspberrypi/linux#4844: "VIA in Taiwan have reproduced the issue and are investigating. There's not likely to be another software workaround proposed here." "The fix as recommended by VIA is to disable bursts if this sequence of TRBs can occur."

There was speculation about whether this helps with the Mediatek hangs (raspberrypi/linux#5173 (comment)), but I verified that it does not help.

Given the history of RPi4 USB host controller firmware bugs, I've opened raspberrypi/linux#5193. I've also opened raspberrypi/linux#5192 to track that on RPi4, disconnecting one USB device causes failures on other USB devices (i.e. disconnecting a mass storage device triggers mt7921au errors).

@bjlockie

bjlockie commented Oct 3, 2022

Can the firmware of the VIA chip be upgraded?

@leezu

leezu commented Oct 4, 2022

Yes.

% sudo rpi-eeprom-update
[...]
  VL805_FW: Using bootloader EEPROM
     VL805: up to date
   CURRENT: 000138a1
    LATEST: 000138a1
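
The same check can be made scriptable. This sketch assumes the VL805 section of the output is laid out as in the excerpt above (a "VL805:" line followed by "CURRENT:" and "LATEST:" lines); `vl805_up_to_date` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the VL805 section of
# `rpi-eeprom-update` output (read from stdin) shows CURRENT == LATEST.
# CURRENT/LATEST lines before the "VL805:" line are ignored.
vl805_up_to_date() {
    awk '$1 == "VL805:"            { in_vl = 1 }
         in_vl && $1 == "CURRENT:" { cur = $2 }
         in_vl && $1 == "LATEST:"  { lat = $2 }
         END { exit !(cur != "" && (cur "") == (lat "")) }'
}
```

For example, `sudo rpi-eeprom-update | vl805_up_to_date && echo "VL805 firmware is current"`.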

@morrownr
Owner Author

morrownr commented Oct 8, 2022

On RasPi4B with Raspbian and kernel 6.0.0-rc7-v8+ (as well as earlier kernels), CF-953AX in AP mode and the latest September 2022 Firmware, I can relatively consistently reproduce firmware hangs by running a speedtest and apt upgrade in parallel on the SC7180 Snapdragon 7c. (Having other devices connected and other traffic patterns sometimes also causes the hang).

@leezu

I'm looking to move this bug report up into message 1 with the other bugs. You mentioned the distro and kernel but did not mention:

  • band/channel ?
  • hostapd ? and hostapd.conf
  • WPA3 ?
  • wpa_supplicant version ?
  • 32 bit ?
  • hostapd.log showing anything ?

@bjlockie

bjlockie commented Oct 8, 2022 via email

@morrownr morrownr pinned this issue Oct 9, 2022
@leezu

leezu commented Oct 12, 2022

@morrownr @bjlockie

I'm looking to move this bug report up into message 1 with the other bugs. You mentioned the distro and kernel but did not mention:

band/channel: 36
hostapd v2.10 and hostapd.conf
WPA3: wpa_key_mgmt=WPA-PSK
wpa_supplicant version: 2.9
32 bit: running with arm_64bit=1 RPi config but getconf LONG_BIT returns 32.
hostapd.log showing anything: Please see the two journals regarding two separate occurrences attached:
journal-no-backtrace.txt
journal-with-backtrace.txt

I also want to know if it is using an extension cable.
And is it using a powered USB hub?
I had to use a powered USB hub for the keyboard/mouse of my hdmi switch.

It uses this extension lead https://smile.amazon.com/gp/product/B082HQXRZ1/ as without extension lead, the CF-953AX covers up the LAN port.

@morrownr
Owner Author

@leezu

I added your firmware hanging issue as Bug 6. Please check it to see if I did a good job.

Also, I have been running AP mode 5 GHz over the last few days with my RasPi4B. It appears I am seeing the same problem. Every few hours I will lose internet access. It can and usually does recover but it can take a long time. I shut down scatter/gather this morning to see if it helps. We will see.

Can I get you to test 2.4 GHz AP mode? Look at Bug 2. I am reporting that 2.4 GHz AP mode is not working. Would like confirmation.

wpa_supplicant version: 2.9

FYI: I think you need 2.10 for WPA3 support. I have a guide to do the upgrade if you want a copy.

Nick

@morrownr morrownr unpinned this issue Feb 2, 2023
@gifter77

gifter77 commented May 8, 2023

Regarding Bug 6: if I build a vanilla 6.4-rc Linux kernel for my Raspberry Pi 4, I cannot reproduce the issue anymore.

@morrownr morrownr pinned this issue May 8, 2023
@morrownr
Owner Author

morrownr commented May 8, 2023

Hi @gifter77

Copy all. Will update.

@viniciusmarangoni

It seems Mediatek have a new firmware release for MT7921. I've seen this message in linux-wireless today:

From: Deren Wu <deren.wu@mediatek.com>

Update binary firmware for MT7921 WiFi devices

File: mediatek/WIFI_MT7961_patch_mcu_1_2_hdr.bin
Version: 20240219110958a
File: mediatek/WIFI_RAM_CODE_MT7961_1.bin
Version: 20240219111038

Reference: https://lore.kernel.org/linux-wireless/a12624f139cb1e03ad9b7551584d5d0b47d30b1a.1708678287.git.deren.wu@mediatek.com/T/#m7738138660b3c687c495620d20d8e0998fc45995

But the firmware is not yet available here: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/mediatek.

I hope the new firmware fixes some of the issues.

@tzhuan

tzhuan commented Mar 1, 2024

After about 3 weeks without any issues, a new kind of error occurs:

Mar 01 10:01:21 ty kernel: xhci_hcd 0000:01:00.0: WARN: TRB error for slot 2 ep 7 on endpoint
Mar 01 10:01:21 ty kernel: mt7921u 2-1:1.3: tx urb failed: -84
Mar 01 10:01:21 ty kernel: xhci_hcd 0000:01:00.0: WARN waiting for error on ep to be cleared
Mar 01 10:01:21 ty kernel: mt7921u 2-1:1.3: tx urb submit failed:-22
Mar 01 10:04:22 ty kernel: mt7921u 2-1:1.3: timed out waiting for pending tx

The kernel is alive but the wifi is dead. A normal ifdown/ifup cannot bring the wifi back. I have to rmmod mt7921u and modprobe mt7921u before ifup.
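
A small sketch of how this failure could be detected and the recovery applied, assuming the log lines look like the ones quoted above. `pending_tx_timeouts` and `reload_mt7921u` are hypothetical helper names; the reload sequence is the one described in this comment.

```shell
#!/bin/sh
# Hypothetical helper: count kernel log lines (read from stdin) where
# mt7921u reports the pending-tx timeout quoted in this thread.
pending_tx_timeouts() {
    grep -c 'mt7921u.*timed out waiting for pending tx'
}

# Recovery sequence described above: reload the driver module.
# Defined as a function so nothing runs until it is called.
reload_mt7921u() {
    sudo rmmod mt7921u && sudo modprobe mt7921u
}
```

For example, `journalctl -k -b | pending_tx_timeouts` prints how many timeouts the current boot has logged; after `reload_mt7921u`, the interface still has to be brought up again with ifup.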

@morrownr
Owner Author

morrownr commented Mar 1, 2024

What is your distro and kernel version?

@tzhuan

tzhuan commented Mar 2, 2024

What is your distro and kernel version?

It's a Raspberry Pi 4B box with 64-bit official Raspberry Pi OS. The system is up-to-date with Linux kernel 6.1.73.

$ uname -a
Linux ty 6.1.0-rpi8-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.73-1+rpt1 (2024-01-25) aarch64 GNU/Linux

@morrownr
Owner Author

morrownr commented Mar 2, 2024

I have a Pi4B with the 2023-12-05 RasPiOS. The kernel is the same as yours. I'm not seeing what you are seeing but mine is running in AP mode.

What is the output of:

$ ethtool -i wlan0

Replace wlan0 with your interface name.

I'm looking for the firmware version you are using. Something is different between our systems.

@tzhuan

tzhuan commented Mar 3, 2024

I have a Pi4B with the 2023-12-05 RasPiOS. The kernel is the same as yours. I'm not seeing what you are seeing but mine is running in AP mode.

Mine is also running in AP mode with the hostapd-WiFi6.conf you provided.

What is the output of:

$ ethtool -i wlan0

Replace wlan0 with your interface name.

I'm looking for the firmware version you are using. Something is different between our systems.

$ ethtool -i wlan1
driver: mt7921u
version: 6.1.0-rpi8-rpi-v8
firmware-version: ____010000-20231109190959
expansion-rom-version: 
bus-info: 2-1:1.3
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
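
When comparing firmware between systems, the relevant field can be extracted from the `ethtool -i` output directly. This is a sketch that assumes the `firmware-version:` line format shown above; `firmware_version` and `firmware_build` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: print the firmware-version value from
# `ethtool -i` output read from stdin.
firmware_version() {
    sed -n 's/^firmware-version: *//p'
}

# Hypothetical helper: print only the build timestamp, i.e. the part
# after the last "-" in the firmware-version value.
firmware_build() {
    v=$(firmware_version)
    printf '%s\n' "${v##*-}"
}
```

For example, `ethtool -i wlan1 | firmware_build` on the output above prints 20231109190959, which is easy to compare against the build timestamp reported on another system.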

@morrownr
Owner Author

morrownr commented Mar 3, 2024

Here is what I am looking at. My firmware is the original firmware that came with the 2023-12-05 release on the RasPiOS 64 bit. I had not upgraded yet but I'm not seeing any issues. You might try backing off to the same version I am using to see what happens.

 $ ethtool -i wlan0
driver: mt7921u
version: 6.1.0-rpi8-rpi-v8
firmware-version: ____010000-20230117170942
expansion-rom-version: 
bus-info: 2-2:1.3
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

xhci_hcd 0000:01:00.0: WARN: TRB error for slot 2 ep 7 on endpoint

That makes me think there is a problem with USB. Have you tried the other USB3 port?

@Simon566

#410
#331

this is all the same topic !! pending tx messages and a kernel crash

@whitslack

this is all the same topic !! pending tx messages and a kernel crash

I was seeing those crashes every day or few until I switched to a simpler (read: worse) HostAPd configuration at @morrownr's suggestion. I then ran my access point for 3½ months straight with not a single crash or hang and only rebooted it once in that time to upgrade the kernel. Moreover, with the simpler config my Wi-Fi network was more reliably visible: whereas previously I often would need to scan several times on both my phone and my laptop to find my network and connect to it, after I dumbed down the config, then both devices consistently found the network on their first scan. This leads me to suspect that there is something amiss with the transmit multi-queuing in the driver and/or firmware for this chipset since it doesn't even appear to send out the beacon frames entirely on pace when I use my preferred config.

Since 17 March 2024, I have switched to a slightly less crippled HostAPd configuration that now enables 802.11n+WMM and 256-bit ciphers and in theory 802.11ax, although my AP will never actually enable AX mode since I have many legacy client devices in my network. I have had zero hangs, crashes, or slowdowns with this config in the first week of testing it.

interface=wifi6
bridge=br0
driver=nl80211
ctrl_interface=/var/run/hostapd
ctrl_interface_group=0
ssid=Whitslack
country_code=US
hw_mode=g
channel=6
preamble=1
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wmm_enabled=1
ieee80211n=1
ht_capab=[LDPC][HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40][TX-STBC][RX-STBC1][MAX-AMSDU-7935]
require_ht=1
vht_capab=[MAX-MPDU-11454][RXLDPC][SHORT-GI-80][TX-STBC-2BY1][RX-STBC-1][SU-BEAMFORMEE][BF-ANTENNA-4][MAX-A-MPDU-LEN-EXP7][RX-ANTENNA-PATTERN][TX-ANTENNA-PATTERN]
ieee80211ax=1
wpa=2
wpa_passphrase=##REDACTED##
wpa_key_mgmt=WPA-PSK WPA-PSK-SHA256
rsn_pairwise=CCMP CCMP-256

My hope is that, by titration and process of elimination, I can eventually find exactly the HostAPd option that these drivers/firmware/chipset don't like, and then we will be able to send an actionable bug report to Mediatek.
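
One way to support that process of elimination is to normalize each hostapd configuration before comparing it with the previous one, so that only real option changes show up. This is a sketch; `hostapd_opts` is a hypothetical helper name and the file names in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: read a hostapd config from stdin, drop comment
# and blank lines, and print the remaining option lines sorted, so two
# configs can be compared line by line.
hostapd_opts() {
    grep -v '^[[:space:]]*#' | grep -v '^[[:space:]]*$' | LC_ALL=C sort
}
```

For example, run `hostapd_opts < stable.conf > stable.norm` and `hostapd_opts < crashing.conf > crashing.norm`, then `diff stable.norm crashing.norm` lists exactly the options that differ between a config that runs cleanly and one that hangs.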

@Simon566

Hi,

Here is mine. I have some WMM parameters set and some for HE. Others are at their defaults and could be left out, such as the beacon interval.

interface=wlx00c0cab38118
bridge=br-iot
driver=nl80211

logger_syslog=1
logger_syslog_level=3
logger_stdout=1
logger_stdout_level=3

ctrl_interface=/var/run/hostapd
ctrl_interface_group=0

ssid=iot22
country_code=DE
ieee80211d=1
hw_mode=g
channel=6

beacon_int=100
dtim_period=2
max_num_sta=14
rts_threshold=2347
fragm_threshold=2346

macaddr_acl=0
auth_algs=3
ignore_broadcast_ssid=0

wmm_enabled=1
uapsd_advertisement_enabled=1

wmm_ac_bk_cwmin=4
wmm_ac_bk_cwmax=10
wmm_ac_bk_aifs=7
wmm_ac_bk_txop_limit=0
wmm_ac_bk_acm=0

wmm_ac_be_aifs=3
wmm_ac_be_cwmin=4
wmm_ac_be_cwmax=10
wmm_ac_be_txop_limit=0
wmm_ac_be_acm=0

wmm_ac_vi_aifs=2
wmm_ac_vi_cwmin=3
wmm_ac_vi_cwmax=4
wmm_ac_vi_txop_limit=94
wmm_ac_vi_acm=0

wmm_ac_vo_aifs=2
wmm_ac_vo_cwmin=2
wmm_ac_vo_cwmax=3
wmm_ac_vo_txop_limit=47
wmm_ac_vo_acm=0

skip_inactivity_poll=1
ieee80211n=1

ht_capab=[LDPC][HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40][TX-STBC][RX-STBC1][MAX-AMSDU-7935]
vht_capab=[RXLDPC][SHORT-GI-80][TX-STBC-2BY1][SU-BEAMFORMEE][MU-BEAMFORMEE][RX-ANTENNA-PATTERN][TX-ANTENNA-PATTERN][RX-STBC-1][BF-ANTENNA-4][MAX-MPDU-11454][MAX-A-MPDU-LEN-EXP7]

ieee80211ax=1

he_su_beamformee=1
he_mu_beamformer=1
he_bss_color=13
he_basic_mcs_nss_set=2

wpa=2
wpa_passphrase=
wpa_pairwise=CCMP
rsn_pairwise=CCMP

@reticulatingsplines

reticulatingsplines commented Mar 24, 2024

Wanted to share my own experience with the timed out waiting for pending tx issue. The AP goes down under conditions I can generally recreate - "heavy load". In real-world use, this has been e.g. downloading a large file, but artificially I can usually get it to crash within 10-30 seconds of a continuous iperf3 test with the AP in server mode (while bandwidth reported by the iperf3 client hovers around 400-600 Mbit/s). On more than one occasion, it appears to have crashed while simply running an AP scan from a laptop while it's connected to the AP in question.

The AP appears to hang once this happens; it stops responding to pings or SSH connection attempts from the upstream network, the display output freezes completely, and the device doesn't seem to respond to keyboard inputs. However, a bash session via serial connection (using the OPi5+'s debug UART pins) seems stable and responsive through the AP "crash", and I can actually revive the AP through a simple systemctl restart hostapd command.

I'm happy to help in any way I can. If there's more logging that would be useful, let me know. If anybody has any ideas for changes to make, also happy to modify/compile/test kernels/software/firmware/whatever.

Hardware: Recently "upgraded" (sigh) my AP with an ALFA AWUS036AXML, which I'm using in conjunction with an Orange Pi 5+. I've been running the AP with ALFA unit attached via one of the OPi5+'s USB 3.0 Type-A ports, which interfaces with the RK3588 SOC through a hub chip (GL3523). However, I can recreate the same issue when the ALFA unit is connected through the OPi5+'s Type-C port, which, according to Orange Pi's schematics, is routed directly to one of the SOC's two USB3.1 interfaces. It's worth noting that the Type-C port has a separate, dedicated PMIC, which may be evidence against this being a power supply issue, at least on the host side.

Software: running a bookworm release of Debian, kernel 6.1.43, with relevant mediatek drivers built as kernel modules. Some relevant outputs are below.

neofetch:

orangepi@orangepi5:~$ neofetch
        #####           orangepi@orangepi5
       #######          ------------------
       ##O#O##          OS: Orange Pi 1.0.8 Bookworm aarch64
       #######          Host: RK3588 OPi 5 Plus
     ###########        Kernel: 6.1.43-rockchip-rk3588
    #############       Uptime: 20 hours, 40 mins
   ###############      Packages: 1804 (dpkg)
   ################     Shell: bash 5.2.15
  #################     WM: Xfwm4
#####################   WM Theme: Numix
#####################   Theme: Adwaita [GTK3]
  #################     Icons: Adwaita [GTK3]
                        Terminal: /dev/ttyFIQ0
                        CPU: (8) @ 1.800GHz
                        Memory: 526MiB / 15964MiB

hostapd version:
Note: Issue also occurred with currently available hostapd from debian bookworm sources (v2.10-12)

orangepi@orangepi5:~$ hostapd -v
hostapd v2.11-devel-hostap_2_10-2014-g695277a5b

ethtool:
Note: Issue also occurred using the previous 2 firmware builds (20231109 and 20230526)

orangepi@orangepi5:~$ sudo ethtool -i wlx00c0cab5bc5c
driver: mt7921u
version: 6.1.43-rockchip-rk3588
firmware-version: ____010000-20240219111038
expansion-rom-version:
bus-info: 8-1:1.3
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

systool for mt76_usb module:
Note: Issue reproducible with disable_usb_sg = "Y" as well

orangepi@orangepi5:~$ sudo systool -m mt76_usb -av
Module = "mt76_usb"

  Attributes:
    coresize            = "32768"
    initsize            = "0"
    initstate           = "live"
    refcnt              = "1"
    taint               = ""
    uevent              = <store method only>

  Parameters:
    disable_usb_sg      = "N"

  Sections:
    .altinstructions    = "0xffffffc00153e100"
    .bss                = "0xffffffc001540bc0"
    .data.once          = "0xffffffc001540608"
    .data               = "0xffffffc001540140"
    .gnu.linkonce.this_module= "0xffffffc001540880"
    .init.plt           = "0xffffffc001543000"
    .note.Linux         = "0xffffffc00153e024"
    .note.gnu.build-id  = "0xffffffc00153e000"
    .plt                = "0xffffffc00153d200"
    .ref.data           = "0xffffffc001540610"
    .rodata             = "0xffffffc00153e280"
    .rodata.str         = "0xffffffc00153e57c"
    .rodata.str1.8      = "0xffffffc00153e400"
    .strtab             = "0xffffffc001545848"
    .symtab             = "0xffffffc001544000"
    .text.ftrace_trampoline= "0xffffffc00153d20c"
    .text               = "0xffffffc00153a000"
    __bpf_raw_tp_map    = "0xffffffc0015406a0"
    __bug_table         = "0xffffffc001540000"
    __jump_table        = "0xffffffc00153f000"
    __ksymtab_gpl       = "0xffffffc00153e054"
    __ksymtab_strings   = "0xffffffc00153e309"
    __param             = "0xffffffc00153e5d8"
    __patchable_function_entries= "0xffffffc001540028"
    __tracepoints_ptrs  = "0xffffffc00153e600"
    __tracepoints_strings= "0xffffffc00153e610"
    __tracepoints       = "0xffffffc001540720"
    _ftrace_events      = "0xffffffc001540840"

dmesg

[28680.798783] mt7921u 2-1.4:1.3: Message 00020002 (seq 10) timeout
[28681.028634] mt7921u 2-1.4:1.3: timed out waiting for pending tx
[28681.047227] ------------[ cut here ]------------
[28681.047254] WARNING: CPU: 0 PID: 6689 at kernel/kthread.c:659 kthread_park+0xb8/0xd0
[28681.047312] Modules linked in: overlay bridge stp llc nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink pwm_fan mt7921u mt7921e mt7921_common mt76_connac_lib mt76_usb mt76 fuse ip_tables r8169
[28681.047569] CPU: 0 PID: 6689 Comm: kworker/u16:4 Not tainted 6.1.43-rockchip-rk3588 #1.0.8
[28681.047596] Hardware name: RK3588 OPi 5 Plus (DT)
[28681.047615] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
[28681.047707] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[28681.047732] pc : kthread_park+0xb8/0xd0
[28681.047758] lr : mt76u_stop_tx+0x298/0x360 [mt76_usb]
[28681.047808] sp : ffffffc014de3c40
[28681.047820] x29: ffffffc014de3c40 x28: 0000000000000000 x27: ffffff8103ac8b58
[28681.047863] x26: ffffff8103ac08e0 x25: 0000000000000000 x24: ffffff810a453680
[28681.047903] x23: ffffff8103ac2128 x22: ffffff8103ac48e0 x21: ffffff8103ac2108
[28681.047943] x20: ffffff8112289e80 x19: ffffff81080dba00 x18: 0000000000000000
[28681.047982] x17: 000000040044ffff x16: 000000b2b5593519 x15: ffffff84fde8a480
[28681.048022] x14: 00000000000003d6 x13: 0000000000000001 x12: 00000000000003d6
[28681.048059] x11: 0000000000000001 x10: 0000000000000a90 x9 : ffffffc00153c86c
[28681.048097] x8 : ffffff8115bab670 x7 : 0000000000001800 x6 : 0000000000000195
[28681.048134] x5 : ffffffc00a623fe0 x4 : 0000000000000000 x3 : 0000000000002800
[28681.048171] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000004
[28681.048211] Call trace:
[28681.048224]  kthread_park+0xb8/0xd0
[28681.048252]  mt76u_stop_tx+0x298/0x360 [mt76_usb]
[28681.048290]  mt7921u_mac_reset+0x84/0x284 [mt7921u]
[28681.048333]  mt7921_mac_reset_work+0xb0/0x1d0 [mt7921_common]
[28681.048378]  process_one_work+0x1e8/0x454
[28681.048412]  worker_thread+0x174/0x52c
[28681.048441]  kthread+0xdc/0xe0
[28681.048465]  ret_from_fork+0x10/0x20
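When comparing kernels, firmware builds, or configs, it may help to pull only the relevant driver messages out of a long log. The following is a rough sketch that matches the mt7921u messages quoted in this thread; the helper name `mt7921u_errors` is hypothetical, and the patterns are based only on the log lines shown here, so other failure messages would be missed.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print only the
# mt7921u lines that match failure messages quoted in this thread.
mt7921u_errors() {
    # grep exits non-zero when nothing matches; treat that as "no errors".
    grep -E 'mt7921u .*(timed out waiting for pending tx|tx urb failed|Message [0-9a-fA-F]+ \(seq [0-9]+\) timeout)' || true
}

# Usage:
#   dmesg | mt7921u_errors
#   journalctl -k | mt7921u_errors
```

Counting the matching lines per test run (for example with `wc -l`) gives a simple number to compare between configurations.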

@whitslack

> which may be evidence against this being a power supply issue, at least on the host side.

Almost certainly not a host power supply issue, as it was happening to me on an Intel Atom-based mini-ITX system with a 60W DC-DC PSU attached to a 12V 5A AC adapter that pulls under 15W at the wall while powering the system (so, lots of headroom).

> The AP appears to hang once this happens; it stops responding to pings or SSH connection attempts from the upstream network, the display output freezes completely, and the device doesn't seem to respond to keyboard inputs.

Except in the very early days (before I found and fixed the packet headroom bug in the driver), I have never actually seen the timed out waiting for pending tx bug cause my AP machine to fully crash/panic. I am always able to connect to a shell via the wired Ethernet interface. (Admittedly, the machine is headless, so I don't know whether the console is responsive at this point.) Sometimes restarting HostAPd gets the AP going again, but usually the driver is well and truly hosed at that point, and a restarted HostAPd fails to configure that interface, and only a (soft) reboot will get it working again. (Maybe unbinding and rebinding the kernel driver would work too; I don't recall whether I've tried that with success.)

@whitslack

> Since 17 March 2024, I have switched to a slightly less crippled HostAPd configuration that now enables 802.11n+WMM and 256-bit ciphers and in theory 802.11ax

Just got an interface hang on my "slightly less crippled HostAPd configuration." It happened while I was downloading a large file over Wi-Fi. I was able to bring down the interface with ip link set wifi6 down. The command took a little bit of time to complete, and when it did complete, I found both an rx urb mismatch dump and a timed out waiting for pending tx dump in the kernel log. I then brought the interface back up with ip link set wifi6 up, and evidently it's working fine again. Did not have to reboot, and did not even have to restart HostAPd, meaning my other Wi-Fi interface and the wired interface did not need to be interrupted to get the mt7921 running again. I'll keep trying with this config, but if the hang happens again, I may titrate back the other direction to try to zero in on exactly which option is causing this instability.

@Simon566

This is the same issue that I'm experiencing. The hostapd config doesn't really matter.

@Simon566

Simon566 commented Apr 3, 2024

Question about USB3 mode and AMD RZ608 / mt7921u:

In lsusb I see my adapter running in USB2 mode, but isn't it supposed to be USB3? Is it possible to switch the speed?

@morrownr
Owner Author

morrownr commented Apr 3, 2024

> AMD RZ608

I've never seen that chipset used in a usb adapter.

The mt7921u driver automatically sets USB3 if all conditions are met for USB3 to be used. Same thing with the mt7612u driver. I've never seen them make a mistake. Not saying that it is not possible, just that I've never seen it.

@Simon566

Simon566 commented Apr 3, 2024

OK, I'm running on x86-64 Jasper Lake hardware with the adapter plugged into a USB3 port, on the latest longterm kernel 6.6.23, and it runs in USB2 mode (480M). It's the ALFA USB adapter. I have never seen it in USB3 mode.

root@odroid-h3:~# lsusb -t -v
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 10000M
ID 1d6b:0003 Linux Foundation 3.0 root hub
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 480M
ID 1d6b:0002 Linux Foundation 2.0 root hub
|__ Port 1: Dev 2, If 3, Class=Vendor Specific Class, Driver=mt7921u, 480M
ID 0e8d:7961 MediaTek Inc.
|__ Port 1: Dev 2, If 1, Class=Wireless, Driver=btusb, 480M
ID 0e8d:7961 MediaTek Inc.
|__ Port 1: Dev 2, If 2, Class=Wireless, Driver=, 480M
ID 0e8d:7961 MediaTek Inc.
|__ Port 1: Dev 2, If 0, Class=Wireless, Driver=btusb, 480M
ID 0e8d:7961 MediaTek Inc.
|__ Port 2: Dev 9, If 0, Class=Vendor Specific Class, Driver=rtl88XXau, 480M
ID 0bda:8812 Realtek Semiconductor Corp. RTL8812AU 802.11a/b/g/n/ac 2T2R DB WLAN Adapter
|__ Port 3: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c52b Logitech, Inc. Unifying Receiver
|__ Port 3: Dev 4, If 1, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c52b Logitech, Inc. Unifying Receiver
|__ Port 3: Dev 4, If 2, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c52b Logitech, Inc. Unifying Receiver
|__ Port 4: Dev 5, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
ID 0403:6001 Future Technology Devices International, Ltd FT232 Serial (UART) IC

@Simon566

Simon566 commented Apr 3, 2024

OK, sorry: when using the proper cable, it's USB3:

/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 10000M
ID 1d6b:0003 Linux Foundation 3.0 root hub
|__ Port 2: Dev 2, If 0, Class=Wireless, Driver=btusb, 5000M
ID 0e8d:7961 MediaTek Inc.
|__ Port 2: Dev 2, If 1, Class=Wireless, Driver=btusb, 5000M
ID 0e8d:7961 MediaTek Inc.
|__ Port 2: Dev 2, If 2, Class=Wireless, Driver=, 5000M
ID 0e8d:7961 MediaTek Inc.
|__ Port 2: Dev 2, If 3, Class=Vendor Specific Class, Driver=mt7921u, 5000M
ID 0e8d:7961 MediaTek Inc.
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 480M
ID 1d6b:0002 Linux Foundation 2.0 root hub
|__ Port 2: Dev 9, If 0, Class=Vendor Specific Class, Driver=rtl88XXau, 480M
ID 0bda:8812 Realtek Semiconductor Corp. RTL8812AU 802.11a/b/g/n/ac 2T2R DB WLAN Adapter
|__ Port 3: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c52b Logitech, Inc. Unifying Receiver
|__ Port 3: Dev 4, If 1, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c52b Logitech, Inc. Unifying Receiver
|__ Port 3: Dev 4, If 2, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c52b Logitech, Inc. Unifying Receiver
|__ Port 4: Dev 5, If 0, Class=Vendor Specific Class, Driver=ftdi_sio, 12M
ID 0403:6001 Future Technology Devices International, Ltd FT232 Serial (UART) IC
root@odroid-h3:~#

@morrownr
Owner Author

morrownr commented Apr 3, 2024

> OK, sorry: when using the proper cable, it's USB3:

Been there, done that. USB3 is just not going to work with a USB2 cable or hub in the mix.
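For anyone wanting to check the negotiated link speed without reading the whole tree, here is a small sketch that picks the speed column out of `lsusb -t` output for one driver. In that output, 480M indicates a USB 2.0 high-speed link and 5000M a USB 3.x SuperSpeed link. The helper name `usb_speed_of_driver` is hypothetical, and the parsing assumes the comma-separated line layout shown in the listings above.

```shell
#!/bin/sh
# Hypothetical helper: read `lsusb -t` output on stdin and print the speed
# (last comma-separated field) of every line whose Driver= field is
# exactly $1.
usb_speed_of_driver() {
    awk -F', ' -v drv="Driver=$1" '
        {
            for (i = 1; i <= NF; i++)
                if ($i == drv) { print $NF; break }
        }
    '
}

# Usage:
#   lsusb -t | usb_speed_of_driver mt7921u
```

If this prints 480M for an adapter plugged into a USB3 port, the cable and any hub in between are the first things to check.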

@MEL1H

MEL1H commented Apr 20, 2024

Hi all,

With my AWUS036AXML I had one common issue, which I believe I have solved, and another that is sort of solved.

I have a Raspberry Pi (previously a 4, now a 5) running bookworm aarch64. The kernel was the original one, but I tend to update it to the latest, so it really does not matter. I use a dhcpd+dnsmasq+hostapd setup. I took the hostapd config from morrownr, but I changed it to run on wlan0 directly, not in bridge mode.

  1. "timed out waiting for pending tx", SSID disappearing, SSID that cannot be joined, hostapd restarting itself during speed tests (once the speed passes 350 Mbps)...

I solved these issues by adding wireless-power off to /etc/network/interfaces. With that, iwconfig shows Power Management:off, and the AP survives speed tests. See my latest speed test below (I was right next to the AWUS036AXML). Zero disconnects.

  2. This one is more dangerous because it hangs the kernel, and so the whole RPi. Most recently, I added options mt76_usb disable_usb_sg=1 to /etc/modprobe.d/mt76_usb.conf. I have not seen the issue for one week. Other users have said that this does not solve the problem but only decreases its frequency. I will see whether it happens again.

Screenshot_20240420_205138_Speedtest

Screenshot_20240420_211802_UserLAnd

@whitslack

> I solved these issues by adding wireless-power off to /etc/network/interfaces. With that, iwconfig shows Power Management:off.

@MEL1H: Very interesting find, but I wonder how your system is able to change the power management mode. When I try it with iw, I get:

# iw dev wifi6 get power_save
Power save: on
# iw dev wifi6 set power_save off
command failed: Operation not supported (-95)

I cannot use iwconfig at all, as that tool is meant for the deprecated wireless extensions that my kernel does not support:

# iwconfig wifi6
wifi6     no wireless extensions.

It's shocking to me that HostAPd does not disable power saving on an AP interface by default. APs are not supposed to enter any 802.11 power saving modes, I'm pretty sure. Maybe HostAPd assumes that the driver will ignore any power-saving mode settings when operating in AP mode, but the mt7921u driver actually does still apply power-saving modes to the chipset even when running in AP mode. That would certainly explain some things, although ideally power saving should at least work in AP mode, even if performance is lousy.

What follows is pure conjecture on my part. Maybe the reason I am seeing no hangs when I disable WMM is that then the chipset transmits only on its primary queue, which does correctly toggle power-saving modes as necessary to service the queue. Perhaps the additional transmit queues utilized by WMM are not correctly tied into power management, such that there can arise a rare edge case wherein the primary queue is empty, the PHY is in power-saving mode, and yet a WMM queue is not empty. The non-empty WMM queue cannot be serviced while in power-saving mode, leading to a hang and an eventual transmit timeout.

On a completely separate tack, there is another somewhat recent patch pending merge into a kernel release that has me hopeful: wifi: mt76: mt792xu: enable dmashdl support, whose commit message reads: "dmashdl(DMA scheduler) was disable and may cause packets corruption without propoer resource handling. Enable this to control resources between usb-bus/pse/hardware-ac-queue." The mentions of "packets corruption" and "hardware [802.11]ac queue" seem very relevant to our issue at hand. I have not attempted to merge this patch into my kernel myself.

@MEL1H

MEL1H commented Apr 20, 2024

@whitslack It was actually a coincidence. At the very beginning, I tried many iw commands to turn power saving off, but all of them failed. Then I decided to use the RPi's onboard Wi-Fi at the same time as the AWUS036AXML. When I set up the interfaces file for the onboard Wi-Fi, I gave it a shot for the AWUS as well, and somehow it worked.

Screenshot_20240421_002020_ConnectBot

Screenshot_20240421_001815_ConnectBot

I hope that patch will solve the other issue permanently. Thanks for informing.

@morrownr
Owner Author

morrownr commented Apr 21, 2024

@whitslack

> # iw dev wifi6 set power_save off
> command failed: Operation not supported (-95)

Was the interface up when you ran the command? Some commands only work if the interface is down. To check:

$ ip a

To take the interface down:

$ sudo ip link set <wlan0> down

I did a quick test on the easiest thing to test which is my wifi card with a mt7922 chip that uses the mt7921e driver and it worked... turned power off. The interface was down when I tested. I can take a mt7921au interface down on another system and test if you want.

> there is another somewhat recent patch pending merge into a kernel release that has me hopeful:

I saw that as well.

@Simon566

I tried the DMA patch you mentioned on top of the LTS kernel 6.6.28. It didn't solve the "tx urb" errors or the others. I switched power management off while the adapter was turned off, and at least this worked.

I tried the 3-4 most promising patches from the master tree of mt76, with no success.

[So Apr 21 15:01:56 2024] Mirror/redirect action on
[So Apr 21 15:03:19 2024] mt7921u 2-2:1.3: tx urb failed: -71
[So Apr 21 15:03:20 2024] mt7921u 2-2:1.3: tx urb failed: -71
[So Apr 21 15:03:20 2024] mt7921u 2-2:1.3: tx urb failed: -71
[So Apr 21 15:03:20 2024] mt7921u 2-2:1.3: tx urb failed: -71
[So Apr 21 15:03:20 2024] mt7921u 2-2:1.3: tx urb failed: -71

@MEL1H

MEL1H commented Apr 21, 2024

Today my RPi got stuck again. When I checked the journal, I found this message: md-ap systemd-udevd[185]: libkmod: ERROR ../libkmod/libkmod-config.c:712 kmod_config_parse: /etc/modprobe.d/mt76-usb.conf line 1: ignoring bad line starting with 'mt76-usb'. But grep [[:alnum:]] /sys/module/mt76_usb/parameters/* shows Y. I wonder what is wrong.

@whitslack

> ignoring bad line starting with 'mt76-usb'

@MEL1H: Wouldn't the module be named mt76_usb, not mt76-usb? But also, you wouldn't start a line with the name of a module; you need a configuration directive, like options. See the man page.
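To catch this kind of mistake before a reboot, a file in /etc/modprobe.d can be checked for lines that do not begin with a configuration directive. This is only a sketch: the helper name `bad_modprobe_lines` is hypothetical, and the directive list is the one I know from modprobe.d(5) (alias, blacklist, install, options, remove, softdep), which may be incomplete on newer kmod versions.

```shell
#!/bin/sh
# Hypothetical helper: read a modprobe.d file on stdin and print every
# line that does not start with a known directive. Blank lines and
# comment lines are accepted.
bad_modprobe_lines() {
    # grep exits non-zero when every line is fine; that is not an error.
    grep -Ev '^[[:space:]]*(#|$|(alias|blacklist|install|options|remove|softdep)[[:space:]])' || true
}

# Usage:
#   bad_modprobe_lines < /etc/modprobe.d/mt76_usb.conf
#
# A line of the form discussed in this thread passes the check:
#   options mt76_usb disable_usb_sg=1
```

Empty output means no suspicious lines were found. The value the loaded module actually uses can still be read from /sys/module/mt76_usb/parameters/, as done above.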

@MEL1H

MEL1H commented Apr 21, 2024

@whitslack
Thank you for clarifying. I had mixed them up in the past (I found that there was one conf file named with mt76-usb and one with mt76_usb). I removed the mt76-usb conf following your comment.

@whitslack

whitslack commented May 1, 2024

> Was the interface up when you ran the command? Some commands only work if the interface is down.

@morrownr: That was indeed the problem. iw dev ____ set power_save off works fine while the interface is down.
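Put together, the sequence that worked is: take the interface down, turn power save off, bring the interface back up, then read the setting back. A minimal sketch follows. The helper name `disable_power_save` and the `IP`/`IW` override variables are hypothetical (they exist only so the sequence can be dry-run with `echo`); the commands themselves are the ones used in this thread and need root.

```shell
#!/bin/sh
# Commands are looked up through variables so the sequence can be
# dry-run with IP=echo IW=echo. By default the real tools are used.
IP=${IP:-ip}
IW=${IW:-iw}

# Hypothetical helper: turn 802.11 power save off on interface $1.
# Stops at the first command that fails.
disable_power_save() {
    "$IP" link set "$1" down &&
    "$IW" dev "$1" set power_save off &&
    "$IP" link set "$1" up &&
    "$IW" dev "$1" get power_save
}

# Usage (as root; the interface name is an example):
#   disable_power_save wifi6
```

Whether the setting survives a later driver reset or a hostapd restart is not established in this thread, so it is worth re-checking with `iw dev <iface> get power_save` after any recovery.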

In related news, I discovered that even my extremely crippled config, although much more stable than WMM-enabled configs, still can exhibit the hang. I am now testing that same config with power saving turned off. Fingers crossed that this is the magic bullet!

Update: Nope. Even with power saving disabled and using the extremely crippled (pre-802.11n) config, it is still possible for the driver to wedge itself into a corner and die. A simple ip link set wifi6 down (wait for a minute while the driver hangs and eventually recovers before the command returns) and then ip link set wifi6 up is enough to get it running again, without even restarting HostAPd.
