
mt7921u pending tx #410

Open
Simon566 opened this issue Mar 22, 2024 · 5 comments

Comments

@Simon566

  • kernel 6.6.22
  • x86-64, Intel
  • Odroid H3
  • ALFA USB stick

[Do Mär 21 22:48:23 2024] mt7921u 1-1:1.3: Message 00020003 (seq 6) timeout
[Do Mär 21 22:48:24 2024] mt7921u 1-1:1.3: timed out waiting for pending tx
[Do Mär 21 22:48:24 2024] mt7921u 1-1:1.3: HW/SW Version: 0x8a108a10, Build Time: 20240219110958a

[Do Mär 21 22:48:24 2024] mt7921u 1-1:1.3: WM Firmware Version: ____010000, Build Time: 20240219111038
[Fr Mär 22 06:17:31 2024] mt7921u 1-1:1.3: vendor request req:63 off:d02c failed:-110
[Fr Mär 22 06:17:34 2024] mt7921u 1-1:1.3: vendor request req:63 off:d054 failed:-110
[Fr Mär 22 06:17:37 2024] mt7921u 1-1:1.3: vendor request req:63 off:d058 failed:-110
[Fr Mär 22 06:17:41 2024] mt7921u 1-1:1.3: vendor request req:63 off:53b8 failed:-110

... many more of the -110 failures ...

[Fr Mär 22 06:42:20 2024] task:kworker/u8:3 state:D stack:0 pid:1179348 ppid:2 flags:0x00004000
[Fr Mär 22 06:42:20 2024] Workqueue: phy0 mt792x_mac_work [mt792x_lib]
[Fr Mär 22 06:42:20 2024] Call Trace:
[Fr Mär 22 06:42:20 2024]
[Fr Mär 22 06:42:20 2024] __schedule+0x3c4/0xb50
[Fr Mär 22 06:42:20 2024] schedule+0x61/0xe0
[Fr Mär 22 06:42:20 2024] schedule_preempt_disabled+0x18/0x30
[Fr Mär 22 06:42:20 2024] __mutex_lock.constprop.0+0x3b4/0x700
[Fr Mär 22 06:42:20 2024] mt792x_mac_work+0x28/0xb0 [mt792x_lib]
[Fr Mär 22 06:42:20 2024] process_one_work+0x171/0x340
[Fr Mär 22 06:42:20 2024] worker_thread+0x27b/0x3a0
[Fr Mär 22 06:42:20 2024] ? __pfx_worker_thread+0x10/0x10
[Fr Mär 22 06:42:20 2024] kthread+0xf4/0x130
[Fr Mär 22 06:42:20 2024] ? __pfx_kthread+0x10/0x10
[Fr Mär 22 06:42:20 2024] ret_from_fork+0x31/0x50
[Fr Mär 22 06:42:20 2024] ? __pfx_kthread+0x10/0x10
[Fr Mär 22 06:42:20 2024] ret_from_fork_asm+0x1b/0x30
[Fr Mär 22 06:42:20 2024]
[Fr Mär 22 06:42:20 2024] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
[Fr Mär 22 06:42:22 2024] mt7921u 1-1:1.3: vendor request req:63 off:4230 failed:-110
[Fr Mär 22 06:42:25 2024] mt7921u 1-1:1.3: vendor request req:63 off:4230 failed:-110
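
For readers triaging a log like the one above: the `failed:-110` value is a negated Linux errno, and 110 is `ETIMEDOUT`, so each of these lines reports a USB vendor request that timed out. The following is a minimal sketch, not part of the original report, for counting such failures in a saved log. The function name `summarize_vendor_failures` and the small errno table are assumptions made for illustration.

```python
import re
from collections import Counter

# Linux errno values relevant here (asm-generic numbering); extend as needed.
LINUX_ERRNO_NAMES = {110: "ETIMEDOUT"}

# Matches lines such as:
#   mt7921u 1-1:1.3: vendor request req:63 off:d02c failed:-110
VENDOR_REQ_RE = re.compile(
    r"vendor request req:(?P<req>[0-9a-f]+) off:(?P<off>[0-9a-f]+) failed:(?P<err>-?\d+)"
)


def summarize_vendor_failures(lines):
    """Count failed vendor requests, grouped by errno name."""
    counts = Counter()
    for line in lines:
        match = VENDOR_REQ_RE.search(line)
        if match:
            code = abs(int(match.group("err")))
            counts[LINUX_ERRNO_NAMES.get(code, f"errno {code}")] += 1
    return counts


sample = [
    "[Fr Mär 22 06:17:31 2024] mt7921u 1-1:1.3: vendor request req:63 off:d02c failed:-110",
    "[Fr Mär 22 06:17:34 2024] mt7921u 1-1:1.3: vendor request req:63 off:d054 failed:-110",
    "[Fr Mär 22 06:42:20 2024] Workqueue: phy0 mt792x_mac_work [mt792x_lib]",
]
print(summarize_vendor_failures(sample))
```

Feeding the function the lines of a saved `dmesg` dump gives a quick count of how many requests timed out before the hung-task report appeared.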

@Simon566
Author

Here is the next occurrence:
[Sa Mär 23 16:32:33 2024] mt7921u 1-1:1.3: Message 00020003 (seq 7) timeout
[Sa Mär 23 16:32:34 2024] mt7921u 1-1:1.3: timed out waiting for pending tx
[Sa Mär 23 16:32:34 2024] ------------[ cut here ]------------
[Sa Mär 23 16:32:34 2024] WARNING: CPU: 3 PID: 544288 at kernel/kthread.c:659 kthread_park+0x85/0xa0
[Sa Mär 23 16:32:34 2024] Modules linked in: nf_conntrack_netlink cls_matchall sch_ingress sch_cake act_mirred ifb nft_masq nft_nat nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard veth libchacha20poly1305 chacha_x86_64 poly1305_x86_64 curve25519_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel ccm nf_tables pppoe pppox crc32c_generic nfnetlink ppp_generic slhc nvme_fabrics msr bridge binfmt_misc snd_sof_pci_intel_icl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp mt7921u snd_sof snd_hda_codec_hdmi mt792x_usb snd_sof_utils snd_soc_hdac_hda mt7921_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_ext_core ledtrig_audio snd_soc_acpi_intel_match mt792x_lib x86_pkg_temp_thermal snd_soc_acpi intel_powerclamp snd_soc_core mt76_connac_lib kvm_intel snd_compress soundwire_bus mt76_usb i915 snd_hda_intel btusb mt76 kvm nls_ascii btrtl nls_cp437 btintel snd_intel_dspcfg
[Sa Mär 23 16:32:34 2024] vfat btbcm mac80211 snd_intel_sdw_acpi btmtk irqbypass bluetooth fat snd_hda_codec ghash_clmulni_intel 88XXau(OE) sha256_ssse3 libarc4 sha1_ssse3 snd_hda_core drm_buddy drm_display_helper snd_hwdep intel_rapl_msr mei_pxp cfg80211 processor_thermal_device_pci_legacy mei_hdcp aesni_intel sha3_generic jitterentropy_rng cec processor_thermal_device mei_me crypto_simd snd_pcm sha512_ssse3 sha512_generic processor_thermal_rfim cryptd ctr drbg iTCO_wdt snd_timer ansi_cprng processor_thermal_mbox ftdi_sio rc_core ecdh_generic mei intel_pmc_bxt processor_thermal_rapl ecc ttm iTCO_vendor_support usbserial snd nfsd intel_rapl_common rfkill watchdog pcspkr drm_kms_helper wmi_bmof crc16 intel_cstate ee1004 soundcore int340x_thermal_zone i2c_algo_bit intel_soc_dts_iosf intel_pmc_core evdev acpi_pad joydev sg button acpi_tad auth_rpcgss nfs_acl lockd it87(OE) hwmon_vid grace emc2103 coretemp sunrpc 8021q drm garp stp mrp llc dm_mod fuse configfs loop efi_pstore efivarfs ip_tables x_tables autofs4 btrfs xor raid6_pq
[Sa Mär 23 16:32:34 2024] libcrc32c hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid sd_mod nvme ahci libahci nvme_core libata xhci_pci xhci_hcd t10_pi r8169 crc64_rocksoft crc64 crc_t10dif realtek i2c_i801 scsi_mod crct10dif_generic mdio_devres usbcore i2c_smbus libphy crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel intel_lpss_pci scsi_common intel_lpss usb_common idma64 video wmi
[Sa Mär 23 16:32:34 2024] CPU: 3 PID: 544288 Comm: kworker/u8:11 Tainted: G U OE 6.6.22 #5
[Sa Mär 23 16:32:34 2024] Hardware name: HARDKERNEL ODROID-H3/ODROID-H3, BIOS 5.19 07/19/2023
[Sa Mär 23 16:32:34 2024] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
[Sa Mär 23 16:32:34 2024] RIP: 0010:kthread_park+0x85/0xa0
[Sa Mär 23 16:32:34 2024] Code: 00 48 85 c0 74 2d 31 c0 5b 5d c3 cc cc cc cc 0f 0b 48 8b ab 40 0a 00 00 a8 04 74 ac 0f 0b b8 da ff ff ff 5b 5d c3 cc cc cc cc <0f> 0b b8 f0 ff ff ff eb d5 0f 0b eb cf 66 66 2e 0f 1f 84 00 00 00
[Sa Mär 23 16:32:34 2024] RSP: 0018:ffffb24346c5bd58 EFLAGS: 00010202
[Sa Mär 23 16:32:34 2024] RAX: 0000000000000004 RBX: ffff9c39cc2cb100 RCX: 0000000000000100
[Sa Mär 23 16:32:34 2024] RDX: 0000000080000000 RSI: 0000000000000283 RDI: ffff9c39cc2cb100
[Sa Mär 23 16:32:34 2024] RBP: ffff9c3a16415200 R08: ffffffffc051e368 R09: 0000000000000000
[Sa Mär 23 16:32:34 2024] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9c39cd8c20a8
[Sa Mär 23 16:32:34 2024] R13: ffff9c3a168743a8 R14: ffff9c39cd8c20a8 R15: 0000000000000100
[Sa Mär 23 16:32:34 2024] FS: 0000000000000000(0000) GS:ffff9c3d2ff80000(0000) knlGS:0000000000000000
[Sa Mär 23 16:32:34 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sa Mär 23 16:32:34 2024] CR2: 00007f8e9c554000 CR3: 0000000442dae000 CR4: 0000000000350ee0
[Sa Mär 23 16:32:34 2024] Call Trace:
[Sa Mär 23 16:32:34 2024]
[Sa Mär 23 16:32:34 2024] ? kthread_park+0x85/0xa0
[Sa Mär 23 16:32:34 2024] ? __warn+0x81/0x130
[Sa Mär 23 16:32:34 2024] ? kthread_park+0x85/0xa0
[Sa Mär 23 16:32:34 2024] ? report_bug+0x191/0x1c0
[Sa Mär 23 16:32:34 2024] ? handle_bug+0x41/0x70
[Sa Mär 23 16:32:34 2024] ? exc_invalid_op+0x17/0x70
[Sa Mär 23 16:32:34 2024] ? asm_exc_invalid_op+0x1a/0x20
[Sa Mär 23 16:32:34 2024] ? kthread_park+0x85/0xa0
[Sa Mär 23 16:32:34 2024] mt76u_stop_tx+0x216/0x2f0 [mt76_usb]
[Sa Mär 23 16:32:34 2024] ? __pfx_autoremove_wake_function+0x10/0x10
[Sa Mär 23 16:32:34 2024] mt7921u_mac_reset+0x72/0x1b0 [mt7921u]
[Sa Mär 23 16:32:34 2024] mt7921_mac_reset_work+0x97/0x180 [mt7921_common]
[Sa Mär 23 16:32:34 2024] ? __schedule+0x3cc/0xb50
[Sa Mär 23 16:32:34 2024] process_one_work+0x171/0x340
[Sa Mär 23 16:32:34 2024] worker_thread+0x27b/0x3a0
[Sa Mär 23 16:32:34 2024] ? __pfx_worker_thread+0x10/0x10
[Sa Mär 23 16:32:34 2024] kthread+0xf4/0x130
[Sa Mär 23 16:32:34 2024] ? __pfx_kthread+0x10/0x10
[Sa Mär 23 16:32:34 2024] ret_from_fork+0x31/0x50
[Sa Mär 23 16:32:34 2024] ? __pfx_kthread+0x10/0x10
[Sa Mär 23 16:32:34 2024] ret_from_fork_asm+0x1b/0x30
[Sa Mär 23 16:32:34 2024]
[Sa Mär 23 16:32:34 2024] ---[ end trace 0000000000000000 ]---

@fhteagle

fhteagle commented Apr 1, 2024

I am also seeing that in journalctl:

Mar 28 22:22:52 hostname here kernel: mt7921u 1-1.4:1.3: timed out waiting for pending tx

It seems to be associated with the random failure-to-forward-packets issue I have seen over the past few weeks on mt7921u and mt7921k. Fortunately, restarting hostapd mitigates it most of the time, but very rarely I also need to fully reboot the machine the mt7921 is attached to. I will try to come up with concrete steps to replicate the issue.

@morrownr
Owner

morrownr commented Apr 1, 2024

@Simon566 @fhteagle

After seeing the first message I have been keeping an eye open. It is clear that @fhteagle is in AP mode but it is not clear what mode @Simon566 is in. That would be good info to know.

I have seen some errors like this when inadequate or failing power supplies were involved.

@fhteagle

I have an mt7921au-based adapter in AP mode on a Pi4B. It is running RasPiOS with kernel 6.6. I'm not seeing this. Can you go into more detail about:

Seems to be associated with the random failure to forward packets issue I have seen over the past few weeks ...

@Simon566
Author

Simon566 commented Apr 3, 2024

Hi,

It's the same here. I'm running AP mode on x86-64 hardware, running Debian Bookworm with the latest kernel 6.6.23 and hostapd from git mainline. Having a single client doesn't seem to trigger the fault, even when using things like openspeedtest or similar sites. But when I put the AP into my real network with 10 clients, it usually fails within a day.

An rtl8812au AP, also on USB hardware, runs for at least weeks before having any issues.

regards,
Simon

@fhteagle

fhteagle commented Apr 3, 2024

I cannot identify the cause of the issue, nor say how to reproduce it reliably. The best pattern I can describe for now is three general classes of failure, which may or may not be related:

  1. hostapd itself soft-bricks: even a user with root privileges cannot get any information from the hostapd_cli tool
  2. the hostapd process is still alive, STAs are correctly listed in hostapd_cli list_sta, and pings get through to STAs, but no other packet type or port makes it to that STA from any other device on the LAN; other STAs on the same AP may or may not also be affected
  3. the hostapd process is still alive, STAs are correctly listed in hostapd_cli list_sta, and pings and other port traffic get through to STAs from IPs on the LAN, but no pings or other packets whatsoever are passed between any two STAs connected to the same AP
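
The three classes above can be written down as a small decision function, which may help when comparing notes between reporters. This is only a sketch: the `ProbeResult` fields and the `classify` function are hypothetical names, and the observations are assumed to be collected by hand or by a separate script.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    # All fields are hypothetical observations, not output of any real tool.
    hostapd_cli_responds: bool  # hostapd_cli returns information at all
    lan_ping_to_sta: bool       # ping from a LAN host to the STA succeeds
    lan_other_to_sta: bool      # non-ping traffic from the LAN to the STA succeeds
    sta_to_sta: bool            # traffic between two STAs on the same AP succeeds


def classify(p: ProbeResult) -> Optional[int]:
    """Return failure class 1, 2 or 3 as listed above, or None if none matches."""
    if not p.hostapd_cli_responds:
        return 1
    if p.lan_ping_to_sta and not p.lan_other_to_sta:
        return 2
    if p.lan_ping_to_sta and p.lan_other_to_sta and not p.sta_to_sta:
        return 3
    return None
```

For example, `classify(ProbeResult(True, True, False, True))` corresponds to class 2: pings reach the STA, but nothing else does.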

All 3 failure modes can be cleared with ~95%+ success rate by a simple systemd restart of the hostapd service for that adapter. ~5% of the time I have to reboot to clear the failure mode.
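
As a stopgap, the restart can be automated by watching the kernel log for the pending tx message. The sketch below assumes hostapd runs as a systemd unit named `hostapd.service` (per-adapter setups will use a different unit name); `watch` and `restart_hostapd` are illustrative names, not part of hostapd or the driver.

```python
import subprocess
from typing import Callable, Iterable

PENDING_TX_MARKER = "timed out waiting for pending tx"


def restart_hostapd(unit: str = "hostapd.service") -> None:
    # Assumption: hostapd is managed by systemd under this unit name.
    subprocess.run(["systemctl", "restart", unit], check=True)


def watch(lines: Iterable[str], restart: Callable[[], None] = restart_hostapd) -> int:
    """Call restart() for every mt7921 pending-tx log line; return the restart count."""
    restarts = 0
    for line in lines:
        if "mt7921" in line and PENDING_TX_MARKER in line:
            restart()
            restarts += 1
    return restarts
```

One way to use it is to pipe `journalctl -k -f` into a script that calls `watch(sys.stdin)`. A watcher like this only catches failures that are accompanied by the journal message, and it does nothing for the rarer cases that need a reboot.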

The pending tx journal message is most strongly associated with failure modes 2 and 3, but not perfectly correlated. I can have a mode 2 or mode 3 failure with or without the journal message. With the journal message, there will always be either mode 2 or mode 3 failure.

Again, I'm running bleeding-edge hostapd-git, so it is entirely possible that this is the root cause. The behavior is seen on Arch x86_64 and aarch64 (Raspberry Pi 3B and 4B) hosts, using MT7921K (PCI) and MT7921AU (USB) cards. Adapters are bridged, either via systemd/networkd config or via the bridge = option in hostapd.conf, with one or more Ethernet NICs and other adapter devices on each host.
