[RFC, RFT] mvebu: add 6.6 testing kernel #14912
CI is failing on the DTB move, but that's a given since 6.1 has the DTSes still in the main dts/ dir, and CI is compiling 6.1. Not sure how to have the CI compile 6.6 short of switching kernel version already. I now realise moving the DTSes to marvell/ like upstream does breaks 6.1, but I am not aware of a workaround. Either we move the DTSes to the vendor subdir, and we need to make minimal edits to e.g. Makefiles, or we keep our downstream DTSes in the DTS top dir, but need to edit their includes to point to the marvell/ subdir. Both approaches will break 6.1 support. |
Take a look at this: #14713 |
DNWFM with configdiff, fail is:
symtab.h extent, but zero length... |
Tested and running successfully on GL-MV1000 (initramfs), although with some warnings emitted related to the DTS.
@anomeome I saw that happen too, a |
Can you share the symbols you had to set? So I can add them to the cortexa53 config. |
@Borromini , Yes, that is where I started but no success. |
CRYPTO_SM4_ARM64_CE_CCM |
I will add them to the A53 config, thanks. @robimarko Would you mind reviewing this? I'm curious about your take on the DTS vendor subdir issue as well. Thanks. |
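For anyone hunting for the symbols a target config is missing, here is a rough sketch of one way to compare a target config fragment against the `.config` a build ended up with. It is only an illustration: `parse_config` and `missing_symbols` are made-up helper names, not part of the OpenWrt buildroot, and the sample symbols are just the ones mentioned in this thread.

```python
import re

# Kernel config lines look like "CONFIG_FOO=y" or "# CONFIG_FOO is not set".
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each symbol in a kernel config fragment to its value ("n" if unset)."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            symbols[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            symbols[match.group(1)] = "n"
    return symbols


def missing_symbols(target_cfg, build_cfg):
    """Symbols that appear in the built .config but not in the target config."""
    return sorted(set(parse_config(build_cfg)) - set(parse_config(target_cfg)))


if __name__ == "__main__":
    # Hypothetical inputs; real ones would be read from the target config file
    # and from the .config in the kernel build directory.
    target = "CONFIG_CRC_CCITT=y\n"
    build = (
        "CONFIG_CRC_CCITT=y\n"
        "CONFIG_CRYPTO_AES_ARM64_CE_CCM=y\n"
        "# CONFIG_CRYPTO_SM4_ARM64_CE_CCM is not set\n"
    )
    for symbol in missing_symbols(target, build):
        print(symbol)
```

Symbols reported this way are candidates for the target config (or, as suggested later in the review, for the matching kmod-crypto package).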
From: Felix Fietkau <nbd@nbd.name> | ||
Subject: mvneta: tx queue workaround |
Do we still need this hacky patch for kernel 6.6?
reference 5411
That reference is in the patch description, but it does not answer my question. :-)
Let's get some empirical data. I repeated my tests from 6.1 bump (#12938 (comment)).
Test device:
Linksys WRT1900ACS (mvebu/cortexa9)
OpenWrt SNAPSHOT, r0+25621-231d84e7a8 (f7c732b with this PR on top)
Kernel 6.6.22
netperf 2.7.0-r3
Test client:
x86_64 Fedora 39 workstation
intel I219-V GBit Ethernet directly connected
netperf 2.7.1
Test commands
netperf -H 172.16.101.1 -l 60 -t TCP_MAERTS &
netperf -H 172.16.101.1 -l 60 -t TCP_MAERTS &
netperf -H 172.16.101.1 -l 60 -t TCP_MAERTS &
netperf -H 172.16.101.1 -l 60 -t TCP_MAERTS &
Results with patch 700
Recv     Send     Send
Socket   Socket   Message  Elapsed
Size     Size     Size     Time     Throughput
bytes    bytes    bytes    secs.    10^6bits/sec
131072   16384    16384    60.00    234.15
131072   16384    16384    60.00    234.14
131072   16384    16384    60.00    234.14
131072   16384    16384    60.00    234.14
936.57 (total)
Results without patch 700
Recv     Send     Send
Socket   Socket   Message  Elapsed
Size     Size     Size     Time     Throughput
bytes    bytes    bytes    secs.    10^6bits/sec
131072   16384    16384    60.00    888.79
131072   16384    16384    60.01    15.95
131072   16384    16384    60.01    15.95
131072   16384    16384    60.01    15.92
936.61 (total)
Results without patch 700, and packet steering enabled
config globals 'globals'
option packet_steering '1'
Recv     Send     Send
Socket   Socket   Message  Elapsed
Size     Size     Size     Time     Throughput
bytes    bytes    bytes    secs.    10^6bits/sec
131072   16384    16384    60.00    920.67
131072   16384    16384    60.01    8.01
131072   16384    16384    60.01    8.01
131072   16384    16384    60.01    7.92
944.61 (total)
Results without patch 700, and hardware flow offloading
config defaults
option flow_offloading '1'
option flow_offloading_hw '1'
Recv     Send     Send
Socket   Socket   Message  Elapsed
Size     Size     Size     Time     Throughput
bytes    bytes    bytes    secs.    10^6bits/sec
131072   16384    16384    60.02    0.00
131072   16384    16384    60.00    936.53
131072   16384    16384    60.02    0.01
131072   16384    16384    60.02    0.01
936.55 (total)
Results without patch 700, and software flow offloading
config defaults
option flow_offloading '1'
Recv     Send     Send
Socket   Socket   Message  Elapsed
Size     Size     Size     Time     Throughput
bytes    bytes    bytes    secs.    10^6bits/sec
131072   16384    16384    60.00    339.62
131072   16384    16384    60.01    0.02
131072   16384    16384    60.00    209.14
131072   16384    16384    60.01    387.74
936.52 (total)
(all results repeatable with minor deviations)
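To make the imbalance between the four streams easier to compare, here is a small sketch that sums the per-stream throughput of each run and reports the share taken by the fastest stream. The numbers are copied from the netperf results above; `RESULTS` and `summarize` are names invented for this illustration.

```python
# Per-stream throughput in 10^6 bits/sec, copied from the netperf runs above.
RESULTS = {
    "with patch 700": [234.15, 234.14, 234.14, 234.14],
    "without patch 700": [888.79, 15.95, 15.95, 15.92],
    "without patch 700, packet steering": [920.67, 8.01, 8.01, 7.92],
    "without patch 700, hw flow offloading": [0.00, 936.53, 0.01, 0.01],
    "without patch 700, sw flow offloading": [339.62, 0.02, 209.14, 387.74],
}


def summarize(streams):
    """Return (total throughput, share of the fastest stream) for one run."""
    total = sum(streams)
    return round(total, 2), max(streams) / total


if __name__ == "__main__":
    for name, streams in RESULTS.items():
        total, top_share = summarize(streams)
        print(f"{name}: total {total:.2f}, fastest stream {top_share:.1%}")
```

The totals are nearly identical in every run, so the link is saturated either way. What changes is the distribution: with patch 700 each stream gets about a quarter, while without it a single stream takes most of the bandwidth in all but the software flow offloading run.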
Conclusion
Sadly the answer is "Yes": the patch still seems to be necessary.
Results with sw flow offloading are interesting though 🤔
Btw. build & run tested on WRT1900ACS 😉
Hmm, it would be interesting to see whether it is also broken with a vanilla kernel, but I think the upstream Linux kernel developers would have noticed that. Has anyone tried to reach them? Could this be caused on our side by another hacky patch in this repository?
BTW: Thanks for testing this! :) Appreciated.
Is software flow offload working with 6.6.x?
> Is software flow offload working with 6.6.x?
From what I've gathered, it does not on devices that use the switch for both LAN and WAN, but it does for e.g. Turris Omnia which has a separate WAN connection. It is working on my Turris Omnia - with the usual quirks.
EDIT: But if it IS working on WRT1900ACS, then I don't know...
@@ -42,6 +42,9 @@ CONFIG_CC_HAVE_SHADOW_CALL_STACK=y | |||
CONFIG_CC_HAVE_STACKPROTECTOR_SYSREG=y | |||
CONFIG_CPU_LITTLE_ENDIAN=y | |||
CONFIG_CRC_CCITT=y | |||
CONFIG_CRYPTO_AES_ARM64_CE_CCM=y |
These should probably be moved to their respective kmod-crypto
@Borromini For the DTS path change, see how mediatek solved it |
So far seems to work well on Turris Omnia (cortexa9/armv7). |
@robimarko Thanks for the hint. I have added the if clause, which works for the upstream DTSes, but that would make the buildroot think our OpenWrt specific DTSes were under If that's okay, I'll push those changes to the PR. Haven't had the time to look into the kmod-crypto stuff yet. @anomeome While testing I ran into the symtab.h issue again as well, it looks like some config option is causing it. Using default buildroot options (ie wipe config, just select a single or multiple device(s) and don't touch anything else besides ticking the testing kernel), it compiles just fine. Hopefully that can help troubleshoot. |
@Borromini No, that would break 6.1 kernel, just make |
I compile-tested both 6.1 and 6.6 and it won't. I can link the patch this evening for testing. A DTS dir seems cleaner to me than files-6.1 and files-6.6, IMHO? |
dts was traditionally used when there was no upstream dts directory for that vendor/platform, files is more standard |
Got it. Will do the files-6.1/6.6 split then. |
Yep. Compiled all the wrtpac targets, flashed and running on a rango. I saw a post on the forum where someone said they had to go back to OEM to move forward with their 6.6.x build; I did not see this issue. Edit: re. mamba kernel partition 4MB limit:
PR #14952 by @robimarko adds missing symbols. |
Thanks, the tree did indeed show that lots of other ARM targets already manipulate those symbols in their config files. PR is refreshed, using files-6.{1,6} as requested. @anomeome Not sure how to tackle that (and whether it should be part of this PR). Does the bootloader support compression? Can we compress the kernel? A cursory look at the makefile seems to suggest the kernel is not compressed (or at least not with LZMA).
Run-tested on WRT1900ACS ( |
@Borromini , just a data point that mamba is again close to limit (~72K), I would guess it will last until next major kernel push. |
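As a rough way to keep an eye on that margin, the check boils down to subtracting the image size from the partition size. The sketch below assumes the "4MB" limit mentioned above means 4 MiB, and `headroom` is a hypothetical helper, not something the build system provides.

```python
# Assumption: the mamba kernel partition limit of "4MB" is read as 4 MiB.
KERNEL_PARTITION_LIMIT = 4 * 1024 * 1024


def headroom(image_size, limit=KERNEL_PARTITION_LIMIT):
    """Bytes left in the kernel partition; negative means the image no longer fits."""
    return limit - image_size


if __name__ == "__main__":
    # Hypothetical image size chosen to leave the ~72K margin reported above.
    image_size = KERNEL_PARTITION_LIMIT - 72 * 1024
    left = headroom(image_size)
    print(f"{left} bytes ({left / 1024:.0f} KiB) left")
```

In practice the image size would come from the built kernel image, for example via `os.path.getsize()` on the file in the build output directory.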
I've come to realise that having all changes thrown together in one single commit is a bit annoying if troubleshooting is needed. Any objections if I refresh this PR with the changes split out?
On Mon, Mar 25, 2024 at 01:51:13AM -0700, Borromini wrote:
> Regarding the dmesg output I sent, do you think we have to worry about the error messages related to phylink, if I am correct?
Can you be more specific? I don't see any issues at first sight there.
Here is a selection of messages from my side, some of these are clearly not related to the PR or not problematic, just decided to leave them
[ 0.000000] OF: fdt: Reserved memory: failed to reserve memory for node ***@***.***': base 0x0000000004400000, size 16 MiB
[ 0.556411] debugfs: Directory 'd0060900.xor' with parent 'dmaengine' already present!
[ 0.624646] OF: Bad cell count for ***@***.******@***.******@***.***/partitions
[ 0.657280] mv88e6085 d0032004.mdio-mii:01: switch 0x3400 detected: Marvell 88E6141, revision 0
[ 0.679560] mvneta d0030000.ethernet eth0: Using device tree mac address 94:83:c4:03:ae:ad
[ 0.896580] mv88e6085 d0032004.mdio-mii:01: switch 0x3400 detected: Marvell 88E6141, revision 0
[ 0.927088] hwmon hwmon0: temp1_input not attached to any thermal zone
[ 0.945722] hwmon hwmon1: temp1_input not attached to any thermal zone
[ 0.964033] hwmon hwmon2: temp1_input not attached to any thermal zone
[ 1.269140] mv88e6085 d0032004.mdio-mii:01: OF node ***@***.******@***.******@***.******@***.*** of CPU port 0 lacks the required "phy-mode" property
[ 1.284064] mv88e6085 d0032004.mdio-mii:01: OF node ***@***.******@***.******@***.******@***.*** of CPU port 0 lacks the required "phy-handle", "fixed-link" or "managed" properties
[ 1.301657] mv88e6085 d0032004.mdio-mii:01: Skipping phylink registration for CPU port 0
[ 1.389087] mv88e6085 d0032004.mdio-mii:01 wan (uninitialized): PHY ***@***.******@***.******@***.***!mdio:11] driver [Marvell 88E6341 Family] (irq=54)
[ 1.488768] mv88e6085 d0032004.mdio-mii:01 lan0 (uninitialized): PHY ***@***.******@***.******@***.***!mdio:12] driver [Marvell 88E6341 Family] (irq=55)
[ 1.588747] mv88e6085 d0032004.mdio-mii:01 lan1 (uninitialized): PHY ***@***.******@***.******@***.***!mdio:13] driver [Marvell 88E6341 Family] (irq=56)
[ 1.699922] turris-mox-rwtm firmware:armada-3700-rwtm: Cannot read board information: -5
[ 1.708039] armada-37xx-rwtm-mailbox d00b0000.mailbox: Secure processor not ready
[ 1.715691] turris-mox-rwtm firmware:armada-3700-rwtm: Firmware does not support the GET_RANDOM command
[ 1.725146] turris-mox-rwtm: probe of firmware:armada-3700-rwtm failed with error -5
[ 2.103434] gpio_button_hotplug: loading out-of-tree module taints kernel.
Those warnings are quite clear, that device DTS was not done according to the bindings and they just added validation |
On Mon, Mar 25, 2024 at 02:27:00AM -0700, Robert Marko wrote:
> Those warnings are quite clear, that device DTS was not done according to the bindings and they just added validation
Got it, thanks!
Please no commits without the accompanying description |
Sorry, should be okay now. |
Looks like there are still a few commits without description, e.g.
If those
But that's more or less personal style, feel free to write a descriptive line for every one. |
Thanks, that makes more sense. Done. |
You still have a commit without a description; CI is failing on it as well due to formal issues.
e38d8c9 to 028dfb2
Well that's the sleep deprivation I guess :|. Should be good now. |
Tested the PR in its current state on the GL-MV1000 after rebasing onto main (GCC 13). Works fine.
Build and run tested again on WRT1900ACS and Turris Omnia (both mvebu/cortexa9) with some custom configuration: dual MT7915e WiFi, EP06 LTE modem and WireGuard. Rebased onto 0c96d20. No regressions so far.
@robimarko I think the comments have been addressed, can this be merged? |
DTS paths for 32 bit ARM devices changed with 6.6, move files/ to files-6.1 to prep for kernel 6.6 introduction. Signed-off-by: Stijn Segers <foss@volatilesystems.org>
Copy all mvebu 6.1 specific files, patches and configs to 6.1. Signed-off-by: Stijn Segers <foss@volatilesystems.org>
As of 6.6, all upstream DTSes are moved to their respective vendor subdir. OpenWrt already followed this practice for ARM64, but not yet for 32 bit ARM (Armada 37x/38x). Signed-off-by: Stijn Segers <foss@volatilesystems.org>
000-cpufreq-armada-8k-add-ap807-support.patch was upstreamed. Signed-off-by: Stijn Segers <foss@volatilesystems.org>
Manually refreshed:
* 309-linksys-status-led.patch
* 310-linksys-use-eth0-as-cpu-port.patch
* 320-arm-dts-armada-370-synology-ds213j-mtd-parts.patch
* 701-mvpp2-read-mac-address-from-nvmem.patch
* 902-drivers-mfd-Add-a-driver-for-IEI-WT61P803-PUZZLE-MCU.patch
All other patches automatically refreshed.
Signed-off-by: Stijn Segers <foss@volatilesystems.org>
With 6.6, all DTSes were moved to their vendor subdirectories. ARM64 DTSes already used this scheme, but 32 bit Cortex A9 did not, prior to 6.6. Introduce a kernel version check to keep backward compatibility with 6.1. Suggested-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Stijn Segers <foss@volatilesystems.org>
Add 6.6 testing kernel for mvebu target. Signed-off-by: Stijn Segers <foss@volatilesystems.org>
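The kernel version check described in the commits above comes down to picking the DTS directory from the kernel version and architecture. The sketch below only illustrates that decision; it is not the Makefile logic from the PR, `dts_dir` is a made-up name, and the 6.6 threshold is taken from the commit messages, since only 6.1 and 6.6 matter here.

```python
def dts_dir(kernel_version, arch="arm"):
    """Directory (relative to the kernel tree) holding the Marvell DTS files.

    Illustration only: ARM64 already used the vendor subdir, while 32-bit ARM
    uses it from 6.6 on, per the commit messages above.
    """
    major, minor = (int(part) for part in kernel_version.split(".")[:2])
    base = f"arch/{arch}/boot/dts"
    if arch == "arm64" or (major, minor) >= (6, 6):
        return f"{base}/marvell"
    return base


if __name__ == "__main__":
    for version in ("6.1.82", "6.6.22"):
        print(version, dts_dir(version))
```

Keeping the decision in one place is what lets the same device definitions build against both kernels while 6.1 and 6.6 coexist.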
Thanks! Rebased on top of main and merged! |
Some files (e.g. target/linux/mvebu/patches-6.6/300-mvebu-Mangle-bootloader-s-kernel-arguments.patch) require updating to kernel 6.6.23. |
Yeah, already done. |
This adds 6.6 support for mvebu.
Compile-tested:
Run-tested: