Network driver on RPi3 B Plus causing hung tasks when working on an NFS mount #2482

Closed
graysky2 opened this issue Mar 30, 2018 · 113 comments

@graysky2

Platform/Distro: RPi 3B+ running Arch ARM (armv7h).
Kernel version: 4.14.31 (b36f4e9)
Firmware version: latest as I write this (raspberrypi/firmware@c14a903)

Bug: Frequent hung-task warnings in dmesg (tasks blocked for more than 120 seconds) when writing files to an NFS mount.

Details: When compiling on an NFS mount, dmesg fills with hung-task warnings like the ones below. Compiling on the micro SD card is fine. I believe the software (distro) on the micro SD card is NOT to blame: if I put the same micro SD card into an RPi3 or RPi2, I can compile without error.

Again, I am using an NFS mounted partition (/scratch) on which to compile, so I'm hypothesizing that these problems are related to the network driver.

...
[ 2455.534291] INFO: task ld:24879 blocked for more than 120 seconds.
[ 2455.538489]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2455.542688] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2455.550990] ld              D    0 24879  24804 0x00000000
[ 2455.555379] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2455.559662] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2455.563990] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2455.572326] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2455.580865] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2455.589272] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2455.597837] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2455.606295] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2455.610675] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2455.614999] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2547.695051] nfs: server ease not responding, still trying
[ 2548.735626] nfs: server ease not responding, still trying
[ 2548.768826] nfs: server ease OK
[ 2548.796748] nfs: server ease OK
[ 2701.296329] INFO: task ld:24879 blocked for more than 120 seconds.
[ 2701.300214]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2701.304061] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2701.311642] ld              D    0 24879  24804 0x00000000
[ 2701.315536] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2701.319458] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2701.323355] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2701.330878] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2701.338447] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2701.345916] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2701.353469] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2701.360953] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2701.364740] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2701.368593] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2772.976750] nfs: server ease not responding, still trying
[ 2774.331264] nfs: server ease OK
[ 2947.057892] INFO: task ld:24879 blocked for more than 120 seconds.
[ 2947.061907]       Tainted: G         C      4.14.31-1-ARCH #1
[ 2947.066031] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2947.074107] ld              D    0 24879  24804 0x00000000
[ 2947.078244] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 2947.081483] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 2947.084348] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 2947.090033] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 2947.095898] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 2947.101751] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 2947.107513] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 2947.113352] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 2947.116350] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 2947.119289] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 2998.258064] nfs: server ease not responding, still trying
[ 2999.352463] nfs: server ease OK
[ 3192.819075] INFO: task ld:24879 blocked for more than 120 seconds.
[ 3192.823185]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3192.827330] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3192.835447] ld              D    0 24879  24804 0x00000000
[ 3192.839604] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3192.842832] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3192.845750] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3192.851476] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3192.857318] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3192.863126] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3192.868837] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3192.874594] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3192.877558] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3192.880466] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3223.539141] nfs: server ease not responding, still trying
[ 3224.579687] nfs: server ease not responding, still trying
[ 3224.612015] nfs: server ease OK
[ 3224.626000] nfs: server ease OK
[ 3438.580109] INFO: task objcopy:24916 blocked for more than 120 seconds.
[ 3438.583905]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3438.587697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3438.595231] objcopy         D    0 24916  24912 0x00000000
[ 3438.599109] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3438.603019] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3438.606896] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3438.614435] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3438.622018] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3438.629666] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3438.637259] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3438.644894] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3438.648704] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3438.652599] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3448.820081] nfs: server ease not responding, still trying
[ 3450.148878] nfs: server ease OK
[ 3674.100906] nfs: server ease not responding, still trying
[ 3675.141506] nfs: server ease not responding, still trying
[ 3675.174279] nfs: server ease OK
[ 3675.202048] nfs: server ease OK
[ 3807.221430] INFO: task objcopy:24916 blocked for more than 120 seconds.
[ 3807.225253]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3807.229007] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3807.236459] objcopy         D    0 24916  24912 0x00000000
[ 3807.240428] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3807.244393] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3807.248202] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3807.255540] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3807.263030] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3807.270494] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3807.277992] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3807.285364] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3807.289292] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3807.293169] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3899.381659] nfs: server ease not responding, still trying
[ 3900.422241] nfs: server ease not responding, still trying
[ 3900.461112] nfs: server ease OK
[ 3900.474540] nfs: server ease OK
[ 4011.372575] nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based  firewall rule not found. Use the iptables CT target to attach helpers instead.
[ 4052.982250] INFO: task as:25088 blocked for more than 120 seconds.
[ 4052.986324]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4052.990389] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4052.998504] as              D    0 25088  25086 0x00000000
[ 4053.002785] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4053.006065] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4053.008960] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4053.014564] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4053.020330] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4053.026110] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4053.031705] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4053.037527] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4053.040507] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4053.043431] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4134.902727] nfs: server ease not responding, still trying
[ 4135.997194] nfs: server ease OK
[ 4529.145918] nfs: server ease not responding, still trying
[ 4529.145923] nfs: server ease not responding, still trying
[ 4529.145940] nfs: server ease not responding, still trying
[ 4529.145978] nfs: server ease not responding, still trying
[ 4529.146011] nfs: server ease not responding, still trying
[ 4529.146028] nfs: server ease not responding, still trying
[ 4529.146044] nfs: server ease not responding, still trying
[ 4538.105971] nfs: server ease not responding, still trying
[ 4538.109131] nfs: server ease not responding, still trying
[ 4544.506128] INFO: task gcc:2854 blocked for more than 120 seconds.
[ 4544.509193]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4544.512157] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4544.517957] gcc             D    0  2854   2852 0x00000000
[ 4544.520871] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4544.523830] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4544.526762] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4544.530980] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4544.534883] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4544.538880] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4544.542873] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4544.546949] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4544.549173] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4544.551445] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4571.406855] nfs: server ease OK
[ 4571.406996] nfs: server ease OK
[ 4571.407031] nfs: server ease OK
[ 4571.407691] nfs: server ease OK
[ 4571.407701] nfs: server ease OK
[ 4571.410844] nfs: server ease OK
[ 4571.410877] nfs: server ease OK
[ 4571.411761] nfs: server ease OK
[ 4571.411810] nfs: server ease OK
[ 4790.267644] INFO: task ld:7630 blocked for more than 120 seconds.
[ 4790.270597]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4790.273588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4790.279563] ld              D    0  7630   7628 0x00000000
[ 4790.282558] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4790.285531] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4790.288488] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4790.294136] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4790.299855] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4790.305556] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4790.311366] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4790.317112] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4790.320380] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4790.323699] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4790.330500] INFO: task ld:7636 blocked for more than 120 seconds.
[ 4790.334181]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4790.338097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4790.346223] ld              D    0  7636   7633 0x00000000
[ 4790.350304] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4790.354463] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4790.358593] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4790.366494] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4790.374744] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4790.383021] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4790.391236] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4790.399371] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4790.403607] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4790.407831] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
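
For anyone comparing logs like the one above, here is a minimal sketch (not part of the original report) that tallies which commands tripped the hung-task detector. `hung_task_summary` is a hypothetical helper name; the only assumption is the `INFO: task NAME:PID blocked for more than` line format visible in the log.

```shell
#!/bin/sh
# Count hung-task reports per command name in kernel log text read from stdin.
# Matches lines of the form:
#   [ 2455.534291] INFO: task ld:24879 blocked for more than 120 seconds.
# Prints one "NAME COUNT" line per command, sorted by name.
hung_task_summary() {
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than.*/\1/p' |
        sort |
        uniq -c |
        awk '{ c = $1; sub(/^ *[0-9]+ /, ""); print $0, c }'
}
```

Typical use would be `dmesg | hung_task_summary`. The counts say which processes were stuck, not why; the stack traces are still needed for that.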
@graysky2 graysky2 changed the title Network driver on RPi3 BPlus causing hung tasks when working on an NFS mount Network driver on RPi3 B Plus causing hung tasks when working on an NFS mount Mar 30, 2018
@graysky2
Author

graysky2 commented Mar 30, 2018

An easy way to trigger this bug (if you don't want to try compiling the kernel package) is to simply use dd to write out from /dev/zero to the NFS mount. For example on my RPi3 B+:

# mount ease:/scratch /scratch-nfs
% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress
964689920 bytes (965 MB, 920 MiB) copied, 149 s, 6.5 MB/s

<<< it froze up after about 965 MB written >>>
<<< In dmesg I get another server not responding error >>>

[ 5112.824818] nfs: server ease not responding, still trying
[ 5149.707808] nfs: server ease OK

Now, if I move the micro SD card into an RPi 2 I have lying around, using the same network cable and the same power supply, and repeat the commands, everything works as expected. I think that helps to rule out the NFS server, network hardware, etc. as potentially to blame.

# mount ease:/scratch /scratch-nfs
% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress
4194304000 bytes (4.2 GB, 3.9 GiB) copied, 346 s, 12.1 MB/s
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB, 3.9 GiB) copied, 357.595 s, 11.7 MB/s
dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress  0.00s user 24.47s system 5% cpu 8:03.99 total
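
The dd reproduction above could be wrapped in a small function so the same test can be repeated on different boards. This is only a sketch: `nfs_write_test` and its argument order are hypothetical, and the defaults simply mirror the `bs=4M count=1000` used above (the `M` suffix assumes GNU dd).

```shell
#!/bin/sh
# Write COUNT blocks of BS bytes of zeros to a scratch file under DIR,
# then delete the scratch file. Returns dd's exit status.
#   $1 = directory to write into (e.g. the NFS mount point)
#   $2 = number of blocks (default 1000)
#   $3 = block size       (default 4M)
nfs_write_test() {
    dir=$1
    count=${2:-1000}
    bs=${3:-4M}
    target="$dir/fill.$$"   # include the shell PID to avoid clobbering an existing file
    dd if=/dev/zero of="$target" bs="$bs" count="$count"
    status=$?
    rm -f -- "$target"      # remove by full path, independent of the current directory
    return "$status"
}
```

For example, `nfs_write_test /scratch-nfs` repeats the test above. If the write stalls the way it did here, the function will block just like plain dd does; watching dmesg in a second terminal is still the way to see the "not responding" messages.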

@pelwell
Contributor

pelwell commented Mar 30, 2018

Does disabling Energy Efficient Ethernet make a difference? Add dtparam=eee=off to config.txt and reboot.

But before trying that you can confirm whether EEE is active using ethtool --show-eee eth0.
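
If it helps to script the check, the status line can be pulled out of the ethtool output. A rough sketch, assuming the output contains a line starting with `EEE status:` (as current ethtool versions print); `eee_status` and `eee_enabled` are hypothetical helper names.

```shell
#!/bin/sh
# Print the value of the "EEE status:" line from `ethtool --show-eee` output on stdin,
# e.g. "enabled - active" or "disabled".
eee_status() {
    sed -n 's/^[[:space:]]*EEE status:[[:space:]]*//p'
}

# Exit status 0 if the reported EEE status starts with "enabled", 1 otherwise.
eee_enabled() {
    case $(eee_status) in
        enabled*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage would be along the lines of `ethtool --show-eee eth0 | eee_enabled && echo "EEE is on"`. ethtool also has a `--set-eee eth0 eee off` option for a runtime change without a reboot, though whether this driver honours it is an assumption that would need checking; the `dtparam=eee=off` route is the one suggested here.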

@graysky2
Author

graysky2 commented Mar 30, 2018

Great suggestion, @pelwell. I got some very encouraging results using the dd test, which floods the link with a steady stream of writes. It "passed", meaning no write timeouts and no "server not responding" messages in dmesg. I am now compiling the same package that consistently produced the errors and will post back with those results.

Before:

# ethtool --show-eee eth0
EEE Settings for eth0:
	EEE status: enabled - active
...

After:

# ethtool --show-eee eth0
EEE Settings for eth0:
	EEE status: disabled
...

The test with dd:

# mount ease:/scratch /scratch-nfs

% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress && rm fill
4169138176 bytes (4.2 GB, 3.9 GiB) copied, 97 s, 42.9 MB/s 
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB, 3.9 GiB) copied, 100.665 s, 41.7 MB/s
dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=1000 status=progress  0.00s user 13.79s system 13% cpu 1:40.68 total

% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=2000 status=progress && rm fill
8380219392 bytes (8.4 GB, 7.8 GiB) copied, 198 s, 42.3 MB/s
2000+0 records in
2000+0 records out
8388608000 bytes (8.4 GB, 7.8 GiB) copied, 201.245 s, 41.7 MB/s
dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=2000 status=progress  0.00s user 27.98s system 13% cpu 3:21.25 total

% dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=2000 status=progress && rm fill
8380219392 bytes (8.4 GB, 7.8 GiB) copied, 198 s, 42.3 MB/s
2000+0 records in
2000+0 records out
8388608000 bytes (8.4 GB, 7.8 GiB) copied, 201.052 s, 41.7 MB/s
dd if=/dev/zero of=/scratch-nfs/fill bs=4M count=2000 status=progress  0.00s user 28.23s system 13% cpu 3:22.19 total

Unfortunately, when compiling, which as you can recognize writes out data much less frequently than dd does, I am experiencing the same errors:

[ 3315.685473] INFO: task gzip:29769 blocked for more than 120 seconds.
[ 3315.685636]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3315.685767] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3315.685955] gzip            D    0 29769  29767 0x00000000
[ 3315.686127] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3315.686299] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3315.686473] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3315.686663] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3315.686875] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3315.687121] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3315.687349] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3315.687529] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3315.691478] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3315.695540] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3402.725251] nfs: server ease not responding, still trying
[ 3403.765783] nfs: server ease not responding, still trying
[ 3404.089089] nfs: server ease OK
[ 3404.089297] nfs: server ease OK
[ 3899.364008] nfs: server ease not responding, still trying
[ 3899.364013] nfs: server ease not responding, still trying
[ 3899.364028] nfs: server ease not responding, still trying
[ 3899.364060] nfs: server ease not responding, still trying
[ 3899.364071] nfs: server ease not responding, still trying
[ 3899.364076] nfs: server ease not responding, still trying
[ 3930.084023] INFO: task ld:13616 blocked for more than 120 seconds.
[ 3930.087086]       Tainted: G         C      4.14.31-1-ARCH #1
[ 3930.090229] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3930.096312] ld              D    0 13616  13612 0x00000000
[ 3930.099422] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 3930.102523] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 3930.105566] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 3930.111351] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 3930.117264] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 3930.123049] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 3930.129036] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 3930.135044] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 3930.138283] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 3930.141618] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 3941.625186] nfs: server ease OK
[ 3941.625295] nfs: server ease OK
[ 3941.625441] nfs: server ease OK
[ 3941.625829] nfs: server ease OK
[ 3941.635332] nfs: server ease OK
[ 3941.635549] nfs: server ease OK
[ 4170.727338] nfs: server ease not responding, still trying
[ 4170.727343] nfs: server ease not responding, still trying
[ 4170.727356] nfs: server ease not responding, still trying
[ 4170.727395] nfs: server ease not responding, still trying
[ 4170.727413] nfs: server ease not responding, still trying
[ 4170.727428] nfs: server ease not responding, still trying
[ 4170.727441] nfs: server ease not responding, still trying
[ 4170.727455] nfs: server ease not responding, still trying
[ 4170.727461] nfs: server ease not responding, still trying
[ 4170.727467] nfs: server ease not responding, still trying
[ 4175.847588] INFO: task gzip:22430 blocked for more than 120 seconds.
[ 4175.849590]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4175.851594] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4175.855516] gzip            D    0 22430  22391 0x00000000
[ 4175.857549] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4175.859576] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4175.861543] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4175.865280] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4175.869533] [<80230d0c>] (__filemap_fdatawait_range) from [<80230d70>] (filemap_fdatawait_range+0x18/0x28)
[ 4175.874352] [<80230d70>] (filemap_fdatawait_range) from [<802330f4>] (filemap_write_and_wait+0x58/0x7c)
[ 4175.879764] [<802330f4>] (filemap_write_and_wait) from [<803ea028>] (nfs_wb_all+0x14/0x15c)
[ 4175.885618] [<803ea028>] (nfs_wb_all) from [<803dd96c>] (nfs_setattr+0x280/0x2a4)
[ 4175.892223] [<803dd96c>] (nfs_setattr) from [<802bf8d4>] (notify_change+0x17c/0x410)
[ 4175.899511] [<802bf8d4>] (notify_change) from [<802d62fc>] (utimes_common+0xbc/0x188)
[ 4175.907605] [<802d62fc>] (utimes_common) from [<802d64c8>] (do_utimes+0x100/0x144)
[ 4175.916359] [<802d64c8>] (do_utimes) from [<802d6548>] (SyS_utimensat+0x3c/0xb0)
[ 4175.925462] [<802d6548>] (SyS_utimensat) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4175.934598] INFO: task cp:22444 blocked for more than 120 seconds.
[ 4175.939378]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4175.944179] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4175.953867] cp              D    0 22444  22422 0x00000000
[ 4175.958731] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4175.963485] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4175.968194] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4175.977469] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4175.986837] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[ 4175.996147] [<80233190>] (filemap_write_and_wait_range) from [<803db1c4>] (nfs_file_fsync+0x30/0x280)
[ 4176.005282] [<803db1c4>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[ 4176.014199] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[ 4176.018723] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[ 4176.023212] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4176.031736] INFO: task gzip:22446 blocked for more than 120 seconds.
[ 4176.036150]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4176.040488] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4176.048923] gzip            D    0 22446  22413 0x00000000
[ 4176.053120] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4176.057319] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4176.061492] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4176.069486] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4176.077824] [<80230d0c>] (__filemap_fdatawait_range) from [<80230d70>] (filemap_fdatawait_range+0x18/0x28)
[ 4176.086158] [<80230d70>] (filemap_fdatawait_range) from [<802330f4>] (filemap_write_and_wait+0x58/0x7c)
[ 4176.094434] [<802330f4>] (filemap_write_and_wait) from [<803ea028>] (nfs_wb_all+0x14/0x15c)
[ 4176.102722] [<803ea028>] (nfs_wb_all) from [<803dd96c>] (nfs_setattr+0x280/0x2a4)
[ 4176.111286] [<803dd96c>] (nfs_setattr) from [<802bf8d4>] (notify_change+0x17c/0x410)
[ 4176.119906] [<802bf8d4>] (notify_change) from [<802d62fc>] (utimes_common+0xbc/0x188)
[ 4176.128677] [<802d62fc>] (utimes_common) from [<802d64c8>] (do_utimes+0x100/0x144)
[ 4176.137604] [<802d64c8>] (do_utimes) from [<802d6548>] (SyS_utimensat+0x3c/0xb0)
[ 4176.146598] [<802d6548>] (SyS_utimensat) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4176.155652] INFO: task gzip:22448 blocked for more than 120 seconds.
[ 4176.160374]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4176.165034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4176.174526] gzip            D    0 22448  22399 0x00000000
[ 4176.179330] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4176.183995] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4176.188703] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4176.197969] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4176.207204] [<80230d0c>] (__filemap_fdatawait_range) from [<80230d70>] (filemap_fdatawait_range+0x18/0x28)
[ 4176.216304] [<80230d70>] (filemap_fdatawait_range) from [<802330f4>] (filemap_write_and_wait+0x58/0x7c)
[ 4176.225376] [<802330f4>] (filemap_write_and_wait) from [<803ea028>] (nfs_wb_all+0x14/0x15c)
[ 4176.234297] [<803ea028>] (nfs_wb_all) from [<803dd96c>] (nfs_setattr+0x280/0x2a4)
[ 4176.243314] [<803dd96c>] (nfs_setattr) from [<802bf8d4>] (notify_change+0x17c/0x410)
[ 4176.252319] [<802bf8d4>] (notify_change) from [<802d62fc>] (utimes_common+0xbc/0x188)
[ 4176.261418] [<802d62fc>] (utimes_common) from [<802d64c8>] (do_utimes+0x100/0x144)
[ 4176.270386] [<802d64c8>] (do_utimes) from [<802d6548>] (SyS_utimensat+0x3c/0xb0)
[ 4176.279426] [<802d6548>] (SyS_utimensat) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4176.288484] INFO: task gzip:22449 blocked for more than 120 seconds.
[ 4176.293202]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4176.297913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4176.307420] gzip            D    0 22449  22402 0x00000000
[ 4176.312234] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4176.316976] [<80a88018>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[ 4176.321712] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[ 4176.330930] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[ 4176.340209] [<80230d0c>] (__filemap_fdatawait_range) from [<80230d70>] (filemap_fdatawait_range+0x18/0x28)
[ 4176.349341] [<80230d70>] (filemap_fdatawait_range) from [<802330f4>] (filemap_write_and_wait+0x58/0x7c)
[ 4176.358392] [<802330f4>] (filemap_write_and_wait) from [<803ea028>] (nfs_wb_all+0x14/0x15c)
[ 4176.367284] [<803ea028>] (nfs_wb_all) from [<803dd96c>] (nfs_setattr+0x280/0x2a4)
[ 4176.376299] [<803dd96c>] (nfs_setattr) from [<802bf8d4>] (notify_change+0x17c/0x410)
[ 4176.385274] [<802bf8d4>] (notify_change) from [<802d62fc>] (utimes_common+0xbc/0x188)
[ 4176.394337] [<802d62fc>] (utimes_common) from [<802d64c8>] (do_utimes+0x100/0x144)
[ 4176.403276] [<802d64c8>] (do_utimes) from [<802d6548>] (SyS_utimensat+0x3c/0xb0)
[ 4176.412262] [<802d6548>] (SyS_utimensat) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4176.421295] INFO: task gzip:22452 blocked for more than 120 seconds.
[ 4176.425993]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4176.430693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4176.440149] gzip            D    0 22452  22418 0x00000000
[ 4176.444909] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4176.449585] [<80a88018>] (schedule) from [<80a8b3d4>] (rwsem_down_write_failed+0x12c/0x278)
[ 4176.458924] [<80a8b3d4>] (rwsem_down_write_failed) from [<80a8a6f0>] (down_write+0x58/0x60)
[ 4176.468288] [<80a8a6f0>] (down_write) from [<802afc48>] (path_openat+0x3b0/0x1150)
[ 4176.477766] [<802afc48>] (path_openat) from [<802b1954>] (do_filp_open+0x6c/0xdc)
[ 4176.487122] [<802b1954>] (do_filp_open) from [<8029edc4>] (do_sys_open+0x168/0x20c)
[ 4176.496594] [<8029edc4>] (do_sys_open) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4176.505990] INFO: task mkdir:22457 blocked for more than 120 seconds.
[ 4176.510857]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4176.515599] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4176.525022] mkdir           D    0 22457  22453 0x00000000
[ 4176.529774] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4176.534358] [<80a88018>] (schedule) from [<80a8b3d4>] (rwsem_down_write_failed+0x12c/0x278)
[ 4176.543475] [<80a8b3d4>] (rwsem_down_write_failed) from [<80a8a6f0>] (down_write+0x58/0x60)
[ 4176.552568] [<80a8a6f0>] (down_write) from [<802b1118>] (filename_create+0x70/0x14c)
[ 4176.561851] [<802b1118>] (filename_create) from [<802b1d30>] (SyS_mkdirat+0x4c/0xec)
[ 4176.571295] [<802b1d30>] (SyS_mkdirat) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4176.580876] INFO: task mkdir:22458 blocked for more than 120 seconds.
[ 4176.585741]       Tainted: G         C      4.14.31-1-ARCH #1
[ 4176.590628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4176.600118] mkdir           D    0 22458  22450 0x00000000
[ 4176.604882] [<80a87848>] (__schedule) from [<80a88018>] (schedule+0x3c/0xa0)
[ 4176.609664] [<80a88018>] (schedule) from [<80a8b3d4>] (rwsem_down_write_failed+0x12c/0x278)
[ 4176.618855] [<80a8b3d4>] (rwsem_down_write_failed) from [<80a8a6f0>] (down_write+0x58/0x60)
[ 4176.628053] [<80a8a6f0>] (down_write) from [<802b1118>] (filename_create+0x70/0x14c)
[ 4176.637238] [<802b1118>] (filename_create) from [<802b1d30>] (SyS_mkdirat+0x4c/0xec)
[ 4176.646634] [<802b1d30>] (SyS_mkdirat) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[ 4211.688544] nfs: server ease not responding, still trying
[ 4212.989190] nfs: server ease OK
[ 4212.989336] nfs: server ease OK
[ 4212.989372] nfs: server ease OK
[ 4212.992652] nfs: server ease OK
[ 4213.002084] nfs: server ease OK
[ 4213.002311] nfs: server ease OK
[ 4213.002416] nfs: server ease OK
[ 4213.017966] nfs: server ease OK
[ 4213.018012] nfs: server ease OK
[ 4213.018632] nfs: server ease OK
[ 4213.020006] nfs: server ease OK
[ 4401.133010] nfs: server ease not responding, still trying
[ 4401.133014] nfs: server ease not responding, still trying
[ 4401.133030] nfs: server ease not responding, still trying
[ 4401.133067] nfs: server ease not responding, still trying
[ 4401.133110] nfs: server ease not responding, still trying
[ 4401.133120] nfs: server ease not responding, still trying
[ 4401.133124] nfs: server ease not responding, still trying
[ 4401.133139] nfs: server ease not responding, still trying
[ 4401.133156] nfs: server ease not responding, still trying
[ 4401.133171] nfs: server ease not responding, still trying
[ 4401.133187] nfs: server ease not responding, still trying
[ 4401.133202] nfs: server ease not responding, still trying
[ 4401.133233] nfs: server ease not responding, still trying
[ 4401.133245] nfs: server ease not responding, still trying
[ 4401.133251] nfs: server ease not responding, still trying
[ 4443.397196] nfs: server ease OK
[ 4443.397213] nfs: server ease OK
[ 4443.397291] nfs: server ease OK
[ 4443.397316] nfs: server ease OK
[ 4443.397343] nfs: server ease OK
[ 4443.397410] nfs: server ease OK
[ 4443.397505] nfs: server ease OK
[ 4443.397580] nfs: server ease OK
[ 4443.397605] nfs: server ease OK
[ 4443.397714] nfs: server ease OK
[ 4443.399097] nfs: server ease OK
[ 4443.405096] nfs: server ease OK
[ 4443.405772] nfs: server ease OK
[ 4443.406117] nfs: server ease OK
[ 4443.406398] nfs: server ease OK
[ 4667.377155] nfs: server ease not responding, still trying
[ 4668.417708] nfs: server ease not responding, still trying
[ 4668.700017] nfs: server ease OK
[ 4668.700524] nfs: server ease OK
[ 4856.819062] nfs: server ease not responding, still trying
[ 4856.819067] nfs: server ease not responding, still trying
[ 4856.819082] nfs: server ease not responding, still trying
[ 4856.819130] nfs: server ease not responding, still trying
[ 4856.819135] nfs: server ease not responding, still trying
[ 4856.819142] nfs: server ease not responding, still trying
[ 4856.819154] nfs: server ease not responding, still trying
[ 4856.819174] nfs: server ease not responding, still trying
[ 4856.819188] nfs: server ease not responding, still trying
[ 4856.819209] nfs: server ease not responding, still trying
[ 4856.819216] nfs: server ease not responding, still trying
[ 4893.959982] nfs: server ease OK
[ 4893.960172] nfs: server ease OK
[ 4893.960210] nfs: server ease OK
[ 4893.960311] nfs: server ease OK
[ 4893.960640] nfs: server ease OK
[ 4893.960770] nfs: server ease OK
[ 4893.960780] nfs: server ease OK
[ 4893.961280] nfs: server ease OK
[ 4893.966452] nfs: server ease OK
[ 4893.967131] nfs: server ease OK
[ 4893.969369] nfs: server ease OK
[ 5123.060914] nfs: server ease not responding, still trying
[ 5124.101425] nfs: server ease not responding, still trying
[ 5124.376882] nfs: server ease OK
[ 5124.381100] nfs: server ease OK
[ 5353.461931] nfs: server ease not responding, still trying
[ 5354.784753] nfs: server ease OK
[ 5588.982673] nfs: server ease not responding, still trying
[ 5590.077559] nfs: server ease OK
[ 5814.263180] nfs: server ease not responding, still trying
[ 5815.303698] nfs: server ease not responding, still trying
[ 5815.334003] nfs: server ease OK
[ 5815.360538] nfs: server ease OK
[ 6044.663615] nfs: server ease not responding, still trying
[ 6045.721789] nfs: server ease OK
[ 6285.305546] nfs: server ease not responding, still trying
[ 6286.346054] nfs: server ease not responding, still trying
[ 6286.376999] nfs: server ease OK
[ 6286.403761] nfs: server ease OK
[ 6510.587277] nfs: server ease not responding, still trying
[ 6511.627865] nfs: server ease not responding, still trying
[ 6511.674761] nfs: server ease OK
[ 6511.686188] nfs: server ease OK
[ 6735.868562] nfs: server ease not responding, still trying
[ 6736.909076] nfs: server ease not responding, still trying
[ 6736.940771] nfs: server ease OK
[ 6736.967038] nfs: server ease OK
[ 6940.669438] nfs: server ease not responding, still trying
[ 6977.551872] nfs: server ease OK
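The stall lengths in a log like the one above can be measured rather than eyeballed. Below is a minimal sketch that pairs each "not responding" line with the next "OK" line and reports the gap. The function name `nfs_stalls` is made up for illustration, and treating a burst of per-request messages as one stall is a simplifying assumption.

```python
import re

# Matches dmesg lines such as:
# [ 4211.688544] nfs: server ease not responding, still trying
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+nfs: server (\S+) (not responding|OK)")

def nfs_stalls(lines):
    """Return (start, end, duration) tuples, in seconds since boot."""
    stalls = []
    start = None  # timestamp of the first "not responding" line of a burst
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts, state = float(m.group(1)), m.group(3)
        if state == "not responding":
            if start is None:
                start = ts
        elif start is not None:
            # First "OK" after a burst closes the stall; later "OK"s are ignored.
            stalls.append((start, ts, round(ts - start, 6)))
            start = None
    return stalls

# A few lines copied from the log above.
sample = [
    "[ 4211.688544] nfs: server ease not responding, still trying",
    "[ 4212.989190] nfs: server ease OK",
    "[ 4212.989336] nfs: server ease OK",
    "[ 4401.133010] nfs: server ease not responding, still trying",
    "[ 4401.133014] nfs: server ease not responding, still trying",
    "[ 4443.397196] nfs: server ease OK",
]
for start, end, duration in nfs_stalls(sample):
    print(f"stall from {start:.3f} to {end:.3f}: {duration:.3f} s")
```

Feeding it the full `dmesg` output would give one row per stall, which makes it easier to compare runs with different settings.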

@graysky2
Author

graysky2 commented Mar 31, 2018

I combined a few replies into one (above) and tried to make it a bit more concise. TL;DR version is that disabling EEE does not help.

@pelwell
Contributor

pelwell commented Mar 31, 2018

If possible, and if it isn't already on, can you enable flow control on the switch port connected to the Pi?

@graysky2
Author

@pelwell - All the wired connections go through an unmanaged switch. No settings to tweak :/

@mkreisl

mkreisl commented Mar 31, 2018

Bug: Frequent kernel oops due to blocked tasks when writing files to NFS mount.

I had similar issues on a Samba mount, but I can't run the tests again at the moment because I sent my Pi3B+ back.

IMO the current revision of the Pi3B+ has serious hardware issues, and I don't believe they can be solved in software. (I was never able to play a video for longer than 15 minutes without a Kodi crash, kernel oops, or freeze.)

@pelwell and co:
Which Pi3B+ (revision) are you currently using? Parts from the 0-series, or parts from the current production line that customers are using now?

I still can't believe that you guys never had such issues before

@JamesH65
Contributor

JamesH65 commented Mar 31, 2018 via email

@Knoppix1

Knoppix1 commented Apr 1, 2018

Personally, I'm back on the Pi 2B ...
(It boots faster with the same SD card...)

@mkreisl when your Pi comes back, if it works normally I'll send mine in too

@zmartell

zmartell commented Apr 4, 2018

+1 I am noticing this issue as well when reading from a Samba mount. Brand new RPi 3B+.

@graysky2
Author

graysky2 commented Apr 4, 2018

@pelwell - From @popcornmix's advice in #2442, I built:

I automated that dd test I described above in a simple script that repeats the writing out of 1G worth of zero filled file over an NFS share 32 times. I then used histogram.py to compute the stats.
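As an illustration of that kind of harness (not the script actually used here), a minimal sketch of a repeat-and-time loop follows. The names `time_runs` and `DD_CMD` and the output file name are hypothetical; only the `/scratch` mount point comes from this issue.

```python
import subprocess
import sys
import time

def time_runs(cmd, repeats):
    """Run cmd `repeats` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.monotonic()
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
        durations.append(time.monotonic() - start)
    return durations

# Hypothetical write test: 1024 blocks of 1 MiB of zeros onto the NFS share,
# with conv=fsync so dd only returns once the data has been flushed.
DD_CMD = ["dd", "if=/dev/zero", "of=/scratch/ddtest.bin",
          "bs=1M", "count=1024", "conv=fsync"]

# Harmless demo: time a no-op interpreter start instead of the real write.
# Swap in time_runs(DD_CMD, 32) on the Pi to reproduce the dd test.
for seconds in time_runs([sys.executable, "-c", "pass"], 3):
    print(f"{seconds:.2f}")
```

Printing one duration per line gives a file that can be piped straight into a stats tool.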

With the dtparam=eee=off parameter set in /boot/config.txt I got some consistent results:

% histogram.py -p < results_no_eee.csv
# NumSamples = 32; Min = 25.46; Max = 25.75
# Mean = 25.687864; Variance = 0.002705; SD = 0.052009; Median 25.693114

When I removed that line (reverting to the default state of it being on), 1 of the 32 runs was really long:

% histogram.py -p < results.csv
# NumSamples = 36; Min = 25.34; Max = 139.44
# Mean = 28.763650; Variance = 350.005030; SD = 18.708421; Median 25.599488
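For anyone without histogram.py to hand, summary lines of this shape can be reproduced with the standard library. This is a minimal sketch; the use of population variance is an assumption about how histogram.py computes it, and the timings below are made-up placeholders rather than data from these runs.

```python
import statistics

def summarize(samples):
    """Return the summary statistics printed in the header lines above."""
    mean = statistics.fmean(samples)
    # Assumption: population variance (divide by N), not sample variance.
    variance = statistics.pvariance(samples, mu=mean)
    return {
        "n": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": mean,
        "variance": variance,
        "sd": variance ** 0.5,
        "median": statistics.median(samples),
    }

timings = [25.0, 26.0, 27.0, 30.0]  # placeholder values, in seconds
s = summarize(timings)
print(f"# NumSamples = {s['n']}; Min = {s['min']:.2f}; Max = {s['max']:.2f}")
print(f"# Mean = {s['mean']:.6f}; Variance = {s['variance']:.6f}; "
      f"SD = {s['sd']:.6f}; Median {s['median']:.6f}")
```

A single outlier such as the 139.44 s run dominates the variance, which is why the median is the more useful figure for comparing the two configurations.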

Since using dd is going to max out the bus, I will try compiling the kernel which is much more gentle to the network IO and much more prone to errors in my experience. Thoughts?

@graysky2
Author

graysky2 commented Apr 5, 2018

OK... still experiencing the timeouts when compiling to the NFS share with eee enabled despite the successful replicates of using dd above. I am currently building c2eb306 and will test it by compiling the kernel to NFS with eee enabled and with it disabled.

For reference, here is the script to automate the replicate compile jobs.

@graysky2
Author

graysky2 commented Apr 6, 2018

@pelwell - I am still getting network timeouts... below is with dtparam=eee=off set booted into the latest kernel.

[11786.758187] nfs: server ease not responding, still trying
[11786.758192] nfs: server ease not responding, still trying
[11786.758206] nfs: server ease not responding, still trying
[11786.758225] nfs: server ease not responding, still trying
[11794.438353] INFO: task ld:25967 blocked for more than 120 seconds.
[11794.441599]       Tainted: G         C      4.14.32-2-ARCH #1
[11794.444867] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11794.451496] ld              D    0 25967  25966 0x00000000
[11794.454918] [<80a87c48>] (__schedule) from [<80a88418>] (schedule+0x3c/0xa0)
[11794.458408] [<80a88418>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[11794.461670] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[11794.468043] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[11794.474533] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[11794.481165] [<80233190>] (filemap_write_and_wait_range) from [<803db254>] (nfs_file_fsync+0x30/0x280)
[11794.487571] [<803db254>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[11794.494001] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[11794.497339] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[11794.500698] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[11831.579322] nfs: server ease OK
[11831.579326] nfs: server ease OK
[11831.583067] nfs: server ease OK
[11831.583118] nfs: server ease OK
[12040.199240] INFO: task ld:27693 blocked for more than 120 seconds.
[12040.202836]       Tainted: G         C      4.14.32-2-ARCH #1
[12040.206449] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12040.213627] ld              D    0 27693  27692 0x00000000
[12040.217311] [<80a87c48>] (__schedule) from [<80a88418>] (schedule+0x3c/0xa0)
[12040.220971] [<80a88418>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[12040.223568] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[12040.228677] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[12040.233740] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[12040.238903] [<80233190>] (filemap_write_and_wait_range) from [<803db254>] (nfs_file_fsync+0x30/0x280)
[12040.244189] [<803db254>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[12040.249445] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[12040.252070] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[12040.254713] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)
[12101.639302] nfs: server ease not responding, still trying
[12101.639311] nfs: server ease not responding, still trying
[12101.639328] nfs: server ease not responding, still trying
[12142.599536] nfs: server ease not responding, still trying
[12143.639966] nfs: server ease not responding, still trying
[12143.900616] nfs: server ease OK
[12143.900633] nfs: server ease OK
[12143.909707] nfs: server ease OK
[12143.917548] nfs: server ease OK
[12143.917848] nfs: server ease OK
[12408.840196] nfs: server ease not responding, still trying
[12408.840200] nfs: server ease not responding, still trying
[12408.840228] nfs: server ease not responding, still trying
[12408.840248] nfs: server ease not responding, still trying
[12408.840274] nfs: server ease not responding, still trying
[12408.840412] INFO: task ld:29538 blocked for more than 120 seconds.
[12408.840421]       Tainted: G         C      4.14.32-2-ARCH #1
[12408.840424] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12408.840430] ld              D    0 29538  29537 0x00000000
[12408.840493] [<80a87c48>] (__schedule) from [<80a88418>] (schedule+0x3c/0xa0)
[12408.840514] [<80a88418>] (schedule) from [<8015c138>] (io_schedule+0x14/0x3c)
[12408.840541] [<8015c138>] (io_schedule) from [<80230be4>] (wait_on_page_bit+0x110/0x15c)
[12408.840559] [<80230be4>] (wait_on_page_bit) from [<80230d0c>] (__filemap_fdatawait_range+0xdc/0x128)
[12408.840574] [<80230d0c>] (__filemap_fdatawait_range) from [<80233190>] (filemap_write_and_wait_range+0x54/0x88)
[12408.840596] [<80233190>] (filemap_write_and_wait_range) from [<803db254>] (nfs_file_fsync+0x30/0x280)
[12408.840618] [<803db254>] (nfs_file_fsync) from [<802d5dcc>] (vfs_fsync+0x24/0x2c)
[12408.840635] [<802d5dcc>] (vfs_fsync) from [<8029d758>] (filp_close+0x2c/0x80)
[12408.840648] [<8029d758>] (filp_close) from [<8029d7cc>] (SyS_close+0x20/0x48)
[12408.840665] [<8029d7cc>] (SyS_close) from [<80107ce0>] (ret_fast_syscall+0x0/0x4c)

@mkreisl

mkreisl commented Apr 6, 2018

Same here, nothing changed. Still absolutely unstable, unreliable and completely unusable, the Pi3B+

@JamesH65
Contributor

JamesH65 commented Apr 6, 2018

Odd, got one on my desk that is working fine. I think you forget to add "In the circumstances I am using it".

Anyway, issues are still being looked at both here and at Microchip. There was a patch on the linux netdev list today for this chip's driver (lan78xx) for EEE which may well help; that will need to be tried. It's not like we are just sitting here twiddling our thumbs.

@mkreisl

mkreisl commented Apr 6, 2018

Anyway, issues are still being looked at both here and at Microchip. There was a patch on the linux netdev list today for this chip's driver (lan78xx) for EEE which may well help; that will need to be tried. It's not like we are just sitting here twiddling our thumbs.

Seems you're getting fire under your a.. now 😄

IMO you're looking in the wrong place. LAN issues are only the tip of the iceberg.

I already reported that the system is still unstable after that dumb Microchip part is powered off and all traffic is going over the wlan device. The system still freezes randomly. So, until I'm better informed, I would say the whole Pi3B+ design is a huge issue

@pelwell
Contributor

pelwell commented Apr 6, 2018

Some users who reported problems (and there honestly haven't been that many, but they are shouting loudly) have had success with adding sdram_freq=450 to config.txt. I would recommend anybody with stability problems (anything not obviously network related) to do the same.

@mkreisl

mkreisl commented Apr 6, 2018

Some users who reported problems (and there honestly haven't been that many, but they are shouting loudly) have had success with adding sdram_freq=450 to config.txt. I would recommend anybody with stability problems (anything not obviously network related) to do the same.

What's the default for the Pi3B+? Can't find it here

@pelwell
Contributor

pelwell commented Apr 6, 2018

500 turbo, 400 normal

@graysky2
Author

graysky2 commented Apr 6, 2018

For reference, here is the script to automate the replicate compile jobs.

@pelwell - I have some hard data now. I ran the make benchmark writing out to the NFS share under 2 conditions, once with eee disabled and once with it enabled. There is a clear trend: eee is causing problems.

Running make zImage

Here are 9 or 10 replicates running make zImage with all times reported in minutes.

% histogram.py < eee_on_zimage 
# NumSamples = 9; Min = 9.77; Max = 29.07
# Mean = 18.573025; Variance = 86.102905; SD = 9.279165; Median 10.764777

vs

% histogram.py < eee_off_zimage
# NumSamples = 10; Min = 9.91; Max = 10.87
# Mean = 10.178291; Variance = 0.067048; SD = 0.258936; Median 10.166035

Several trends from these data:

  • Average time to compile is nearly double with eee enabled.
  • Standard deviation and variance are much worse with eee enabled (more unpredictable compile times).
  • Of the replicates, the longest compile time was observed with eee enabled and was nearly tripled.

Running make modules

Here are 9 or 10 replicates running make modules with all times reported in minutes.

% histogram.py < eee_on_modules 
# NumSamples = 9; Min = 25.21; Max = 67.19
# Mean = 51.765753; Variance = 218.212739; SD = 14.772026; Median 46.494882

vs

% histogram.py < eee_off_modules
# NumSamples = 9; Min = 26.33; Max = 49.60
# Mean = 33.328529; Variance = 42.429103; SD = 6.513763; Median 32.126122

The same trends from these data:

  • Average time to compile is about 1.5x longer with eee enabled.
  • Standard deviation and variance are much worse with eee enabled (more unpredictable compile times).
  • Of the replicates, the longest compile time was observed with eee enabled and was about 35% longer.
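The ratios behind both sets of trends can be checked directly from the reported summary figures. A minimal sketch follows; the dictionary layout and the `slowdown` helper are just for illustration, and the numbers are copied from the histogram.py output in this comment.

```python
# Mean and max compile times in minutes, copied from the output above.
reported = {
    "zImage": {
        "on": {"mean": 18.573025, "max": 29.07},
        "off": {"mean": 10.178291, "max": 10.87},
    },
    "modules": {
        "on": {"mean": 51.765753, "max": 67.19},
        "off": {"mean": 33.328529, "max": 49.60},
    },
}

def slowdown(target):
    """Ratio of eee-on to eee-off for the mean and the worst-case run."""
    on, off = reported[target]["on"], reported[target]["off"]
    return {
        "mean_ratio": on["mean"] / off["mean"],
        "max_ratio": on["max"] / off["max"],
    }

for target in reported:
    r = slowdown(target)
    print(f"{target}: mean x{r['mean_ratio']:.2f}, "
          f"worst case x{r['max_ratio']:.2f}")
```

With only 9 or 10 replicates per condition these ratios are indicative rather than precise, but the direction is the same for both build targets.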

I am happy to test future patches/firmware, whatever to help optimize this. I think the make zImage benchmark will be sufficient for this since it's way faster than make modules and gives similar results. Just let me know.

EDIT: I see @popcornmix pushed raspberrypi/firmware@3aa8060 a few hours ago... time to retest?

@graysky2
Author

graysky2 commented Apr 6, 2018

@mkreisl - Please keep this issue on task... it's scoped for network writes not for general stability. Open a new task for that.

@mkreisl

mkreisl commented Apr 6, 2018

@graysky2 Oops, sorry for tainting your thread

@graysky2
Author

graysky2 commented Apr 6, 2018

A potential workaround: don't disable EEE entirely, but set dtparam=tx_lpi_timer=10000 in /boot/config.txt. With that setting, the make zImage benchmark gave nearly identical results to having EEE totally disabled.

Again, values reported are compile times in minutes.

dtparam=tx_lpi_timer=10000

# NumSamples = 12; Min = 9.90; Max = 10.19
# Mean = 10.089245; Variance = 0.007412; SD = 0.086094; Median 10.119596

dtparam=eee=off

# NumSamples = 10; Min = 9.91; Max = 10.87
# Mean = 10.178291; Variance = 0.067048; SD = 0.258936; Median 10.166035

EDIT: see #2482 (comment) which demonstrates that the problem is still present.

@mkreisl

mkreisl commented Apr 6, 2018

None of those EEE settings help for me, because my router/switch does not support EEE (most routers with an integrated switch do not support it), and I'm still getting NFS timeouts even if EEE is completely disabled, or I'm getting

Apr  6 16:11:14 kmxbilr2 kernel: [  837.345227] CIFS VFS: sends on sock aa2921c0 stuck for 15 seconds
Apr  6 16:11:14 kmxbilr2 kernel: [  837.345261] CIFS VFS: Error -11 sending data on socket to server
Apr  6 16:11:30 kmxbilr2 kernel: [  852.705497] CIFS VFS: sends on sock aa2921c0 stuck for 15 seconds
Apr  6 16:11:30 kmxbilr2 kernel: [  852.705532] CIFS VFS: Error -11 sending data on socket to server
Apr  6 16:11:30 kmxbilr2 kernel: [  852.833704] CIFS VFS: Free previous auth_key.response = 99685c00
Apr  6 16:11:55 kmxbilr2 kernel: [  878.305932] CIFS VFS: sends on sock aa29c380 stuck for 15 seconds
Apr  6 16:11:55 kmxbilr2 kernel: [  878.305972] CIFS VFS: Error -11 sending data on socket to server
Apr  6 16:12:11 kmxbilr2 kernel: [  893.666123] CIFS VFS: sends on sock aa29c380 stuck for 15 seconds
Apr  6 16:12:11 kmxbilr2 kernel: [  893.666156] CIFS VFS: Error -11 sending data on socket to server
Apr  6 16:12:26 kmxbilr2 kernel: [  909.026351] CIFS VFS: sends on sock aa29c380 stuck for 15 seconds
Apr  6 16:12:26 kmxbilr2 kernel: [  909.026382] CIFS VFS: Error -11 sending data on socket to server
Apr  6 16:12:41 kmxbilr2 kernel: [  924.386541] CIFS VFS: sends on sock aa29c380 stuck for 15 seconds
Apr  6 16:12:41 kmxbilr2 kernel: [  924.386573] CIFS VFS: Error -11 sending data on socket to server
Apr  6 16:12:41 kmxbilr2 kernel: [  924.484318] CIFS VFS: Free previous auth_key.response = a7910f00

when using a Samba mount instead of an NFS mount, and after some time the process that writes to the share gets stuck and stays in uninterruptible 'D' state forever
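Processes stuck in 'D' state can be listed without waiting for the hung-task message by reading `/proc/<pid>/stat`. This is a minimal sketch assuming a Linux `/proc`; the helper names are hypothetical, and it only looks at processes, not at the individual threads under `/proc/<pid>/task`.

```python
import os

def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = text.rpartition(")")
    pid, _, comm = head.partition(" (")
    return int(pid), comm, tail.split()[0]

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for processes in uninterruptible sleep ('D')."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux system, nothing to scan
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            tasks.append((pid, comm))
    return tasks

for pid, comm in blocked_tasks():
    print(f"{pid}\t{comm}")
```

A process that shows up here on every poll for minutes at a time, with its writes going to the network share, matches the behaviour described above.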

@graysky2
Author

graysky2 commented Apr 6, 2018

@mkreisl - Are you booted into the same kernel and are you using the same firmware commit that I am?
Kernel: c2eb306
Firmware: raspberrypi/firmware@0dff9ec

@mkreisl

mkreisl commented Apr 6, 2018

@graysky2
Kernel: yes (XBian built based on bcm2709_defconfig)
Firmware: yes, exactly the same version

@graysky2
Author

graysky2 commented Apr 6, 2018

@mkreisl - not sure what to say then.... perhaps you have a different issue. As a control, have you tried the same stuff with another older RPi? Like a 2 or 3?

@mkreisl

mkreisl commented Apr 7, 2018

@graysky2 Sure, I've been running the same procedure on Pi1, 2 and 3 (without +) for years without any problem.

@mkreisl

mkreisl commented Apr 7, 2018

@graysky2 In short, here is what it does:

  1. mount network share (sshfs, nfs or samba)
  2. create image on this share, big enough to backup data from root/boot fs into it
  3. create partition in image (vfat for boot, btrfs for root)
  4. copy boot partition into mounted image (loop device)
  5. copy all subvolumes into mounted image (using btrfs send/receive or tar, both tested)
  6. close everything and umount share

Steps 1 to 4 always work; step 5 always gets stuck, though not always on the same subvolume.
It also doesn't matter whether the source fs is on SD, a USB disk or an iSCSI target.

popcornmix pushed a commit that referenced this issue Mar 27, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
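
When testing with and without TSO, it helps to confirm what the interface is actually doing. Below is a minimal sketch that parses the output of `ethtool -k`; the interface name `eth0` and the helper names are assumptions, and this checks the offload state reported by ethtool rather than the `enable_tso` module parameter itself.

```python
import subprocess

def parse_offload_features(ethtool_output):
    """Map feature name -> True/False from `ethtool -k <iface>` output."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Feature lines look like "name: on" or "name: off [fixed]".
        if sep and words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features

def tso_enabled(iface="eth0"):
    """Return True/False for TSO on iface, or None if not reported."""
    out = subprocess.run(["ethtool", "-k", iface], capture_output=True,
                         text=True, check=True).stdout
    return parse_offload_features(out).get("tcp-segmentation-offload")

# Illustrative output in the format ethtool prints (not captured from a Pi).
example = """Features for eth0:
rx-checksumming: on
tcp-segmentation-offload: off
\ttx-tcp-segmentation: off
generic-receive-offload: on [fixed]
"""
print(parse_offload_features(example)["tcp-segmentation-offload"])
```

Calling `tso_enabled()` on the Pi before and after changing the setting confirms that a benchmark run really exercised the configuration it was meant to.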
popcornmix pushed a commit that referenced this issue Mar 27, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 3, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 5, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 5, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 8, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 11, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 16, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 16, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 18, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 23, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 29, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue Apr 29, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

raspberrypi/linux#2449
raspberrypi/linux#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue May 2, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue May 13, 2024
TSO seems to be having issues when packets are dropped and the
remote end uses Selective Acknowledge (SACK) to denote that
data is missing. The missing data is never resent, so the
connection eventually stalls.

There is a module parameter of enable_tso added to allow
further debugging without forcing a rebuild of the kernel.

#2449
#2482

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
popcornmix pushed a commit that referenced this issue May 20, 2024
popcornmix pushed a commit that referenced this issue May 20, 2024
popcornmix pushed a commit that referenced this issue May 28, 2024
popcornmix pushed a commit that referenced this issue May 28, 2024
popcornmix pushed a commit that referenced this issue Jun 3, 2024
popcornmix pushed a commit that referenced this issue Jun 3, 2024
popcornmix pushed a commit that referenced this issue Jun 3, 2024
popcornmix pushed a commit that referenced this issue Jun 10, 2024