linux-raspberrypi-4.19.57-2, 4.19.58-1, 4.19.59-1, 4.19.63-1 causes kernel panic on raspberry pi 2 (arch linux arm) #3087

Closed
nightBulb opened this issue Jul 18, 2019 · 76 comments

Comments

@nightBulb

nightBulb commented Jul 18, 2019

Describe the bug
Kernel Panic on boot.

To reproduce

  • Update packages using pacman -Syu
    (updating linux-raspberrypi package to 4.19.65-1 (current latest)) ( any version after 4.19.57-1 )
  • Reboot
  • Kernel Panic

Expected behaviour
System to boot / not result in Kernel Panic

Actual behaviour
Kernel Panic

System
raspinfo.txt on gist.github.com
raspinfo.txt

  • Which model of Raspberry Pi? e.g. Pi3B+, PiZeroW
  • Which OS and version (cat /etc/rpi-issue)?
  • Which firmware version (vcgencmd version)?
  • Which kernel version (uname -a)?

Logs
If applicable, add the relevant output from dmesg or similar.


Additional context

Last functional version of linux-raspberrypi package is : 4.19.57-1
Failing Versions: 4.19.57-2, 4.19.58-1, 4.19.59-1, 4.19.63-1, 4.19.65-1 (these versions are tested to fail) (any version missing in between probably is also failing)

Current workarounds are:

Workaround 1:
Part A
  • restore /boot from backup to /boot on memorycard manually
  • boot system with several systemd services failing (no modules for old kernel)
Part B
  • install old kernel pacman -U /var/cache/pacman/pkg/linux-raspberrypi-4.19.57-1-armv7h.pkg.tar.xz
    (You can download this version from this link if you don't have the old package)
    sha1sum: b23f0bdb7befafc2f86a19041df25c8240b5efdc of linux-raspberrypi-4.19.57-1-armv7h.pkg.tar.xz
  • reboot
  • normal functional system
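For anyone scripting Part B, the checksum comparison can be sketched in Python as below. This is only a sketch: the function names are hypothetical, and the expected digest is the one quoted above for linux-raspberrypi-4.19.57-1-armv7h.pkg.tar.xz.

```python
import hashlib

# SHA-1 quoted in this issue for linux-raspberrypi-4.19.57-1-armv7h.pkg.tar.xz
EXPECTED_SHA1 = "b23f0bdb7befafc2f86a19041df25c8240b5efdc"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_matches(path, expected=EXPECTED_SHA1):
    """True if the file's SHA-1 equals the expected digest (case-insensitive)."""
    return sha1_of_file(path) == expected.lower()
```

Calling package_matches() on the downloaded package before running pacman -U would catch a corrupted or wrong file.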
OR
Workaround 2 :

set gpu_mem=256 in /boot/config.txt as suggested by @zertyz in the comment

OR
Workaround 2.5 :
  • Use Workaround 2 above, to make the system bootable
    then,
  • Use Workaround 1 Part B to switch to older kernel, so that you can reclaim the RAM from GPU
    (which is optimal for headless system)
  • set gpu_mem=64 in /boot/config.txt
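The gpu_mem edits in Workarounds 2 and 2.5 can be sketched as a small Python helper that rewrites the text of /boot/config.txt. The helper name is hypothetical, and it assumes a flat file of key=value lines; conditional sections such as [pi4] are not handled.

```python
def set_config_option(config_text, key, value):
    """Return config.txt content with `key=value` set.

    An existing uncommented `key=` line is replaced in place; otherwise the
    setting is appended. Comments and other settings are left untouched.
    """
    new_line = f"{key}={value}"
    lines = config_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave commented-out settings alone
        if stripped.split("=", 1)[0].strip() == key:
            lines[index] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

For example, set_config_option(text, "gpu_mem", 256) makes the system bootable per Workaround 2, and the same call with 64 reclaims the RAM after downgrading the kernel.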

More Context

Since this may or may not be an Arch Linux ARM specific issue (it was recently reported to affect Raspbian too),
the troubleshooting, bug reporting, and discussion are also happening on the Arch Linux ARM forums,
in the following posts:

Do note that #3087, i.e. this GitHub issue, is (probably) the most active.

@nightBulb
Author

nightBulb commented Jul 18, 2019

The issue might be related to #3084,
and a vague suspicion that it also relates to #2239

@lategoodbye
Contributor

This needs more information:

  • Which kernel panic occurs? (Please provide a dump of the console, or at least a photo.)
  • Where is your rootfs located: on the SD card or on a USB device?

@nightBulb
Author

nightBulb commented Jul 20, 2019

The rootfs is located on a USB flash drive (8 GB).

(Every memory card I tried has been very unreliable as root in case of power failure, ever since purchase.)

I'll try to get a photo of the kernel panic.

@nightBulb
Author

The following is an image of the boot log, just before the kernel panic:

FJIMG_20190716_210738-enh

@nightBulb
Author

@lategoodbye The above data is as requested (hopefully),

I currently have held back update:

linux-raspberrypi: ignoring package upgrade (4.19.57-1 => 4.19.58-1)

That is because it is also the primary server at home, and recovering from a kernel panic is a lengthy, disruptive process.

However, I can intentionally trigger the kernel panic by updating the package, if needed.

@johnny-cash

johnny-cash commented Jul 21, 2019

I have a raspberry 3b and the same thing happens with the kernel 4.19.58-1 in arch linux arm.

I have the root filesystem (/) on a USB disk and it gives me a kernel panic. Also, if I install the whole system on an SD card and try to mount USB devices with this kernel version, I get print_req_error and I/O errors.

With the previous kernel versions it works perfectly.

@dkadioglu

I have a Raspberry Pi 3B+ with Arch Linux on a SD Card. With kernel 4.19.58 my Sundtek USB Tuner is not working anymore. With 4.19.57 everything works as expected. Maybe there is something bad in general with 4.19.58 and USB devices? If you need further info, please ask.

@lategoodbye
Contributor

@nightBulb Thanks for the photo. It seems that the USB device isn't accessible (-110 = ETIMEDOUT).

But a trace of the real kernel panic would be still necessary.
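The "-110" in the boot log is a negated Linux errno value. A minimal Python sketch for decoding such codes follows; the table holds standard Linux (asm-generic) errno numbers, while the selection of entries and the function name are my own assumptions, not something taken from the kernel sources.

```python
# Standard Linux errno numbers that often appear in USB kernel messages.
# The selection is illustrative, not exhaustive.
LINUX_ERRNO_NAMES = {
    5: "EIO",
    19: "ENODEV",
    32: "EPIPE",
    71: "EPROTO",
    110: "ETIMEDOUT",
}


def describe_kernel_error(code):
    """Translate a kernel-style negative error code into its errno name."""
    number = abs(code)
    name = LINUX_ERRNO_NAMES.get(number)
    if name is None:
        return f"errno {number} (not in this table)"
    return f"{name} ({number})"
```

With this table, describe_kernel_error(-110) yields "ETIMEDOUT (110)", matching the reading given above.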

@nightBulb
Author

@lategoodbye This is the kernel panic stack trace, (hopefully as needed)

FJIMG_20190722_011210

Also I discovered that linux-raspberrypi-4.19.57-2 also causes, what appears to be the same problem.
(Kernel panic and all ... )

linux-raspberrypi-4.19.57-1 seems to be the last sane kernel.

And so, changing the title seems appropriate.

@nightBulb nightBulb changed the title linux-raspberrypi-4.19.58-1 causes kernel panic on raspberry pi 2 (arch linux arm) linux-raspberrypi-4.19.58-1 and 4.19.57-2 causes kernel panic on raspberry pi 2 (arch linux arm) Jul 21, 2019
@lategoodbye
Contributor

lategoodbye commented Jul 21, 2019

@nightBulb Sorry for being imprecise, we need the beginning of the kernel panic.

Apart from that, your issue looks like #3067

@nightBulb
Author

nightBulb commented Jul 21, 2019

@lategoodbye Is there any place where the boot log is stored?
The beginning of the kernel panic scrolls off the screen quickly ...

Or is there any other method, besides taking pictures, to get the boot log?

Also, linux-raspberrypi-4.19.57-1 is booting fine unlike #3067 .

@lategoodbye
Contributor

Without a rootfs there is no chance to store the file.

The best solution is to connect a serial TTL adapter to the debug UART.

In case USB is still working, you could try to enlarge the kernel log buffer by adding log_buf_len=1M to cmdline.txt
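Since cmdline.txt holds the whole kernel command line on a single line, the parameter has to be added to that line rather than on a new one. A small Python sketch of that edit is below; the helper name is hypothetical and the sample command lines in the usage note are made up.

```python
def add_kernel_parameter(cmdline, name, value):
    """Return the kernel command line with `name=value` set exactly once.

    Any existing `name=...` token is dropped first, and the result is kept
    on one line, as cmdline.txt requires.
    """
    tokens = [tok for tok in cmdline.split() if not tok.startswith(name + "=")]
    tokens.append(f"{name}={value}")
    return " ".join(tokens) + "\n"
```

For example, add_kernel_parameter("root=/dev/sda1 rw rootwait", "log_buf_len", "1M") appends log_buf_len=1M to the end of the existing line.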

@nightBulb
Author

nightBulb commented Jul 21, 2019

@lategoodbye I meant, how do you scroll screen after kernel panic?

I do not have a TTL adapter either.

One thing to note though: before boot, if I remove the USB rootfs, the bootloader (or whatever it is that boots the RPi) drops me to an ash terminal.

After I re-insert the rootfs USB drive and call init, it panics, and I can no longer run any commands, even from the bootloader's ash 😁

The above panic is with 4.19.58-1

@lategoodbye
Contributor

Sometimes it's possible to scroll back via page up/down. But this depends on the crash.

@nightBulb
Author

nightBulb commented Jul 22, 2019

The (shift +) Page up/down keys do not seem to be working especially after kernel panic,

FJIMG_20190722_055358

I even tried the following,

  • Remove USB Flash Drive Containing root
  • reboot
  • boot process drops to ash
  • Re-insert USB Flash Drive containing root
  • run ./init script in rootfs terminal (ash)
  • wait for reset high speed USB device using dwc_otg error/kernel-message to occur
  • exit (by typing exit on screen)
  • kernel panic

No method to scroll the screen.

@pelwell
Contributor

pelwell commented Jul 22, 2019

Adding framebuffer_height=2160 to config.txt will get twice as many lines on screen.

@dave-p

dave-p commented Jul 22, 2019

Here is a console log showing the issue. It is from a Pi2 v1.1 running Arch Linux, with the /boot directory on SD and the OS on an SSD connected via USB - the Toshiba HDD mentioned in the log is just the usb-sata interface.

I've had a similar issue with another Pi2 v1.2 which is used as a TVHeadend server; the OS is on SD but the videos are on a USB HDD. With this device there is no crash but playback of video stops after a while and the disk is inaccessible until the Pi is rebooted.

The latest software I know is working is Linux 4.19.55 and bootloader/firmware 20190624.

minicom.txt

@nightBulb
Author

nightBulb commented Jul 22, 2019

I too managed to get the picture of kernel panic using @pelwell 's idea.

I also have more pictures of the same kernel panic, if this one is not clear, I can upload them if needed ..

FJIMG_20190722_192617

The above image appears straight when clicked, it is rotated in preview.

@tsaarni
Contributor

tsaarni commented Jul 22, 2019

I'm seeing the same bug on Raspberry Pi 3 Model B, booting from external USB SSD.
kernel-4.19.58-1-ARCH-boot-from-usb-device-fails.txt

@pelwell
Contributor

pelwell commented Jul 22, 2019

The common factors here appear to be kernel version and Arch Linux - it's not a problem we've seen on Raspbian, and it would be enlightening if one of you would install Raspbian on a spare card to confirm (or not) that it boots OK.

@dave-p

dave-p commented Jul 22, 2019

OK I installed the current Raspbian Buster Lite onto an RPi3 (kernel 4.19.57-V7+), then configured it to boot from SD but use an SSD as the root FS. I then ran a dist-upgrade which updated the kernel to 4.19.58. The system booted as normal. I moved the SD card and SSD to the same RPi2 as in my previous test and it continued to boot correctly. It's not quite the same test (the SSD and usb-sata interface were different) but it does suggest an Arch problem.

Arch forums have one post with the same issue:
https://archlinuxarm.org/forum/viewtopic.php?f=60&t=13792

minicom2.txt

@tsaarni
Contributor

tsaarni commented Jul 22, 2019

I have been iterating over the kernel configuration parameters that the Arch kernel has changed recently. See diff here. I'm not 100% sure I'm on the right track, but it seems to me that CONFIG_VMSPLIT_3G=y (instead of VMSPLIT_2G) is the change that triggers the bug for me on RPi 3.

Can experts here deny or confirm that this could cause USB boot to not work?

Notifying also @kmihelich from Arch Linux ARM team.

@dave-p

dave-p commented Jul 23, 2019

I've built a new stock Foundation kernel (which is now at version 4.19.59) and it boots my Arch RPi3 with rootfs on SSD. This does seem to be an Arch kernel config issue.

@pelwell
Contributor

pelwell commented Jul 23, 2019

The Pi2 bringup was a long time ago now, but from what I recall we switched to VMSPLIT_2G to avoid the need for HIGHMEM on a 1GB part. Does the Arch config have HIGHMEM enabled?

@dave-p

dave-p commented Jul 23, 2019

From https://archlinuxarm.org/packages/armv7h/linux-raspberrypi/files/config.v7 (also config.v6)
CONFIG_HIGHMEM=y

@tsaarni
Contributor

tsaarni commented Jul 23, 2019

Changing to CONFIG_HIGHMEM=n seems to fix the problem too.

@lategoodbye
Contributor

Changing to CONFIG_HIGHMEM=n seems to fix the problem too.

This is not a fix, it's only a workaround.

@tsaarni
Contributor

tsaarni commented Jul 23, 2019

Yes I agree, sorry for bad wording.

@tsaarni
Contributor

tsaarni commented Jul 23, 2019

This is not my area at all, but just in case this is relevant: Arch config enables also CONFIG_ARM_LPAE and CONFIG_ARCH_DMA_ADDR_T_64BIT which results in typedef u64 dma_addr_t. Interestingly, during compilation I see some warnings from dwc_otg driver e.g.

drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_increment_dma_buf’:
drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:243:30: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
  struct fiq_dma_blob *blob = (struct fiq_dma_blob *) st->dma_base;
                              ^
drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:252:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  hcdma.d32 = (dma_addr_t) &blob->channel[n].index[i].buf[0];
              ^
drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_iso_out_advance’:
drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:292:30: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
  struct fiq_dma_blob *blob = (struct fiq_dma_blob *) st->dma_base;
                              ^
drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:304:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  hcdma.d32 = (dma_addr_t) blob->channel[n].index[i].buf;
              ^

Rest of the warnings are here
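To illustrate what the compiler is warning about: with a 64-bit dma_addr_t, a value stored into a 32-bit field keeps only its low 32 bits. The Python sketch below mimics that with a bit mask. It assumes the field named d32 is 32 bits wide (inferred from its name only), the function names are hypothetical, and nothing in this thread establishes that truncation is the actual failure mechanism.

```python
MASK_32 = 0xFFFFFFFF


def store_in_32bit_field(address):
    """Mimic assigning a wider integer to a 32-bit field: keep the low 32 bits."""
    return address & MASK_32


def survives_32bit_store(address):
    """True if the address is unchanged after passing through a 32-bit field."""
    return store_in_32bit_field(address) == address
```

An address below 4 GiB passes through unchanged, while anything above it silently loses its upper bits, which is the class of bug the cast warnings point at.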

@pelwell
Contributor

pelwell commented Aug 11, 2019

Can you not use one of our kernels with the Arch userland?

@zertyz

zertyz commented Aug 11, 2019

The solution is the same it has always been - her the Arch devs to unbreak their Pi 2 configuration. Or run Raspbian.

The problem seems not to be related to archlinuxarm's Pi 2 configuration, since it also happens on Raspbian, as reported some posts ago and verified by me, on my Pi 2 -- Booting raspbian from MMC, but with USB drives attached is enough to reveal the problem -- even if the USB drives are not involved in the boot process.

Maybe the title for this issue should be reconsidered and the Archlinuxarm part removed?

@semeion

semeion commented Aug 11, 2019

@zertyz, the issue with USB happens with raspbian too? Relevant info!
I am using Arch Linux ARM on an RPi 3B. I don't get a kernel panic, but my USB ports can't access drives anymore; if I try, I get lots of errors.

@tsaarni
Contributor

tsaarni commented Aug 11, 2019

We have already troubleshot this problem here. It is related to the USB driver and it is triggered by the kernel config settings CONFIG_VMSPLIT_3G=y and CONFIG_HIGHMEM=y. The problem appeared when these settings were taken into use in the Arch Raspberry Pi kernel by @kmihelich, whom we have not been able to reach for comment (see diff here).

Nobody seems to be working on HIGHMEM support for the USB driver. Therefore, as suggested here before, the only feasible way forward seems to be either to revert the change in the Arch Raspberry Pi kernel package or to move to a different Linux distro.

@esiqveland

I came here after seeing this with archlinuxarm and kernel 4.19.65.
Booting with gpu_mem=256 makes kernel able to boot.

I found these older issues that might be related with regards to CONFIG_HIGHMEM=y and CONFIG_VMSPLIT_3G=y:

#1641

And some even older:

#1394

@graysky2

The solution is the same it has always been - her the Arch devs to unbreak their Pi 2 configuration. Or run Raspbian.

@pelwell - Check your email for our exchange around 12-July. My memory is that in order to get the RPi4 to boot, the changes to the config file that @tsaarni linked were needed.

From my notes, they were these 3:

  1. "System type>Multiple platform selection" to disable "ARMv6 based platforms (ARM11)"
  2. Enable LPAE
  3. Select a 1G/3G split

I cannot remember if the HIGHMEM option is selected in doing this or not.

@pelwell
Contributor

pelwell commented Aug 13, 2019

I'm not sure what you are saying. The issue is about why current Arch images don't run on Pi 2, and you've just described what we had to do on Pi 4. Note that the Pi 4 uses a completely different USB controller, so the problems reported here with the dwc_otg driver and HIGHMEM don't apply.

@graysky2

@pelwell - I thought the hypothesis is that the problem for RPi2 was introduced when Arch ARM merged those support options including HIGHMEM to allow for RPi4 support.

We have already troubleshot this problem here. It is related to the USB driver and it is triggered by the kernel config settings CONFIG_VMSPLIT_3G=y and CONFIG_HIGHMEM=y. The problem appeared when these settings were taken into use in the Arch Raspberry Pi kernel by @kmihelich, whom we have not been able to reach for comment (see diff here).

Nobody seems to be working on HIGHMEM support for the USB driver. Therefore, as suggested here before, the only feasible way forward seems to be either to revert the change in the Arch Raspberry Pi kernel package or to move to a different Linux distro.

@pelwell
Contributor

pelwell commented Aug 13, 2019

I thought the hypothesis is that the problem for RPi2 was introduced when Arch ARM merged those support options including HIGHMEM to allow for RPi4 support.

Yes, that is the hypothesis. I still don't get your point - you just seem to be restating the problem.

@graysky2

graysky2 commented Aug 13, 2019

Yes, that is the hypothesis. I still don't get your point - you just seem to be restating the problem.

If we disable HIGHMEM, will there be any ill effects for Pi4 boards? If not, would it be a good test to have someone with Arch ARM and an affected RPi2 compile the current kernel using the following patch, which disables HIGHMEM, and see if it fixes the USB issue?

https://gist.github.com/graysky2/5dee027e153fadc98659a85f32aeddbc

If it does, and if the Pi4 can run without ill-effects, the change could be proposed.

@pelwell
Contributor

pelwell commented Aug 13, 2019

If we disable HIGHMEM, will there be any ill effects for Pi4 boards?

Yes - the kernel won't be able to address all memory, and it may even crash - I don't remember the precise failure mechanism.

@graysky2

@pelwell - OK. I still feel that having an affected RPi2 user compile and boot with that config patched to disable HIGHMEM would test whether it is causing the problem. If it is, Arch ARM might need to consider another kernel package for RPi2 with HIGHMEM disabled, no?

@tsaarni
Contributor

tsaarni commented Aug 13, 2019

@graysky2 I tested with RPi3 previously around here and it fixed the problem with dwc_otg driver.

@Flameborn

Flameborn commented Aug 13, 2019 via email

@graysky2

I'm sorry, I have not read through all 50+ comments. My goal is to help out by verifying what the correct fix is and potentially submit a PR to Arch ARM to fix it. It seems disabling HIGHMEM is not a fix per some of the comments above. Do I understand that correctly?

@pelwell
Contributor

pelwell commented Aug 13, 2019

Set the VMSPLIT back to 2G, then disable HIGHMEM and LPAE because neither is needed.
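A kernel config can be compared against this suggestion with a short Python sketch like the one below. The option names are the ones mentioned in this thread; the function names and the SUGGESTED table are hypothetical, and an option missing from the file is treated as disabled.

```python
def parse_kernel_config(config_text):
    """Map CONFIG_* names to values; '# CONFIG_X is not set' becomes 'n'."""
    options = {}
    suffix = " is not set"
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


# Settings suggested in this thread for the Pi 2 / Pi 3 (dwc_otg) kernel.
SUGGESTED = {
    "CONFIG_VMSPLIT_2G": "y",
    "CONFIG_HIGHMEM": "n",
    "CONFIG_ARM_LPAE": "n",
}


def deviations_from_suggestion(config_text):
    """Return {option: actual value} for every option that differs."""
    options = parse_kernel_config(config_text)
    return {
        name: options.get(name, "n")
        for name, wanted in SUGGESTED.items()
        if options.get(name, "n") != wanted
    }
```

An empty result means the config already follows the suggestion; anything else lists the options to change.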

@Flameborn

Flameborn commented Aug 13, 2019 via email

@graysky2

@pelwell -

  • "System type>Multiple platform selection" to disable "ARMv6 based platforms (ARM11)"
  • Enable LPAE
  • Select a 1G/3G split

I am confused about this... in the email exchange you mentioned that LPAE was required to boot the RPi4, as I recall. Are you suggesting two separate kernel packages?

  1. Remain as-is for RPi4
  2. VMSPLIT back to 2G and disable HIGHMEM for RPi2

@pelwell
Contributor

pelwell commented Aug 14, 2019

Yes - that's what we use.

@denisandroid

denisandroid commented Aug 14, 2019

I observe a similar problem on RPI3 :( When will there be official corrections?

linux-raspberrypi-4.19.57-2

[   11.694847] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   11.964878] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   12.234876] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   12.504872] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   12.635410] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[   12.640593] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 00 00 00 00 f0 00
[   12.645834] print_req_error: I/O error, dev sda, sector 0
[   12.744877] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   13.014874] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   13.284848] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   13.554853] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   13.824870] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   14.094896] usb 1-1.2: reset high-speed USB device number 5 using dwc_otg
[   14.225416] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[   14.229493] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 00 f0 00 00 10 00
[   14.233534] print_req_error: I/O error, dev sda, sector 240
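Logs like the one above are easier to compare across kernel versions when summarized. A minimal Python sketch that counts the USB reset messages and collects the failed sectors follows; the regular expressions are written against the message formats shown in this log only, and the function name is hypothetical.

```python
import re
from collections import Counter

# Patterns modelled on the dmesg lines quoted in this thread.
RESET_RE = re.compile(r"usb (\S+): reset \S+ USB device number \d+ using (\S+)")
IO_ERROR_RE = re.compile(r"I/O error, dev (\S+), sector (\d+)")


def summarize_usb_errors(log_text):
    """Return (reset counts per (port, driver), list of (device, sector))."""
    resets = Counter()
    failed_sectors = []
    for line in log_text.splitlines():
        reset = RESET_RE.search(line)
        if reset:
            resets[(reset.group(1), reset.group(2))] += 1
            continue
        io_error = IO_ERROR_RE.search(line)
        if io_error:
            failed_sectors.append((io_error.group(1), int(io_error.group(2))))
    return resets, failed_sectors
```

Feeding it the output of dmesg gives one reset count per USB port and driver, plus the sectors that failed to read.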

Archlinux

@pelwell
Contributor

pelwell commented Aug 14, 2019

I imagine the Arch devs will have a few packaging issues to sort out in order to support an extra kernel build, so give them a while.

@GazzaC

GazzaC commented Aug 17, 2019

Did you guys know that if you swap the board for an RPi4, the same error occurs at random points in time (shortly after boot, or later) with these kernel versions and the same USB drive configuration? It's minus the dwc_otg errors; the rest is the same. It eventually results in a kernel panic.

@graysky2

graysky2 commented Aug 17, 2019

@nightBulb @johnny-cash @dkadioglu @dave-p @tsaarni @esiqveland @Flameborn @denisandroid - Update your systems from an in-sync mirror and see if the problem is gone:

archlinuxarm/PKGBUILDs@274a92b
archlinuxarm/PKGBUILDs@dd11914

If you're running a Pi4, take action based on pacman's warning:

:: Processing package changes...
(1/4) upgrading linux-raspberrypi                                            [############################################] 100%
________________________________________________________________________________

WARNING: You must switch to a different kernel for the Raspberry Pi 4:
         pacman -S linux-raspberrypi4
________________________________________________________________________________
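The package choice in that warning can be sketched as a small Python helper. The two package names come from this thread; the function names are hypothetical, and reading /proc/device-tree/model and matching on the "Raspberry Pi 4" substring are assumptions about how the board could be identified.

```python
def kernel_package_for(model):
    """Pick the Arch Linux ARM kernel package for a board model string."""
    if "Raspberry Pi 4" in model:
        return "linux-raspberrypi4"
    return "linux-raspberrypi"


def read_board_model(path="/proc/device-tree/model"):
    """Read the board model string; the file content is NUL-terminated."""
    with open(path, "rb") as handle:
        return handle.read().rstrip(b"\x00").decode("ascii", errors="replace")
```

On a running board, kernel_package_for(read_board_model()) gives the package name to pass to pacman -S.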

@esiqveland

I can confirm that my pi3 is booting again with gpu_mem=32

using the updated kernel package linux-raspberrypi-4.19.66-1.

@nightBulb
Author

nightBulb commented Aug 17, 2019

@graysky2
I too can confirm that linux-raspberrypi-4.19.66-1-armv7h fixes the kernel panic on R-Pi 2
👍 👍 👍

although, do note that:
After complete boot and login there was a single

[ 13.457124] usb 1-1.4.2: device descriptor read/64, error -110

error in dmesg output, though that seems to occur on 4.19.57-1 as well.


So this issue seems most likely resolved.

If it is fixed for others as well, I recommend, this issue be closed. 🥇

Thanks. 😊

@GazzaC

GazzaC commented Aug 18, 2019

Works. However, I had an issue with the NFS server: I was getting the tag errors (0x2003) while nfs-server was starting up. I had to disable the nfs-server service, reboot, then enable and start the service. Subsequent reboots seem fine. So the NFS server needs a refresh following this issue. This is on an RPi2.

@denisandroid

denisandroid commented Aug 18, 2019

Updates to Archlinux (RPI 3)

Packages (11) dbus-1.12.16-2  git-2.22.1-1  linux-firmware-20190815.07b925b-1  linux-raspberrypi-4.19.66-1
linux-raspberrypi-headers-4.19.66-1  llvm-libs-8.0.1-2  raspberrypi-bootloader-20190816-1
raspberrypi-bootloader-x-20190816-1  systemd-242.84-2  systemd-libs-242.84-2
systemd-sysvcompat-242.84-2

Total Download Size: 144.67 MiB
Total Installed Size: 686.54 MiB
Net Upgrade Size: 1.66 MiB

New kernel: 4.19.66-1
Problem kernel: 4.19.65-1

Errors when working with usb are not observed. 'Dmesg' is empty.

dd bs=1M count=256 if=/dev/zero of=/tmp/sdb/test.img oflag=direct
256+0 records in
256+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 31.3382 s, 8.6 MB/s

@graysky2

@pelwell - Probably safe to close this.
