Service Worker related crashes #5634

Open
The-Compiler opened this issue Jul 30, 2020 · 15 comments
Labels
bug: segfault/crash/hang There's a low-level crash in C++, or a hang/freeze. component: QtWebEngine Issues related to the QtWebEngine backend, based on Chromium. priority: 0 - high Issues which are currently the primary focus.

@The-Compiler
Member

The-Compiler commented Jul 30, 2020

Summary if you've been linked to this issue due to crashes

Due to a yet-unknown issue (possibly unclean shutdowns of qutebrowser?), the "Service Worker" storage used by the underlying QtWebEngine/Chromium gets corrupted. When qutebrowser is used with such a corrupt Service Worker storage, crashes will occur on pages which use the corrupted data, such as Gmail. We've unfortunately never been able to figure out what causes the corruption (which happens in the qutebrowser run before the crashes start!), so help tracking this down would be much appreciated.

As a one-time fix, remove the webengine/Service Worker directory in qutebrowser's data directory (e.g. ~/.local/share/qutebrowser/webengine/Service Worker on Linux; see :version for the exact location of the data directory). It's unclear whether websites store any data there that is supposed to be persistent, but no such case is known so far, so it should be safe to delete.
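
The one-time fix can be sketched in Python as below. This is only an illustration: the helper name `move_service_worker_dir` is hypothetical, the data directory has to be taken from :version, and the sketch assumes qutebrowser is closed while it runs. Instead of deleting the directory it moves it aside with a timestamp, so the possibly corrupted data stays available for later inspection.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def move_service_worker_dir(data_dir: Path) -> Optional[Path]:
    """Move <data_dir>/webengine/Service Worker aside.

    Returns the backup path, or None if the directory does not exist.
    Hypothetical helper; run it only while qutebrowser is not running.
    """
    target = data_dir / "webengine" / "Service Worker"
    if not target.is_dir():
        return None
    # Timestamped name, so repeated runs do not collide with older backups.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = target.with_name(f"Service Worker.corrupt-{stamp}")
    shutil.move(str(target), str(backup))
    return backup


# Example call, assuming the default Linux data directory
# (check :version for the real location on your system):
#   move_service_worker_dir(Path.home() / ".local" / "share" / "qutebrowser")
```

Once the backup is no longer needed, it can be deleted by hand.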

As a permanent workaround, run :set qt.workarounds.remove_service_workers true. This makes qutebrowser automatically remove the data (corrupted or not) on every start. Be aware that this will negatively impact the startup time (especially on an HDD) and could in theory delete data supposed to be persistent (see above). Please also consider first only removing the directory manually, in order to help track down what exactly causes the problem to appear again.
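
For users who keep their settings in config.py rather than using :set, the equivalent would presumably be the fragment below. Treat it as a sketch: `c` is the config object that qutebrowser provides inside config.py, so the line only works there and not as a standalone script.

```python
# config.py fragment (only meaningful when loaded by qutebrowser, which provides `c`).
# Equivalent of: :set qt.workarounds.remove_service_workers true
c.qt.workarounds.remove_service_workers = True
```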


Original issue

Apparently QtWebEngine 5.15 still has some kind of issue with service workers. This is mostly a continuation of #4853 and #5279, i.e. a bug (or several) which has been following us in one way or another for over a year now 😢

The reproducer from #5279 with the Epic Games Store doesn't seem to reproduce the issue this time around, and the --disable-shared-workers workaround which is active for Qt 5.14 doesn't seem to help, according to a Reddit post.

Analyzing the crash with DebugDiag yields an unusable stacktrace (full log):

DetailID = 3
        Count:    2
        Exception #:  0XC0000005
        Stack:        
                Qt5WebEngineCore!GetHandleVerifier+0x1cf7e36
                Qt5WebEngineCore!GetHandleVerifier+0x1c6cd66
                Qt5WebEngineCore!QWebEngineUrlScheme::QWebEngineUrlScheme+0x311f13
                Qt5WebEngineCore!GetHandleVerifier+0x1d0f95a
                Qt5WebEngineCore!QtWebEngineCore::JavaScriptDialogController::qt_static_metacall+0x29ab3
                Qt5WebEngineCore!QWebEngineUrlSchemeHandler::qt_metacast+0x17d65
                Qt5WebEngineCore!GetHandleVerifier+0x92bf
                Qt5WebEngineCore!GetHandleVerifier+0x8dd2
                Qt5WebEngineCore!QWebEngineUrlSchemeHandler::qt_metacast+0xc964
                Qt5WebEngineCore!QWebEngineUrlSchemeHandler::qt_metacast+0xd1d2
                Qt5WebEngineCore!GetHandleVerifier+0x9762
                Qt5WebEngineCore!QWebEngineUrlSchemeHandler::qt_metacast+0x40059
                Qt5WebEngineCore!GetHandleVerifier+0x192aadd
                Qt5WebEngineCore!QWebEngineUrlSchemeHandler::qt_metacast+0x434d5
                Qt5WebEngineCore!QWebEngineUrlSchemeHandler::qt_metacast+0xe1d8
                KERNEL32!BaseThreadInitThunk+0x14
                ntdll!RtlUserThreadStart+0x21

However, kiburtse of the Qt Company did some crazy work at hand-decoding it:

base\threading\platform_thread_win.cc @ 108                                     base::`anonymous namespace'::ThreadFunc
base\threading\thread.cc @ 379                                                  base::Thread::ThreadMain
content\browser\browser_process_sub_thread.cc @ 135                             content::BrowserProcessSubThread::IOThreadRun
base\run_loop.cc @ 158                                                          base::RunLoop::Run
base\task\sequence_manager\thread_controller_with_message_pump_impl.cc @ 471    base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run
base\message_loop\message_pump_win.cc @ 76                                      base::MessagePumpWin::Run
base\message_loop\message_pump_win.cc @ 623                                     base::MessagePumpForIO::DoRunLoop
base\task\sequence_manager\thread_controller_with_message_pump_impl.cc @ 221    base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoSomeWork
base\task\sequence_manager\thread_controller_with_message_pump_impl.cc @ 366    base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl
base\task\common\task_annotator.cc @ 142                                        base::TaskAnnotator::RunTask
base\bind_internal.h @ 644                                                      base::internal::Invoker<...>::RunOnce
content\browser\cache_storage\cache_storage_operation.cc @ 50                   content::CacheStorageOperation::Run
base\bind_internal.h @ 621                                                      base::internal::InvokeHelper<1,void>::MakeItSo<...> >
content\browser\cache_storage\legacy\legacy_cache_storage.cc @ 1424             content::LegacyCacheStorage::SizeImpl
content\browser\cache_storage\legacy\legacy_cache_storage_cache.cc @ 856        content::LegacyCacheStorageCache::Size

At this point I'm not 100% sure it's Windows specific (I think I've heard of people on Linux who have crashes which are solved by moving the Service Worker directory away), but it definitely seems to mostly happen on Windows.

To clarify: This isn't a bug in qutebrowser, I'm just opening this to have a place to track anything new we find out. Right now it'd still really help to have a reliable reproducer.

@The-Compiler The-Compiler added bug: segfault/crash/hang There's a low-level crash in C++, or a hang/freeze. component: QtWebEngine Issues related to the QtWebEngine backend, based on Chromium. os: Windows Issues which only happen on Windows. priority: 1 - middle Issues which should be done at some point, but aren't that important. labels Jul 30, 2020
@nancym

nancym commented Jul 30, 2020

I seem to have solved my qutebrowser crashes by putting this (and only this) line in my config.py:

config.source('C:\\Users\\username\\Sync\\qb\\config2.py')

I wrote about this in https://www.ii.com/qutebrowser-tips-fragments/#_compartmentalizing_config_py. Here's an excerpt:

Starting with qutebrowser v1.9.0, I experienced a lot of crashes in qutebrowser on Windows. After I moved all my config.py lines (other than the above config.source line) to my Sync directory, these crashes stopped.

Maybe this data point will help figure out what's causing these crashes.

@The-Compiler
Member Author

@nancym Hm, I don't see how that would make any difference. Did you by any chance also clean your qutebrowser data directory (or the service worker directory contained in it?) when doing that? Could you maybe reverse that change and see if you see those crashes again? I'd be quite surprised!

@nancym

nancym commented Jul 30, 2020

Yes, I quit using my qutebrowser default config and data directories and now use only my above-mentioned Sync directory. I spent literally tens (maybe hundreds!) of hours dealing with this, including downloading debugging tools from Microsoft. I agree that this is a weird solution, but it worked for me. Maybe ask others, e.g. the person in that Reddit thread, to try this. If I were a member of Reddit, I would post there…

My data directory used to contain only my userscripts. Now I call all my userscripts with a full (not relative) path to my Sync directory - maybe this is the secret?!

@The-Compiler
Member Author

Then the thing fixing this issue was deleting the data directory, not the config change. The problem is that something in there (the service worker cache) gets corrupted and causes those crashes (as soon as a website uses a service worker) until the corrupted data gets removed. Then it works again until it mysteriously gets corrupted again.

@nancym

nancym commented Jul 30, 2020

Aha, so a workaround that may work for others is to do whatever is needed to put stuff that normally goes in the data directory elsewhere. For me, that was to specify all my userscripts with a full (not relative) path pointing at my Sync directory.

I just looked at my qutebrowser default data directory and it has a lot in it, so maybe only userscripts/ needs to be moved. I also still think that moving my config.py (and all my backup config.py files) had something to do with my solution.

@The-Compiler
Member Author

No, the problem is in the webengine/Service Worker folder in there (which can't be put elsewhere). I'm guessing you just fixed the issue by deleting the corrupted files there. I'm pretty sure that if whatever caused the corruption originally happened again, you'd have those crashes again.

The problem (and what's making this so hard to track down) is that the corruption seems to happen very rarely. Once it has happened, reproducing the crashes is easy (as they are somewhat common), but the crucial question is why that corruption is happening.
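
One way to gather data on when the corruption appears is to record the state of the Service Worker directory after each qutebrowser run and compare snapshots once crashes start. The sketch below is purely illustrative: `snapshot_dir` and `changed_files` are hypothetical helpers, and nothing in this thread confirms that such a comparison reveals the cause. It should be run while qutebrowser is closed.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple

Snapshot = Dict[str, Tuple[int, str]]


def snapshot_dir(root: Path) -> Snapshot:
    """Map each file's relative path to (size in bytes, SHA-256 hex digest)."""
    result: Snapshot = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            key = path.relative_to(root).as_posix()
            result[key] = (len(data), hashlib.sha256(data).hexdigest())
    return result


def changed_files(before: Snapshot, after: Snapshot) -> List[str]:
    """Return sorted relative paths that were added, removed, or modified."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

Saving one snapshot per session (for example as JSON) would make it possible to see which files changed in the run immediately before the first crash.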

@The-Compiler
Member Author

Hm, I also wonder how much this is related to a second corruption nobody has been able to explain so far, #5606 (and the issues linked there). @nancym, do you happen to remember if you ever saw an "Errors occurred while reading state" message (screenshot) before this started happening?

@The-Compiler
Member Author

Looks like this still sometimes happens (though rarely) on Linux: https://termbin.com/f9l1

@The-Compiler
Member Author

With 7d8fd50 I've now introduced a new qt.workarounds.remove_service_workers setting which can be set to true to nuke the service workers directory on every start.

I'm not really happy with that workaround (and I'd still hope someone gets a reliable reproducer for this some day, so that I can report this properly to Qt upstream) - but I guess for affected people it's a better escape hatch than having to write a wrapper script around qutebrowser.

@The-Compiler The-Compiler changed the title from "Service Worker related crashes on Windows" to "Service Worker related crashes" Mar 2, 2021
@nixargh

nixargh commented Mar 17, 2021

Moving 'Service Workers' directory fixed my issue, thank you.

Off-topic: Only today I've realized what a great job you're doing when my qutebrowser died and I had to use others... what a pain in the neck it was.

@The-Compiler The-Compiler removed the os: Windows Issues which only happen on Windows. label Apr 19, 2021
@The-Compiler
Member Author

In this Reddit thread, u/Cognhuepan says they're on Fedora 34 and:

Okay, I've just installed upgrades, and the system forced me to restart to install these updates closing qutebrowser and producing a new corrupt basedir.

@The-Compiler
Member Author

Another possible lead from this reddit thread:

The websites I had open while testing it included Twitter, Reddit, Youtube, and Google.

and then shutting down with shutdown now on Archlinux.

@The-Compiler The-Compiler added priority: 0 - high Issues which are currently the primary focus. and removed priority: 1 - middle Issues which should be done at some point, but aren't that important. labels Jun 4, 2021
@edi9999
Contributor

edi9999 commented Feb 7, 2022

I have the same issue, about two or three times per day, and usually have a Gmail or Twitter tab open when it happens.

I'm on linux (ubuntu 20.04), so it also happens there.

I'm trying the fix :set qt.workarounds.remove_service_workers true, if I don't get back any time soon it means that it does fix the issue.

@The-Compiler
Member Author

So it looks like :set qt.workarounds.remove_service_workers true is an appropriate workaround for this for many people - yet most don't know about it obviously, and I continue to get many unusable qutebrowser crash reports with segfaults every day.

We should probably look into changing the crash dialog for qutebrowser so that it proposes this as a solution for segfaults rather than offering to submit a crash report - almost nobody seems to add information to those anyways...

@The-Compiler The-Compiler added this to the v3.0.0 milestone Apr 3, 2022
@leo848

leo848 commented Jun 30, 2022

Had the same issues and moving that directory fixed it. I probably sent 50+ bug reports, these can likely be reduced to this single issue.

> We should probably look into changing the crash dialog for qutebrowser so that it proposes this as a solution for segfaults rather than offering to submit a crash report - almost nobody seems to add information to those anyways...

I agree, that would probably help users who don't investigate this issue much further. Because running qutebrowser with --temp-basedir did fix the issue, I thought for a long time that the problem was with my config, until I searched for more issues.
