Freezing and crashes in 4th tutorial #6442

Open
bunnybot opened this issue Apr 30, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@bunnybot
palino (mirrored from Codeberg)
Created on Tue Apr 30 14:30:57 CEST 2024 by Pavol Gono (palino)


Describe the bug
I am experiencing game freezes and crashes. I first noticed them shortly before the 1.2 release, while translating the 4th (economy) tutorial.
On my PC, the combination of mingw + 4th tutorial seems to produce frequent problems of a random nature.
Most frequent are game freezes (deadlocks?); my rough estimate is that about 1 in 4 playthroughs of the 4th tutorial freezes.
I've experienced two crashes, where the game exited silently without any suspicious line on stdout. Luckily, the second time I was running widelands under gdb and noticed a strange exit code 3. I couldn't find such a return value in the sources.
And finally, I've seen "Error in Lua Coroutine" once.

All attached logs are generated when playing 4th tutorial, from the same PC.

To reproduce
Play the 4th tutorial; a freeze or crash happens sometimes.

Expected behavior
Stable gameplay without freezing and crashes.

Crash log

  • 2 freezing examples, one Report.wer
  • 2 crash examples, one with related gdb output, one with Report.wer
  • "Error in Lua Coroutine" example

Version:

  • OS: "Windows 10 64-bit", also reproduced on "Windows 7 64-bit" PC once
  • Widelands Version: between 1.3git26591 (46b2005@master) and 1.3git26620 (9f2be64@master); both Debug and Release builds
  • Enabled Add-Ons: no

Additional context
I have a feeling that clicking with the mouse outside of modal dialogs triggers these problems more frequently, but I am not sure.
I also have frequent game saving turned on (every minute or two); maybe it has an influence.
May be related to https://codeberg.org/wl/widelands/issues/4750 - the same environment.

Question
For the crash cases, running gdb on a Debug build seems useless.
Do you think "Structured Exception Handling" could catch the crashes?
https://learn.microsoft.com/en-us/cpp/cpp/structured-exception-handling-c-cpp?view=msvc-170
Or could just removing the "#ifdef NDEBUG" at the end of main.cc improve it?

@bunnybot added the "bug" label on Apr 30, 2024
@bunnybot
hessenfarmer (mirrored from Codeberg)
On Tue Apr 30 15:08:50 CEST 2024, Stephan Lutz (hessenfarmer) wrote:


Well, I believe the linked bug #6388 is unrelated, as it is a clear Lua error. I will try to fix that in the other thread.
If you use gdb and the game crashes, you need to run "bt" to get the backtrace, which shows the crashed task. If you encounter this again, that information would be very much appreciated.

@bunnybot
palino (mirrored from Codeberg)
On Tue Apr 30 15:25:56 CEST 2024, Pavol Gono (palino) wrote:


> Well, I believe the linked bug #6388 is unrelated, as it is a clear Lua error. I will try to fix that in the other thread.
> If you use gdb and the game crashes, you need to run "bt" to get the backtrace, which shows the crashed task. If you encounter this again, that information would be very much appreciated.

Look at stdout_crash_example_debug.gdb.txt. By the time the gdb prompt was available, the application had already exited (with exit value 3), so no backtrace was available. Under mingw, an application crash behaves differently than on Linux.

@bunnybot
frankystone (mirrored from Codeberg)
On Tue Apr 30 18:37:37 CEST 2024, frankystone wrote:


The verbose log says, e.g.:

[...]
[00:15:34.028 real] DEBUG: WARNING: Initializer thread locking mutex Objects, already waiting for 3020 ms

I've now played the 4th tutorial with Widelands version 1.2 on Linux without any issues.

@bunnybot
hessenfarmer (mirrored from Codeberg)
On Tue Apr 30 21:55:09 CEST 2024, Stephan Lutz (hessenfarmer) wrote:


OK, I can reliably reproduce the hangs. They happen after opening the ware statistics menu.
After adding a sleep(100) line after this line
https://codeberg.org/wl/widelands/src/commit/2fe75caf928c565bd42d6c258f9dfaedab3f7192/data/campaigns/tutorial04_economy.wmf/scripting/mission_thread.lua#L123
the game does not hang anymore. As we have multiple such lines, my guess is that we are trying to assign a new objective (the line after the linked line) before the old objective is done.
However, I do not know whether this is reasonable.

@bunnybot
bunnybot commented May 1, 2024

hessenfarmer (mirrored from Codeberg)
On Wed May 01 15:38:51 CEST 2024, Stephan Lutz (hessenfarmer) wrote:


@Nordfriese
Is my assumption somewhat reasonable? If yes, I'd just add the sleep time in the subfunction set_objective_done.
Perhaps I'll just prepare a PR and see whether it works for @palino.

@bunnybot changed the title from "Freezing and crashes in msys2 mingw64 environment" to "Freezing and crashes in 4th tutorial" on May 1, 2024
@bunnybot
bunnybot commented May 1, 2024

palino (mirrored from Codeberg)
On Thu May 02 00:16:42 CEST 2024, Pavol Gono (palino) wrote:


I am renaming the issue: from hessenfarmer's analysis, it seems the mission thread script was simply faulty and mingw has nothing to do with the problems.
I've already found out why my crash was silent and what exit value 3 means:
https://learn.microsoft.com/en-us/previous-versions/k089yyh0(v=vs.140)
https://sourceforge.net/p/mingw-w64/mailman/message/36195100/
So, to get a backtrace from such a crash under mingw, the proper gdb arguments are:
gdb ./widelands.exe -ex='set args --verbose' -ex='break abort' -ex=run
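
For illustration, here is a minimal C++ sketch of how a process could notice SIGABRT itself instead of relying on a gdb breakpoint on abort. It assumes the silent exit came from abort(), as the links and the 'break abort' breakpoint above suggest. The names record_abort, install_abort_recorder and last_recorded_signal are hypothetical and not taken from the Widelands sources; a real crash handler would write a backtrace rather than just store the signal number.

```cpp
#include <csignal>

// Hypothetical names for illustration only; not from the Widelands sources.
// The handler only stores the signal number, which is one of the few things
// a signal handler may safely do.
static volatile std::sig_atomic_t g_last_signal = 0;

extern "C" void record_abort(int sig) {
    g_last_signal = sig;
}

// Installs the recorder for SIGABRT; returns false if installation failed.
bool install_abort_recorder() {
    return std::signal(SIGABRT, record_abort) != SIG_ERR;
}

// Returns the last signal seen by the recorder, or 0 if none was seen.
int last_recorded_signal() {
    return g_last_signal;
}
```

Note that std::abort() still terminates the process after such a handler returns, so the handler is only a last chance to leave a trace. Calling std::raise(SIGABRT) instead returns to the caller once the handler has run, which makes the sketch easy to try out.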

@bunnybot
bunnybot commented May 1, 2024

palino (mirrored from Codeberg)
On Thu May 02 01:56:17 CEST 2024, Pavol Gono (palino) wrote:


I've analysed crash handling in various scenarios and compared mingw with Linux. The result: under mingw it is bad; without gdb there is practically no clue which part of the code crashed.
The first step to fix it is implementing Nordfriese's TODO in src/main.cc - https://stackoverflow.com/a/26398082 .
What I would like to change is to make these two possibilities configurable (e.g. via the command line):

  • widelands should generate crash reports + backtraces when possible (suitable especially for windows)
  • widelands should propagate signals & exceptions unaltered to retain original stack for core dumps or other mechanisms
    (e.g. main.cc's condition #ifdef NDEBUG to be changed)
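
A rough sketch of how such a switch could be parsed, purely as an assumption: the option name --crash-mode, its values report and passthrough, and the names CrashMode and parse_crash_mode are all made up for illustration and do not exist in Widelands.

```cpp
#include <string>
#include <vector>

// Hypothetical: the two behaviours proposed in the list above.
enum class CrashMode {
    kReport,      // generate crash reports + backtraces when possible
    kPassThrough  // leave signals & exceptions unaltered (core dumps, debuggers)
};

// Scans the arguments for the made-up option "--crash-mode=<value>".
// Unknown values and a missing option leave the fallback untouched;
// if the option is given several times, the last valid one wins.
CrashMode parse_crash_mode(const std::vector<std::string>& args, CrashMode fallback) {
    const std::string prefix = "--crash-mode=";
    CrashMode mode = fallback;
    for (const std::string& arg : args) {
        if (arg.compare(0, prefix.size(), prefix) != 0) {
            continue;  // not our option
        }
        const std::string value = arg.substr(prefix.size());
        if (value == "report") {
            mode = CrashMode::kReport;
        } else if (value == "passthrough") {
            mode = CrashMode::kPassThrough;
        }
    }
    return mode;
}
```

With something like this, the compile-time #ifdef NDEBUG condition could become a runtime decision, with the fallback chosen per platform (for example, reporting by default on Windows).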

I see the following comment in main.cc, which I don't fully understand:
> We can't handle SIGABRT like this since we have to redirect that one elsewhere to suppress non-critical errors from Eris.

If Eris generates an abort signal, this is a terminal situation anyway and the application will stop, regardless of the handler.
The SIGABRT signal handler is only changed inside MapScriptingPacket::read(), so SIG_DFL could be replaced by a handler that produces backtraces.
Or is the situation more complex?

I may have time to prepare a PR in the next few weeks, let's see.

@bunnybot
bunnybot commented May 2, 2024

Nordfriese (mirrored from Codeberg)
On Thu May 02 09:44:38 CEST 2024, Benedikt Straub (Nordfriese) wrote:


> I see the following comment in main.cc, which I don't fully understand:
> We can't handle SIGABRT like this since we have to redirect that one elsewhere to suppress non-critical errors from Eris.
> If Eris generates an abort signal, this is a terminal situation anyway and the application will stop, regardless of the handler.
> The SIGABRT signal handler is only changed inside MapScriptingPacket::read(), so SIG_DFL could be replaced by a handler that produces backtraces.
> Or is the situation more complex?

SIGABRT is not necessarily a terminal situation. When Eris encounters a globals.dump that it can't deserialize, it crashes by calling abort(), but we set a custom signal handler that simply throws a WException, which cancels loading the problematic savegame and then allows you to keep using Widelands normally without an application crash.

But yes, this is the only place where we redirect SIGABRT, so replacing SIG_DFL with the custom crash handler from main.cc in MapScriptingPacket::read would be sufficient.
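
One way to picture this, as a sketch only: a small RAII guard that installs a temporary SIGABRT handler and, on scope exit, puts back whatever handler was active before instead of a hard-coded SIG_DFL. The names ScopedAbortHandler, crash_handler, loading_handler and current_abort_handler are hypothetical, and both handlers are empty placeholders; this is not how MapScriptingPacket::read is actually written.

```cpp
#include <csignal>

using SignalHandler = void (*)(int);

// Hypothetical guard: swaps in a temporary SIGABRT handler for the lifetime
// of the object and restores the previously installed handler afterwards.
class ScopedAbortHandler {
public:
    explicit ScopedAbortHandler(SignalHandler temporary)
        : previous_(std::signal(SIGABRT, temporary)) {}

    ~ScopedAbortHandler() {
        if (previous_ != SIG_ERR) {
            std::signal(SIGABRT, previous_);
        }
    }

    ScopedAbortHandler(const ScopedAbortHandler&) = delete;
    ScopedAbortHandler& operator=(const ScopedAbortHandler&) = delete;

private:
    SignalHandler previous_;  // handler that was active before the guard
};

// Placeholder handlers; real ones would write a backtrace or cancel loading.
extern "C" void crash_handler(int) {}
extern "C" void loading_handler(int) {}

// Helper for inspection: reads the current SIGABRT handler by briefly
// swapping in SIG_DFL and immediately putting the original back.
SignalHandler current_abort_handler() {
    SignalHandler handler = std::signal(SIGABRT, SIG_DFL);
    std::signal(SIGABRT, handler);
    return handler;
}
```

If the global crash handler is installed once at startup, a guard like this inside the loading code would leave it active again after loading finishes, whether loading succeeded or was cancelled by an exception.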

@bunnybot
palino (mirrored from Codeberg)
On Fri May 10 14:34:02 CEST 2024, Pavol Gono (palino) wrote:


I've prepared an AutoHotkey script which simplifies testing (under Windows, in the foreground; mouse and keyboard should not be used while it runs). If you find it useful, I can create a PR to include it somewhere in git.
These game settings are necessary:

  • Language: English
  • Window Size: 1600 x 900
  • Start building road after placing a flag: true

Download the v2.0 exe from https://www.autohotkey.com/download/ and install it. After getting to the main menu of Widelands, you can double-click the wct4.ahk script to start the automatic clicking.

With a release build, it reproduces the issue only very seldom (manual clicking seems more effective). But with a debug build, when I run widelands under mingw's gdb, the reproducibility is 100%.
I've also tried an additional 1 second sleep inside set_objective_done(), but it didn't help; it still freezes. See the attached patch and hang outputs.
