ldc 1.35.0+ crashes #4523

Open
andrey-zherikov opened this issue Nov 5, 2023 · 14 comments

@andrey-zherikov

ldc started crashing on Windows with the 1.35.0 release (the nightly build crashes as well). The crash happens in different places, and sometimes compilation succeeds.

Getting the code to reproduce the issue:

git clone https://github.com/andrey-zherikov/argparse
cd argparse
git checkout 2.x

Running ldc:

ldc2.exe -c -d-debug -cov -g -unittest -w --oq -i -Isource source\argparse\package.d -vcolumns

Result (crash output is truncated):

>C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe -c -d-debug -cov -g -unittest -w --oq -i -Isource source\argparse\package.d -vcolumns

core.exception.AssertError@D:\a\ldc\ldc\dmd\dtemplate.d(6179): Assertion failure
----------------
0x00007FF7799017F3
0x00007FF7799017F3
0x00007FF7799051D0
0x00007FF7798B9BE0
0x00007FF775E68EF4
...

>C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe -c -d-debug -cov -g -unittest -w --oq -i -Isource source\argparse\package.d -vcolumns
C:\d\ldc2-8399e47c-windows-multilib\bin\..\import\core\sys\windows\basetsd.d-mixin-48(1420581280,435): Error: `goto case` not in `switch` statement
Exception Code: 0xC0000005
0x00007FF775E864BA, C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe(0x00007FF775E40000) + 0x464BA byte(s)
0x00007FF779AC9D75, C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe(0x00007FF775E40000) + 0x3C89D75 byte(s)
0x00007FF779AC3D35, C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe(0x00007FF775E40000) + 0x3C83D35 byte(s)
0x00007FF779ACF9FE, C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe(0x00007FF775E40000) + 0x3C8F9FE byte(s)
0x00007FF779ACE89E, C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe(0x00007FF775E40000) + 0x3C8E89E byte(s)
...

>C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe -c -d-debug -cov -g -unittest -w --oq -i -Isource source\argparse\package.d -vcolumns

core.exception.AssertError@D:\a\ldc\ldc\dmd\dtemplate.d(8068): Assertion failure
----------------
0x00007FF7799017F3
0x00007FF7799017F3
0x00007FF7799051D0
0x00007FF7798B9BE0
0x00007FF775E68EF4
0x00007FF779A2FB08
0x00007FF77991973C
0x00007FF77990DE74
...

>C:\d\ldc2-8399e47c-windows-multilib\bin\ldc2.exe -c -d-debug -cov -g -unittest -w --oq -i -Isource source\argparse\package.d -vcolumns

core.exception.AssertError@D:\a\ldc\ldc\dmd\dscope.d(203): Assertion failure
----------------
0x00007FF7799017F3
0x00007FF7799017F3
0x00007FF7799051D0
0x00007FF7798B9BE0
0x00007FF775E68EF4
...


@JohanEngelen
Member

Thanks for the report.
Can you help us by reducing the testcase using dustmite? (https://github.com/CyberShadow/DustMite/wiki)
Much appreciated.

@kinke
Member

kinke commented Nov 6, 2023

Hmm, I can't reproduce any problem after 10 serial attempts:

$ for i in {1..10}; do /c/temp/ldc2-8399e47c-windows-multilib/bin/ldc2 -c -d-debug -cov -g -unittest -w --oq -i -Isource source/argparse/package.d -vcolumns && echo success || echo failure; done
success
success
success
success
success
success
success
success
success
success

I've seen that the peak RAM is about 6.4 GB; maybe you're running too low? Although failing frontend assertions (and different ones!) because of insufficient memory don't seem to make a lot of sense.
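
For an intermittent failure like this one, the serial-attempts loop above can be extended to tally exit codes instead of only printing success or failure. A minimal Python sketch follows; `tally_runs` is a hypothetical helper name, and the default of 10 attempts simply mirrors the loop above.

```python
import subprocess
import sys
from collections import Counter


def tally_runs(cmd, attempts=10):
    """Run `cmd` `attempts` times and count how often each exit code occurs.

    `tally_runs` is a hypothetical helper, not part of LDC or its tooling.
    `cmd` is an argument list as accepted by subprocess.run.
    """
    codes = Counter()
    for _ in range(attempts):
        # Capture output so repeated crash dumps do not flood the terminal.
        result = subprocess.run(cmd, capture_output=True)
        codes[result.returncode] += 1
    return codes


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    for code, n in sorted(tally_runs(sys.argv[1:]).items()):
        print(f"exit code {code}: {n} run(s)")
```

Passing the full ldc2 command line after the script name would then show how many of the runs ended with a non-zero exit code.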

@andrey-zherikov
Author

It's not reproducible in 100% of cases even on my laptop, but I managed to start DustMite; hopefully it will produce some results.
Note that this issue fails CI in my project.
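
Because the crash is not reproducible on every run, a DustMite test command probably needs to retry the compiler a few times before deciding that a reduction no longer reproduces. DustMite treats exit status 0 from the test command as "still reproduces". The sketch below is one way to do that in Python. The marker strings are taken from the crash output quoted in this thread. The names `looks_like_crash` and `reproduces` and the default of 5 attempts are assumptions.

```python
import subprocess
import sys

# Substrings taken from the crash output quoted in this thread.
CRASH_MARKERS = ("Assertion failure", "Exception Code:")


def looks_like_crash(output):
    """Return True if the compiler output contains one of the crash markers."""
    return any(marker in output for marker in CRASH_MARKERS)


def reproduces(cmd, attempts=5):
    """Return True as soon as any of `attempts` runs of `cmd` looks like a crash.

    Hypothetical helper; 5 attempts is an arbitrary choice for a flaky crash.
    """
    for _ in range(attempts):
        result = subprocess.run(
            cmd, capture_output=True, text=True, errors="replace"
        )
        if looks_like_crash(result.stdout + result.stderr):
            return True
    return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Exit status 0 tells DustMite that the problem still reproduces.
    sys.exit(0 if reproduces(sys.argv[1:]) else 1)
```

Matching a narrower marker, for example one specific assertion location, reduces the chance that the reduction drifts toward an unrelated failure that happens to print a similar message.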

@kinke
Member

kinke commented Nov 6, 2023

> Note that this issue fails CI in my project.

Note that the Windows GHA runners have 7 GB of RAM, so that's very close: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

@andrey-zherikov
Author

> Hmm, I can't reproduce any problem after 10 serial attempts:

Could you try cmd?

> I've seen that the peak RAM is about 6.4 GB; maybe you're running too low? Although failing frontend assertions (and different ones!) because of insufficient memory don't seem to make a lot of sense.

I have enough RAM to hold 6.4 GB :)
Also, IIRC an "out of memory" crash looks different: not like sporadic asserts.

@kinke
Member

kinke commented Nov 6, 2023

> Could you try cmd?

Nope, I don't see any point. I have used a cmd shell before the bash loop, for one or two successful runs. Also checked it here on Linux, no problems whatsoever.

@andrey-zherikov
Author

> > Note that this issue fails CI in my project.
>
> Note that the Windows GHA runners have 7 GB of RAM, so that's very close: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

Interesting... I'll take a closer look at it. On the other hand, it doesn't fail on Linux with the same 7 GB of RAM - seems I'm lucky there 😅

@andrey-zherikov
Author

DustMite is still running (it's at the "depth 14" step) and I observed a few things:

  • sometimes the ldc2 process hangs: it allocates <100K of RAM and stops (i.e. doesn't exit)
  • a normal compilation definitely uses 5.5-6 GB of RAM
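
Since a hung ldc2 process would stall any automated reduction or CI job, one option is to wrap the compiler call in a timeout and treat a timeout as its own outcome. A minimal Python sketch follows; `run_with_timeout` is a hypothetical helper, and the 600-second default is an arbitrary assumption that would need tuning to the real compile time.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=600):
    """Run `cmd`; return ("hang", None) on timeout, else ("exit", returncode).

    Hypothetical helper; the 600 s default is an assumption, not a measured value.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout.
        return ("hang", None)
    return ("exit", result.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    print(run_with_timeout(sys.argv[1:]))
```

Recording hangs separately from non-zero exit codes keeps the two symptoms apart, which may matter if they turn out to have different causes.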

@kinke
Member

kinke commented Nov 8, 2023

Have you tried another box? We at Symmetry are using a build which is very close to v1.35.0, and this implies that we of course don't see any such weird issues.

@andrey-zherikov
Author

> Have you tried another box?

I only have my laptop and the GitHub runners. The main problem is that ldc fails on GH, and only on Windows. BTW, 1.34.0 works fine there.

@kinke
Member

kinke commented Nov 8, 2023

I'm pretty sure that the GHA issues on Windows are because of insufficient RAM. v1.35 came with slightly increased memory usage, about 5% at Symmetry IIRC. Have you seen failing assertions for the CI runners too, or just crashes with segfaults etc.?
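
To put the numbers quoted in this thread side by side, here is a rough back-of-the-envelope calculation in Python. It assumes that the roughly 5% increase seen at Symmetry also applies to this project and that the 6.4 GB peak measured earlier is representative of the CI run; neither is confirmed.

```python
RUNNER_RAM_GB = 7.0      # Windows GHA runner RAM, as cited in this thread
PEAK_V135_GB = 6.4       # peak RAM reported earlier in this thread
ASSUMED_INCREASE = 0.05  # assumption: the ~5% figure carries over to this project

# RAM left on the runner at peak, before the OS and other processes are counted.
headroom_gb = RUNNER_RAM_GB - PEAK_V135_GB

# Estimated peak for v1.34 if the same relative increase applied here.
estimated_v134_peak_gb = PEAK_V135_GB / (1 + ASSUMED_INCREASE)

print(f"headroom with v1.35: {headroom_gb:.1f} GB")
print(f"estimated v1.34 peak: {estimated_v134_peak_gb:.1f} GB")
```

Under these assumptions the runner has well under 1 GB to spare at peak, and the operating system and other processes also need part of that.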

@andrey-zherikov
Author

> Have you seen failing assertions for the CI runners too

My CI runners are GitHub runners.

Short update on the issue: DustMite is still running (I'll leave it running until Monday and then drop the effort if it hasn't finished).

One more observation: I ran ldc2 (1.35.0) and dmd (2.105.3) under valgrind on Ubuntu WSL, compiling the following code:

void main()
{
}

Result of valgrind ldc2 -c -g -w -i simple.d -vcolumns:

==354== HEAP SUMMARY:
==354==     in use at exit: 11,136,387 bytes in 11,685 blocks
==354==   total heap usage: 25,556 allocs, 13,871 frees, 13,496,583 bytes allocated
==354==
==354== LEAK SUMMARY:
==354==    definitely lost: 135,050 bytes in 2,061 blocks
==354==    indirectly lost: 29,664 bytes in 857 blocks
==354==      possibly lost: 32 bytes in 1 blocks
==354==    still reachable: 10,971,641 bytes in 8,766 blocks
==354==                       of which reachable via heuristic:
==354==                         multipleinheritance: 1,048,512 bytes in 1 blocks
==354==         suppressed: 0 bytes in 0 blocks
==354== Rerun with --leak-check=full to see details of leaked memory
==354==
==354== Use --track-origins=yes to see where uninitialised values come from
==354== For lists of detected and suppressed errors, rerun with: -s
==354== ERROR SUMMARY: 7553 errors from 737 contexts (suppressed: 0 from 0)

Result from valgrind dmd -c -g -w -i simple.d -vcolumns:

==351== HEAP SUMMARY:
==351==     in use at exit: 10,167,178 bytes in 7,434 blocks
==351==   total heap usage: 11,267 allocs, 3,833 frees, 10,876,450 bytes allocated
==351==
==351== LEAK SUMMARY:
==351==    definitely lost: 102,181 bytes in 1,499 blocks
==351==    indirectly lost: 24,608 bytes in 670 blocks
==351==      possibly lost: 1,135,782 bytes in 1,240 blocks
==351==    still reachable: 8,904,607 bytes in 4,025 blocks
==351==         suppressed: 0 bytes in 0 blocks
==351== Rerun with --leak-check=full to see details of leaked memory
==351==
==351== For lists of detected and suppressed errors, rerun with: -s
==351== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Is it expected that ldc has so many memory errors?
Please see the full valgrind log for ldc: valgrind.log

@andrey-zherikov
Author

DustMite finally reached a conclusion.
Here is the reduced test case: https://gist.github.com/andrey-zherikov/87675a4d6f4db62368c7b4d71df474a5
Here is the output:

>C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe package.d
package.d(34): Error: CTFE internal error: literal `TextStyle`
Exception Code: 0xC000001D
0x00007FF72AF69B39, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2FA9B39 byte(s)
0x00007FF72AF69D11, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2FA9D11 byte(s)
0x00007FF72B1556F3, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x31956F3 byte(s)
0x00007FF72AF987F0, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2FD87F0 byte(s)
0x00007FF72AF7EFBE, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2FBEFBE byte(s)
0x00007FF72B19C955, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x31DC955 byte(s)
0x00007FF72B18094E, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x31C094E byte(s)
0x00007FF72B1BDCD9, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x31FDCD9 byte(s)
0x00007FF72AE3F052, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2E7F052 byte(s)
0x00007FF72AF501DA, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2F901DA byte(s)
0x00007FF72AF4FE2F, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2F8FE2F byte(s)
0x00007FF72AF50135, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2F90135 byte(s)
0x00007FF72AE384DD, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x2E784DD byte(s)
0x00007FF72B1EF4C4, C:\d\ldc2-1.35.0-windows-multilib\bin\ldc2.exe(0x00007FF727FC0000) + 0x322F4C4 byte(s)
0x00007FFF5CCF257D, C:\WINDOWS\System32\KERNEL32.DLL(0x00007FFF5CCE0000) + 0x1257D byte(s), BaseThreadInitThunk() + 0x1D byte(s)
0x00007FFF5E88AA58, C:\WINDOWS\SYSTEM32\ntdll.dll(0x00007FFF5E830000) + 0x5AA58 byte(s), RtlUserThreadStart() + 0x28 byte(s)

@kinke
Member

kinke commented Nov 14, 2023

That's a frontend ICE, happening with DMD on Linux too - I've tested v2.100-2.105 (failing for all of these). So most likely not what you wanted, but nevertheless showing a general frontend problem. Please file upstream.
