CoreCLR fails to run when mlock is unavailable #10568

omajid · 2018-06-25T15:56:37Z

CoreCLR uses mlock during startup and fails if mlock fails with EPERM. Generally, that's not a problem.

However, many Linux distributions are starting to use systemd-nspawn for building code. This creates a chroot where programs have restricted capabilities. Specifically they do not have CAP_IPC_LOCK, which means they can't use mlock.

When mlock doesn't work, coreclr fails to start. This shows up in an strace as something like:

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbd542bb000
mlock(0x7fbd542bb000, 4096)       = -1 EPERM (Operation not permitted)
write(2, "Failed to initialize CoreCLR, HR"..., 49) = 49

As a result, it is practically impossible to build coreclr in some Linux distribution build systems.
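
To check whether a given build environment is affected, the mmap/mlock sequence from the strace above can be reproduced on its own. This is a minimal sketch, not CoreCLR's code: `probe_mlock` is a hypothetical helper name, and it only reports whether locking a single anonymous page is permitted.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps one anonymous page and tries to lock it.
 * Returns 0 if mlock succeeds, otherwise the errno of the failing call
 * (EPERM is the value seen in the strace above). */
static int probe_mlock(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return EINVAL;

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    /* Capture errno before munmap can overwrite it. */
    int err = (mlock(p, (size_t)page) == 0) ? 0 : errno;

    munmap(p, (size_t)page);
    return err;
}
```

A return value of EPERM inside the build chroot, and 0 outside it, would point to the restricted capabilities described above.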


omajid · 2018-06-25T16:03:05Z

cc @tmds @alucryd

omajid · 2018-06-25T16:03:46Z

See dotnet/source-build#285 (comment) and rpm-software-management/mock#186 for examples of builds that hit this.

janvorli · 2018-06-26T18:43:00Z

The mlock is necessary for proper behavior of the FlushProcessWriteBuffers PAL function, which is crucial for ensuring reliable runtime suspension for GC. See https://github.com/dotnet/coreclr/blob/e6ebea25bea93eb4ec07cbd5003545c4805886a8/src/pal/src/thread/process.cpp#L3095-L3098 for a description of the reason.
On Linux 4.3 and higher, there is a sys_membarrier syscall that we could use as an alternate mechanism to implement FlushProcessWriteBuffers. Issue #4501 is tracking that. @sdmaclea tried to implement it and tested it on ARM64. He found that the performance was really bad: the running time of our ~11000 coreclr tests was about 50% longer. However, no testing was done on other hardware, so it was not clear whether the performance issue is ARM64 specific or an overall problem.
Interestingly enough, I've just discovered the following article describing performance issues with sys_membarrier: https://lttng.org/blog/2018/01/15/membarrier-system-call-performance-and-userspace-rcu/. The reason is that the syscall internally waits until all running threads on the system have gone through a context switch, which could take tens of milliseconds. But the good news mentioned in this article is that starting with Linux 4.14, there is a new flag that can be passed to the sys_membarrier syscall that makes it use IPIs to implement the memory barrier semantics, and that is much faster. So we should give it a try.
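
For illustration, here is a minimal sketch of what using the Linux 4.14+ variant could look like. It assumes the new flag is MEMBARRIER_CMD_PRIVATE_EXPEDITED (the comment above does not name it) and that kernel headers from 4.14 or later are installed. The helper names are hypothetical, and this is not the change that was eventually made in CoreCLR. The syscall is invoked through syscall() rather than assuming a libc wrapper exists.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: raw membarrier syscall with no flags set. */
static int membarrier_call(int cmd)
{
    return (int)syscall(__NR_membarrier, cmd, 0u, 0);
}

/* Hypothetical helper: call once at startup.
 * Returns 1 if the expedited private barrier can be used, 0 otherwise
 * (old kernel, syscall blocked, or command not supported). */
static int expedited_barrier_init(void)
{
    int needed = MEMBARRIER_CMD_PRIVATE_EXPEDITED |
                 MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED;

    /* QUERY returns a bitmask of supported commands, or -1 on error. */
    int mask = membarrier_call(MEMBARRIER_CMD_QUERY);
    if (mask < 0 || (mask & needed) != needed)
        return 0;

    /* The process must register before issuing the expedited command. */
    return membarrier_call(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED) == 0;
}

/* Hypothetical helper: issues a memory barrier on the running threads
 * of this process. Returns 0 on success, -1 on failure. */
static int expedited_barrier(void)
{
    return membarrier_call(MEMBARRIER_CMD_PRIVATE_EXPEDITED);
}
```

When expedited_barrier_init() returns 0, a caller would have to fall back to another mechanism, such as the existing mlock-based one.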

jkotas closed this as completed in dotnet/coreclr#20949 Nov 20, 2018

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

dotnet locked as resolved and limited conversation to collaborators Dec 16, 2020
