Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: compatibility with openmp #93

Closed
kloetzl opened this issue Sep 2, 2017 · 19 comments
Closed

BUG: compatibility with openmp #93

kloetzl opened this issue Sep 2, 2017 · 19 comments

Comments

@kloetzl
Copy link

kloetzl commented Sep 2, 2017

The non-deterministic nature of multi-threading makes this one hard to debug, but I think, I have arrived at a rather reproducible test case seccomp_omp.c.gz. Compile the program with -O0 -ggdb -fopenmp.

If we execute it with just one thread and let it do the forbidden syscall, the program dies as expected.

$ ./seccomp_omp -t 1 -k 0      
86195973
[1]    10103 invalid system call (core dumped)  ./seccomp_omp -t 1 -k 0

However, given more threads it just ceases to function.

$ ./seccomp_omp -t 2 -k 0
86198868
86195973
86200317
^C

The program doesn't get killed, it just stops doing anything, at all. I am at a loss what happened to the SIGSYS signal, and why it doesn't get handled?

This might not be a libseccomp issue at all, but I am not proficient enough with seccomp, signals and the threading system in general to debug the root cause. Hope you can make some sense of it.

@pcmoore pcmoore changed the title compatibility with openmp BUG: compatibility with openmp Sep 5, 2017
@pcmoore pcmoore self-assigned this Sep 5, 2017
@pcmoore
Copy link
Member

pcmoore commented Sep 5, 2017

NOTE: I haven't looked at your test case yet, I'm just guessing based on your well-written description

Considering the multi-threaded nature of this, have you tried setting the SCMP_FLTATR_CTL_TSYNC filter attribute to true before loading the filter into the kernel?

@kloetzl
Copy link
Author

kloetzl commented Sep 6, 2017

I did not set SCMP_FLTATR_CTL_TSYNC because I add the seccomp filter, before splitting into threads. Thus, the kernel should apply the filter to all threads, anyway (and apparently it does do that). Adding SCMP_FLTATR_CTL_TSYNC to the testcase makes no difference (which is less than 100 lines with pedantic error handling, btw).

@pcmoore
Copy link
Member

pcmoore commented Sep 6, 2017

Okay, I just wanted to mention it; it sounds like you are doing things correctly (setting the filter before spawning new threads). I'll have to take a closer look, but I may not get a chance to do that very soon.

@kloetzl
Copy link
Author

kloetzl commented Sep 6, 2017

A confirmation that I am not doing things blatantly wrong is good enough for me. Take your time!

@drakenclimber
Copy link
Member

drakenclimber commented Feb 12, 2018

In the single-threaded example, ./seccomp_omp -t 1 -k 0, openmp recognizes that only a single thread is going to run, so openmp bypasses much of the synchronization that it would normally do in a multi-threaded for loop. I verified this by observing the syscalls that were processed in seccomp_run_filters() in the kernel. As one would expect, I saw a call to __NR_write() and a call to __NR_madvise() which prompted seccomp to instruct the kernel to kill the thread.

In the multi-threaded example, ./seccomp_omp -t 2 -k 0, openmp tries to parallelize the for loop. Thus much more of the openmp library is utilized. This is apparent when again watching syscalls through seccomp_run_filters(). __NR_mmap(), __NR_mprotect(), __NR_clone(), __NR_futex(), and more syscalls are called prior to the call to madvise(). Ultimately one of the two threads finally calls madvise() and seccomp properly kills that thread. But openmp has synchronization between the threads and before the second thread can call madvise() (and get killed), the second thread calls futex() as it is waiting on something from the dead thread. And this is why the program hangs. One thread has been killed and the second thread is waiting for data (from the dead thread) that will never arrive.

I have not worked much with openmp, but there seems to be some discussion [1, 2, 3, ...] on signals in parallel blocks. Ultimately it looks like a difficult problem that you would be best to avoid.

It seems like there are a couple potential easy solutions:

  1. Don't use SCMP_ACT_KILL as your default handler. Rather, switch to using an error code instead, e.g. SCMP_ACT_ERROR(5). I made this change in your example program, and verified that it terminated properly.

  2. If you don't need the power of openmp, another option may be to switch to using a more common solution like pthread_create() or fork()

@pcmoore
Copy link
Member

pcmoore commented Feb 13, 2018

@drakenclimber so it sounds like the problem is really just openmp making additional syscalls that may not normally be part of the filter? Or is that missing a larger point?

@kloetzl
Copy link
Author

kloetzl commented Feb 14, 2018

@drakenclimber Thank you very much for investing your time. Waiting for a dead thread explains the symptoms.

To give more context: I write scientific code, thus I like to keep things simple with OpenMP. However, I also want to protect my users from themselves. Thus if my program does anything out of the ordinary, just nuke it. I am guessing, what I ultimately want to achieve is to just kill the process (#96) with a somewhat reasonable error message.

What value should I use for errno in SCMP_ACT_ERROR to closely mimic SECCOMP_RET_KILL_PROCESS? Is there something that works fine across all syscalls?

@drakenclimber
Copy link
Member

drakenclimber commented Feb 14, 2018

@pcmoore - kinda. But I think the bigger issue is that openmp does not handle signals gracefully inside of its parallel construct. I would guess this is by design.

@kloetzl - No worries - you can totally stick with OpenMP. It looks like others in the OpenMP community are doing something like the following:

ctx = seccomp_init(SCMP_ACT_ERRNO(ENOTBLK));
...
bool kill_process = false;
int rc;
#pragma omp parallel for num_threads(THREADS)
  for (int i = 0; i < THREADS * 2; i++) {
    fprintf(stderr, "%u\n", func(i));
    if (i == K) {
      rc = madvise(NULL, 0, 0); 
      if (rc < 0)
        kill_process = true;
      }   
    // sleep(10);
  }

  if (kill_process)
    goto error;

  seccomp_release(ctx);
  return 0;
error:
  seccomp_release(ctx);
  return -1; 
}

As for what errno should SCMP_ACT_ERRNO() return, it's really up to you. SCMP_ACT_ERRNO() will work across all syscalls. And your program will be the one handling it and returning an error back to the user, so you can choose whatever errno works best for you. In fact, picking an obscure one - say ENOTBLK - could be a way that you know the error came from seccomp blocking the call.

tl;dr - Detect the problem in the parallel loop, but wait to handle it until after the loop completes. You may need to add extra defensive coding within the loop due to a failing syscall.

@drakenclimber
Copy link
Member

@kloetzl - I'll look into issue #96 and see how well that plays with OpenMP. That could be another solution as well. Thanks!

@kloetzl
Copy link
Author

kloetzl commented Feb 15, 2018

And your program will be the one handling it and returning an error back to the user, so you can choose whatever errno works best for you. In fact, picking an obscure one - say ENOTBLK - could be a way that you know the error came from seccomp blocking the call.

The thing is, madvise is called deep in the bowels of malloc. Thus, I am not the one handling the error, glibc is. So the question is, does glibc (and all of the other libraries doing syscalls) know that a syscall can return other errors than given in its man page and handle it appropriately?

@drakenclimber
Copy link
Member

The thing is, madvise is called deep in the bowels of malloc. Thus, I am not the one handling the error, glibc is. So the question is, does glibc (and all of the other libraries doing syscalls) know that a syscall can return other errors than given in its man page and handle it appropriately?

Ahhh... gotcha. I would have to look through the glibc code to be sure, but I would be willing to hazard the following guesses:

  • If madvise returns an error, glibc will definitely also return an error
  • It's definitely possible glibc could return a different error than seccomp returned, thus you should use caution if you are expecting a "custom" return code like ENOTBLK

Long story short, glibc should not suppress any error code, but it could return a different error code instead

@pcmoore
Copy link
Member

pcmoore commented Feb 16, 2018

I'm trying to keep up with you guys, but I fear my lack of background with OpenMP has me a bit behind ... based on the comments above it looks like KILL_PROCESS could be a workable solution here? If so, we can bump up the priority of that issue, although it is on my list of things to address before the v2.4 release.

@drakenclimber
Copy link
Member

@pcmoore - Yes, I believe KILL_PROCESS is a viable solution to this bug. I tested @kloetzl 's test program above using the KILL_PROCESS action and the hang no longer occurs.

I have it implemented but am currently hitting a couple snags in the python autotests. I should have a patch out next week.

@pcmoore
Copy link
Member

pcmoore commented Feb 16, 2018

@drakenclimber ooh, patches? I love patches :)

Thanks guys.

@kloetzl
Copy link
Author

kloetzl commented Feb 16, 2018

I can confirm that KILL_PROCESS solves the issue: I too hacked a local version of libseccomp on Arch and the test program fails as intended. However, providing solid tests and running them on different kernels might be the bigger challenge.

@pcmoore
Copy link
Member

pcmoore commented Feb 16, 2018

Thanks for the confirmation @kloetzl.

Yes, writing proper tests can be tedious, but it's important. Just as important as the code it test IMHO.

I'm looking forward to the PR from @drakenclimber, we can discuss things a bit more once that code is published.

@pcmoore
Copy link
Member

pcmoore commented Mar 19, 2020

I'm doing some COVID-19spring cleaning and I think we've resolved your issue, is that correct @kloetzl? I'm going to close this issue, but if I'm wrong and you are still seeing problems please let us know and we will reopen it!

@pcmoore pcmoore closed this as completed Mar 19, 2020
@kloetzl
Copy link
Author

kloetzl commented Mar 30, 2020

Thanks for reminding me, I tested a bit on my end.

This issue is mostly fixed, with a small snag. The program no longer hangs in limbo when an invalid syscall is encountered, yay! With SCMP_ACT_TRAP one can even produce the name of the bad syscall. However, OpenMP overwrites my signal handler, so it falls back to the default error message once multiple threads are spawned. From a perspective of user experience I don't like this behavior, but think it is as good as it gets.

@pcmoore
Copy link
Member

pcmoore commented Mar 30, 2020

Ah, yes, I'm not sure there is much we can about the signal handler getting overwritten in libseccomp, sorry about that.

Thanks for letting us know the rest of it worked!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants