Error happens when elf app doesn’t have a rseq_cs struct #6802

Open
VulnDetector opened this issue May 9, 2024 · 7 comments

Comments

@VulnDetector

Describe the bug
Running DynamoRIO to instrument any application on a FortiGate VM (a Linux-based platform) fails.
Running ./drrun -- test (with no client at all) results in an error: Restartable sequence behavior is not supported: struct rseq is not in static thread-local storage.

Note that running FortiGate's own applications, such as scp, results in the same error.

To Reproduce
Steps to reproduce the behavior:

  1. Root the FortiGate VM 7.2.4, then copy the DynamoRIO Linux release 10 and any test app to the VM.
  2. Precise command line for running the application: ./drrun -- test
  3. Exact output or incorrect behavior: Restartable sequence behavior is not supported: struct rseq is not in static thread-local storage

Expected behavior
No error, correct instrumentation.

Screenshots or Pasted Text

Restartable sequence behavior is not supported: struct rseq is not in static thread-local storage.

Versions

Additional context
This is the same rseq issue as described in #5431.

The problem might be related to rseq. I found that the test app has no rseq_cs struct for each rseq region, which is described in https://dynamorio.org/API_BT.html#sec_rseq as limitation 2. I tried the -disable_rseq option, but I got the same error.

In the discussion of #5431, abhinav92003 said: “The final issue I'm running into is I think because the struct rseq is not in the static TLS anymore. (It causes an EINVAL in rseq tests and "struct rseq is not in static thread-local storage" in burst tests). This is as documented at https://sourceware.org/pipermail/libc-alpha/2021-November/133221.html. Also, https://lwn.net/Articles/883104/ says that the rseq registered by glibc is stored in the Thread Control Block maintained by glibc. I'm working on modifying rseq_locate_tls_offset for this.” Function rseq_locate_tls_offset returns 0, and drrun then prints the error.
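
The quoted remark says that newer glibc registers rseq itself and keeps the area in its Thread Control Block. One way to check whether the C library on a given machine does this is to look for the rseq variables that glibc exports. The sketch below is only an illustration and rests on assumptions: that glibc 2.35 or later exports `__rseq_offset`, `__rseq_size` and `__rseq_flags`, and that a Python interpreter is available on the machine being checked (which may not be true on a FortiGate VM). The function name `glibc_rseq_info` is hypothetical.

```python
import ctypes


def glibc_rseq_info(libc_name="libc.so.6"):
    """Return glibc's exported rseq variables, or None if they are absent.

    Assumption: glibc 2.35+ exports __rseq_offset, __rseq_size and
    __rseq_flags; older glibc versions and other C libraries do not.
    """
    try:
        libc = ctypes.CDLL(libc_name)
    except OSError:
        return None  # Library not found (for example, a non-glibc system).
    try:
        offset = ctypes.c_ssize_t.in_dll(libc, "__rseq_offset").value
        size = ctypes.c_uint.in_dll(libc, "__rseq_size").value
        flags = ctypes.c_uint.in_dll(libc, "__rseq_flags").value
    except ValueError:
        return None  # This libc does not export the rseq variables.
    return {"offset": offset, "size": size, "flags": flags}


if __name__ == "__main__":
    info = glibc_rseq_info()
    if info is None:
        print("this libc does not export the rseq variables")
    else:
        # offset is relative to the thread pointer; as far as I understand,
        # a size of 0 means glibc did not register rseq for the thread.
        print(info)
```

If the variables are present with a non-zero size, the rseq area probably lives in glibc's thread control block rather than in the application's static TLS, which would match the situation described in the quote.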

@edeiana
Contributor

edeiana commented May 9, 2024

Hi @VulnDetector !

Can you post the full output that you get after running ./drrun -- test?
And the log when you add -loglevel 4 to it?

> copy any test app to the vm

Is test GNU test or your own application binary that you called test? If the latter, can you post it? Does the application obey all the constraints in https://dynamorio.org/API_BT.html#sec_rseq (e.g., is there rseq data somewhere else that the application uses)?

About getting the same error with -disable_rseq, the documentation says:
> [...] the -disable_rseq runtime option may be used to return ENOSYS, which can provide a workaround for applications which have fallback code for kernels where rseq is not supported.

@VulnDetector
Author

Thanks for your reply, @edeiana.

The full output that I get after running ./drrun -- test is:

<Application /DynamoRIO-Linux-10.0.0/bin64/test (4001). Restartable sequence behavior is not supported: struct rseq is not in static thread-local storage.>

The full output that I get when I add -loglevel 4 is:

<Application /DynamoRIO-Linux-10.0.0/bin64/test (4313). Option parsing error : unknown option -loglevel. Continuing>
<Application /DynamoRIO-Linux-10.0.0/bin64/test (4313). Restartable sequence behavior is not supported: struct rseq is not in static thread-local storage.>

test is my own app binary and, as noted, it does not obey all the constraints in https://dynamorio.org/API_BT.html#sec_rseq: it doesn't have an rseq_cs struct. So I followed https://dynamorio.org/API_BT.html#sec_rseq and set the -disable_rseq option, but it still produces the same output:

<Application /DynamoRIO-Linux-10.0.0/bin64/test (4001). Restartable sequence behavior is not supported: struct rseq is not in static thread-local storage.>

I think the problem occurs when I use DynamoRIO to run an app binary that doesn't have rseq support in an rseq-supported environment: DynamoRIO exits by itself.

@edeiana
Contributor

edeiana commented May 10, 2024

> I think the problem occurs when I use DynamoRIO to run an app binary that doesn't have rseq support in an rseq-supported environment: DynamoRIO exits by itself.

I can't test it on the "fortigate vm 7.2.4" and I am not able to reproduce it locally.
Does this happen for any binary in an rseq-supported environment? Even a simple "hello world"? Or do you have a simple example that showcases the problem?

Also, the reason for that message is that rseq_locate_tls_offset() returns 0 as the offset (offset = rseq_locate_tls_offset();).
That routine does quite a bit of logging, which you are not capturing, as you got:

<Application /DynamoRIO-Linux-10.0.0/bin64/test (4313). Option parsing error : unknown option -loglevel. Continuing>

My guess is that -loglevel 4 was not in the right place (or you don't have a DEBUG build of DynamoRIO). Please try: drrun -disable_rseq -loglevel 4 -- your_test_binary.
Also note that logged information is not printed to stderr or stdout. It is written to a file whose directory is printed in DynamoRIO's output, so you can find it there.

Is your goal to have your application work under DynamoRIO with -disable_rseq? Or are you looking for additional rseq support (hence the removal of some of the assumptions in https://dynamorio.org/API_BT.html#sec_rseq)?

@VulnDetector
Author

Thanks a lot, @edeiana.
My goal is to instrument the app binary regardless of whether the -disable_rseq option is set, and I don't need my app to have rseq support. This error happens for any binary in an rseq-supported environment, even a simple "hello world". So I think this might be the reason:

> When I use DynamoRIO to run an app binary that doesn't have rseq support in an rseq-supported environment, DynamoRIO exits by itself.

I used the DynamoRIO 10.0.0 release to run the command: drrun -loglevel 4 -- your_test_binary. When I tried to compile DynamoRIO from source, I encountered many compile errors.

By the way, there is another problem; could you give me some guidance? I ran drrun -attach pid -t drcov to attach to gedit, but it did not respond and did not create a drcov file. What is the correct usage of the -attach option, and when I use it, can I use the -c option to load a .so?

@derekbruening
Contributor

You need to add -debug for -loglevel to have any effect if you are using a packaged build containing both release and debug binaries. If you haven't been running a debug build at all, note that it enables a lot of checks, so it will help find problems earlier. In general it should be the first thing you run on hitting a problem: see https://dynamorio.org/page_debugging.html
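
The option advice in this thread (DynamoRIO options go before `--`, and `-debug` accompanies `-loglevel` on a packaged build) can be sketched as a small command builder. This is a minimal illustration, not part of DynamoRIO: the helper name `build_drrun_command` and its parameters are hypothetical, and only the flags already mentioned in this thread are used.

```python
import shlex


def build_drrun_command(app_argv, drrun="drrun", debug=True,
                        loglevel=None, disable_rseq=False):
    """Assemble a drrun command line with all DynamoRIO options before `--`.

    Hypothetical helper: it only arranges flags named in this thread
    (-debug, -disable_rseq, -loglevel) and does not validate them.
    """
    cmd = [drrun]
    if debug:
        # Per the advice above, -loglevel has no effect on a packaged
        # build unless -debug selects the debug binaries.
        cmd.append("-debug")
    if disable_rseq:
        cmd.append("-disable_rseq")
    if loglevel is not None:
        cmd += ["-loglevel", str(loglevel)]
    cmd.append("--")  # Everything after this belongs to the application.
    cmd += list(app_argv)
    return cmd


if __name__ == "__main__":
    cmd = build_drrun_command(["./test"], loglevel=4, disable_rseq=True)
    print(shlex.join(cmd))
```

Keeping the application and its arguments strictly after `--` avoids an option being passed to the application instead of to DynamoRIO.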

> I think the problem occurs when I use DynamoRIO to run an app binary that doesn't have rseq support in an rseq-supported environment: DynamoRIO exits by itself.

Pure-assembly apps with zero rseq usage run fine under DR so I do not think this is the problem.
You say that "Function rseq_locate_tls_offset returns 0", but rseq_locate_tls_offset() is only called if rseq_is_registered_for_current_thread() returns true, so it sounds like your app is indeed using rseq. If you run your app under strace, do you see the rseq syscall? It is strange that -disable_rseq did not work around whatever the problem is.

@VulnDetector
Author

Thanks for your reply, @derekbruening.
I ran the command drrun -debug -loglevel 4 -- ./test and got the debug log, which is attached as test.0.21571.log.
test.0.21571.log
Because I have no experience debugging DR, I can't work out what the problem is.

The FortiGate environment has no strace command, so I ran strace -c ./test in an Ubuntu VM instead. The result is in strace_result.txt.
strace_result.txt
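
Checking a `strace -c` summary for the rseq syscall can also be done mechanically. The sketch below assumes the usual summary layout, where each data row starts with a percentage and ends with the syscall name; the function name `syscalls_in_strace_summary` is hypothetical. The result only describes the machine where strace ran, so a summary captured on an Ubuntu VM does not necessarily reflect the FortiGate VM.

```python
import os


def syscalls_in_strace_summary(text):
    """Return the set of syscall names listed in `strace -c` summary text.

    Assumption: data rows begin with a numeric "% time" column and end
    with the syscall name; header, separator and "total" rows are skipped.
    """
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            float(fields[0])
        except ValueError:
            continue  # Header or separator line.
        if fields[-1] != "total":
            names.add(fields[-1])
    return names


if __name__ == "__main__":
    path = "strace_result.txt"  # File name taken from the attachment above.
    if os.path.exists(path):
        with open(path) as f:
            found = syscalls_in_strace_summary(f.read())
        print("rseq syscall seen" if "rseq" in found else "no rseq syscall")
```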

According to the result, the app does not call the rseq syscall. I should explain what I meant by:

> Function rseq_locate_tls_offset returns 0

I did not debug DR and observe the return value of rseq_locate_tls_offset(). I searched the DR source code for the error message and found the place that may produce it.

What you said is also my concern. I think it might be the environment (an rseq-supported environment) rather than the app binary itself that causes DR to call rseq_locate_tls_offset(), because every app on the FortiGate hits the same problem. The handling of the -disable_rseq option may also have a problem.

@derekbruening
Contributor

I would suggest either stepping through rseq_is_registered_for_current_thread in a debugger or adding printing to see what the rseq syscall returns in your fortigate vm. Does it return -EINVAL even when there is no other rseq registered?
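
The "add printing" suggestion can also be approximated outside DynamoRIO with a standalone probe that attempts a test rseq registration and reports the result. This is a sketch under several assumptions, and it is not DynamoRIO's code: the syscall numbers (334 on x86_64, 293 on aarch64), the 32-byte size and alignment of the original `struct rseq`, and the reading of the return values all come from my understanding of the Linux rseq ABI. It also assumes a Python interpreter exists on the target, which may not hold on a FortiGate VM; the same logic ports directly to C. The names `probe_rseq` and `interpret` are hypothetical.

```python
import ctypes
import errno
import platform

# Assumed rseq syscall numbers; only these two architectures are covered.
SYS_RSEQ = {"x86_64": 334, "aarch64": 293}
RSEQ_FLAG_UNREGISTER = 1
RSEQ_SIZE = 32  # Assumed original struct rseq size, also its alignment.


def interpret(ret, err):
    """Describe the outcome of a test rseq registration (ret, errno)."""
    if ret == 0:
        return "no rseq was registered for this thread (test registration succeeded)"
    if err == errno.ENOSYS:
        return "the kernel does not support rseq"
    if err == errno.EINVAL:
        return ("EINVAL: either another rseq area is already registered "
                "or the kernel rejected the arguments")
    return "unexpected errno: " + errno.errorcode.get(err, str(err))


def probe_rseq():
    """Try to register a scratch rseq area; return (ret, errno) or None."""
    nr = SYS_RSEQ.get(platform.machine())
    if platform.system() != "Linux" or nr is None:
        return None  # Platform not covered by this sketch.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    # Over-allocate a zeroed buffer and pick a 32-byte aligned address in it.
    buf = ctypes.create_string_buffer(RSEQ_SIZE * 2)
    addr = (ctypes.addressof(buf) + RSEQ_SIZE - 1) & ~(RSEQ_SIZE - 1)
    args = (ctypes.c_long(nr), ctypes.c_void_p(addr), ctypes.c_uint32(RSEQ_SIZE))
    ret = libc.syscall(*args, ctypes.c_int(0), ctypes.c_uint32(0))
    err = ctypes.get_errno() if ret != 0 else 0
    if ret == 0:
        # Undo the test registration so the kernel stops writing to buf.
        libc.syscall(*args, ctypes.c_int(RSEQ_FLAG_UNREGISTER), ctypes.c_uint32(0))
    return ret, err


if __name__ == "__main__":
    result = probe_rseq()
    print("platform not covered" if result is None else interpret(*result))
```

An EINVAL result here is ambiguous by design: it is reported both when a different rseq area is already registered for the thread and when the kernel rejects the arguments, which is exactly the distinction the question above is trying to draw.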
