Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAN fixes on Ubuntu 22 #4062

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

stevendore
Copy link
Contributor

@stevendore stevendore commented Feb 23, 2024

Fix up some issues when configuring varnish with ASAN, UBSAN and workspace emulator on Ubuntu22. I tested locally, with all tests passing, on Bookworm as well since that is what CCI uses for the SAN builds.

  1. 109543e- Initialize the VDP CTX in Req_new(). I think just using INIT_OBJ is okay here and aligns with the rest of the structs initialized in this section.
  2. 6ea6c93- Increase thread_pool_stack in e29.vtc. Without the increase the test would segfault on Ubuntu 22.
  3. 569c68a - Free value created from create_bogo_n_arg(). Let me know if there is a more desired way to structure the temporary value to free.

Ubsan error from Req_new():

In function ‘memset’,
    inlined from ‘Req_New’ at cache/cache_req.c:176:2:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:59:10: error: ‘__builtin_memset’ offset [0, 63] is out of the bounds [0, 0] [-Werror=array-bounds]
   59 |   return __builtin___memset_chk (__dest, __ch, __len,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   60 |                                  __glibc_objsize0 (__dest));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~

Ubsan error from exec_file:

vtc.c: In function ‘exec_file’:
vtc.c:602:9: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
  602 |         macro_def(vltop, NULL, "tmpdir", "%s", tmpdir);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[4]: *** [Makefile:851: varnishtest-vtc.o] Error 1

@stevendore stevendore force-pushed the ubuntu_san_fix branch 2 times, most recently from e3db989 to 569c68a Compare February 23, 2024 19:39
@@ -16,7 +16,7 @@ server s1 {
chunkedlen 0
} -start

varnish v1 -cliok "param.set thread_pool_stack 80k"
varnish v1 -cliok "param.set thread_pool_stack 180k"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change defeats the purpose of the test.
Should we skip it for ASAN runs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right that also makes sense, my mistake. I will happily update my PR either way but will try to find some time to test out other solutions before skipping on my own. I am guessing since asan just takes up more stack space and different amount on different systems, it is being cranky.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add either feature !asan, feature !msan or feature !sanitizer to the test case if there is no convenient solution for this one.

Copy link
Contributor Author

@stevendore stevendore Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it is possible or overly specific but would it make sense to only skip the test varnish was compiled with gcc and a sanitizer since it does pass fine on clang?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting back to this, it looks like using GCC (11.4.0) with both asan and ubsan causes the failure. When using either asan or ubsan alone the test passes. This test seems like a much more nice to have with sanitizers on. Any thoughts? Just deal with the failure by saying it should be running with clang when using sanitizers?

@@ -173,7 +173,7 @@ Req_New(struct sess *sp)
p = (void*)PRNDUP(p + sizeof(*req->vfc));

req->vdc = (void*)p;
memset(req->vdc, 0, sizeof *req->vdc);
INIT_OBJ(req->vdc, VDP_CTX_MAGIC);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only add a magic value in the respective struct's initializer, in this case VDP_Init() or such initializer does not exist as a function and initialization is basically just the INIT_OBJ(). This is the case for the other structs initialized here.

Can we find a better way to make sanitizers happy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, I will try some other options unless something specific is suggested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we find a better way to make sanitizers happy?

I understand that mempools already back our main workspaces, but my suggestion would be to make the mempools have their own "mpl" workspace. This way we could allocate struct req, struct busyobj, and other things as workspace allocations as we do today, and allocate the task workspace from whatever remains from the mempool workspace.

This way, when working with sanitizers, the workspace emulator would turn those pointer-arithmetic pseudo-allocations we have today into individual mallocs and we'd finally be able to trigger an overflow in structs such as req and busyobj.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helpful clarification from bugwash: What Dridi means is req = WS_Alloc(sizeof *req)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I haven't had much time to get back to this. I'm not sure initially where we'd get the workspace from. The session? Since this function is the one that initializes the workspace itself.

Noted in here: #4062 (comment) I realized the issue is more from using (my version of) GCC with ubsan than my distro itself. So using ZERO_OBJ instead of an explicit memset passes compilation fine. Though this feels more like tricking the compiler but the compiler was not fully right in the first place.

@nigoroll nigoroll self-assigned this Feb 26, 2024
@stevendore
Copy link
Contributor Author

stevendore commented Feb 29, 2024

Looking further into 109543e, the compilation errors only happen when using GCC. As for 6ea6c93 this passes fine with clang instead of GCC as well. I should have been using clang. I can remove these commit all together from the PR.

For 569c68a The test that triggered the leak under gcc but not clang. It looks pretty clearly like a leak technically so maybe clang is just being smarter?

Any opinions on how to move forward?

Edit: tests/u00000.vtc and tests/m00003.vtc are the tests that show the leak for the create_bogo_n_arg() leak on gcc only.

When running with ASAN and UBSAN on ubuntu 22 u00000.vtc and m00003.vtc
fails from a leak from the return value of create_bogo_n_arg() in
mgt_main.c. Freeing the returned value allows tests to pass.
This test fails when compiling varnish with GCC (11.4.0) with ASAN and
UBSAN enabled. This test passes when only one of ASAN or UBSAN are
enabled. It also passes when both are enabled when using clang.
@stevendore
Copy link
Contributor Author

Force pushed to use ZERO_OBJ instead of INIT_OBJ to clear req->vdc and skip e29.vtc when using sanitizers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants