Vulnerability: resource exhaustion attack when built with asan; denial of service when built with msan #1013

Open
iphydf opened this issue Mar 23, 2022 · 5 comments


iphydf (Contributor) commented Mar 23, 2022

When sending a message like dd ff 00 00 00 (a 5-byte array32 header declaring 0xff000000 elements), msgpack-c and msgpack-cpp allocate about 96GB of memory:

  1. Load a large 32-bit int (0xff000000):
    _msgpack_load32(uint32_t,n,&tmp);
  2. Call callback with the large int:
    ret = msgpack_unpack_callback(func)(user, count_, &stack[top].obj); \
  3. Compute allocation size (0xff000000 * 24):
    size = n * sizeof(msgpack_object);
  4. Allocate 96GB:
    o->via.array.ptr = (msgpack_object*)msgpack_zone_malloc(*u->z, size);
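
For reference, here is a minimal reproduction sketch using the public C API (the build setup is up to the reader; link against the msgpack C library). It feeds the five bytes above to msgpack_unpack_next:

    /* repro.c - minimal sketch of the allocation path described above. */
    #include <msgpack.h>
    #include <stdio.h>

    int main(void) {
        /* dd ff 00 00 00: array32 header claiming 0xff000000 elements. */
        const char data[] = {(char)0xdd, (char)0xff, 0x00, 0x00, 0x00};
        size_t off = 0;

        msgpack_unpacked result;
        msgpack_unpacked_init(&result);

        /* The call can only report that more input is needed, but by then the
         * parser has already attempted the ~96GB zone allocation (per the
         * call chain above), which is what the sanitizers catch. */
        msgpack_unpack_return rc = msgpack_unpack_next(&result, data, sizeof(data), &off);
        printf("unpack returned %d\n", (int)rc);

        msgpack_unpacked_destroy(&result);
        return 0;
    }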

In most situations, this is not a problem, because that memory isn't written to, so it's never mapped. However, if someone runs a networked program (e.g. a chat client using msgpack) under MemorySanitizer or AddressSanitizer, one of two things will happen:

  1. Under msan, the user will see ERROR: MemorySanitizer: requested allocation size 0x2000000008 exceeds maximum supported size of 0x200000000 and the program will abort.
  2. Under asan, the system will attempt to actually allocate 96GB of memory, because asan writes to all the allocated memory in __interceptor_malloc -> asan_malloc. If the user has swap enabled, this will soon cause thrashing, rendering the computer unusable. If not, the Linux OOM killer will come along and kill processes until more memory is available (hopefully it kills the chat client).

As an alternative, the code could allocate a smaller array at first and then exponentially grow it until it reaches the given size (similar to std::vector::push_back). If performance is a concern, you could limit this behaviour to arrays larger than some number (e.g. only for array_32s).

redboltz (Contributor) commented

For C++, you can limit the size as follows:
https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_unpacker#limit-size-of-elements

iphydf (Contributor, Author) commented Mar 24, 2022

We're using the C version of the library. Size limit is an option (if implemented in the C version), but it's a bit crude, because when you're actually reading large data, you may want to be able to read all of that. The problem here is that small data (5 bytes of input) can cause really large allocations (96 GB) in the parser. This makes a resource exhaustion or DoS attack very cheap. Size limits additionally restrict how much data a benign client can send. For the most part, it's reasonable to pass a 96GB input to the parser and expect it to cause a 96GB allocation. Passing such data is something the client code can easily control and decide not to process if the size is not reasonable for the application. The problem here is that the client cannot easily verify whether there's possibly going to be some problem with large allocations in the parser.

So, possible fixes:

  • Allocate increasingly large arrays instead of everything at once. This has some issues with the zone stuff in msgpack, so you'd probably want to determine whether that strategy will be chosen, and in that case use malloc() and free() for all but the last allocation to avoid O(n log n) space complexity of a parse. (A rough sketch follows this list.)
  • Don't parse objects at all, and instead expose unpacking primitives similar to the packing primitives so no allocation happens at all and the unpacking stack management is up to the client code (e.g. via C call stack or an explicit stack).
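
A rough sketch of the first option, assuming the parser grows a plain malloc'd buffer while elements arrive and only copies the finished array into the zone at the end (the growable_array type and helper names here are invented for illustration; this is not how msgpack-c currently works):

    #include <msgpack.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical growable array of msgpack_object, used while parsing. */
    typedef struct {
        msgpack_object* ptr;
        size_t size;     /* elements stored so far */
        size_t capacity; /* elements allocated */
    } growable_array;

    /* Append one element, doubling the capacity when it runs out. */
    int ga_push(growable_array* a, msgpack_object obj) {
        if (a->size == a->capacity) {
            size_t new_cap = a->capacity == 0 ? 8 : a->capacity * 2;
            msgpack_object* p = (msgpack_object*)realloc(a->ptr, new_cap * sizeof(*p));
            if (p == NULL) {
                return 0;
            }
            a->ptr = p;
            a->capacity = new_cap;
        }
        a->ptr[a->size++] = obj;
        return 1;
    }

    /* Copy the finished array into the zone and free the temporary buffer, so
     * only the final, correctly sized allocation lives in the zone. */
    msgpack_object* ga_finish_into_zone(growable_array* a, msgpack_zone* zone) {
        msgpack_object* out =
            (msgpack_object*)msgpack_zone_malloc(zone, a->size * sizeof(msgpack_object));
        if (out != NULL && a->size > 0) {
            memcpy(out, a->ptr, a->size * sizeof(msgpack_object));
        }
        free(a->ptr);
        a->ptr = NULL;
        a->size = a->capacity = 0;
        return out;
    }

With this approach the allocation is bounded by the number of elements actually parsed, so a 5-byte input can no longer trigger a 96GB request.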

redboltz (Contributor) commented

I'm not sure about the C version, but if a size limit is good enough, an MSGPACK_XXX_LIMIT macro would be acceptable. As you mentioned, though, it is a bit crude. The C++ version uses limits, and that has been good enough so far.

For the C version, a pull request is welcome, but you need to preserve the current behavior. A preprocessor macro that selects the behavior (disabled by default) is acceptable.

Allocate increasingly large arrays instead of everything at once.

I think the current C++ implementation does that using the zone and the unpacking buffer.
See https://github.com/msgpack/msgpack-c/blob/cpp_master/include/msgpack/v2/parse.hpp#L867-L884
The basic strategy is to allocate the current size * 2, which means "allocate increasingly large arrays".
I'm not sure, but I think the C version uses a similar approach.
It is a somewhat old presentation, but useful for understanding the unpacking logic: https://www.slideshare.net/taka111/msgpackc

iphydf (Contributor, Author) commented Mar 24, 2022

static_cast<msgpack::object*>(m_zone->allocate_align(size, MSGPACK_ZONE_ALIGNOF(msgpack::object)));

The C++ version allocates the whole array at once as well. The code you pointed to is the parse buffer (which is indeed managed with exponential growth), not the resulting object.

redboltz (Contributor) commented

static_cast<msgpack::object*>(m_zone->allocate_align(size, MSGPACK_ZONE_ALIGNOF(msgpack::object)));

The C++ version allocates the whole array at once as well. The code you pointed to is the parse buffer (which is indeed managed with exponential growth), not the resulting object.

Ah, I understand. Thank you for clarifying.


  • Don't parse objects at all, and instead expose unpacking primitives similar to the packing primitives so no allocation happens at all and the unpacking stack management is up to the client code (e.g. via C call stack or an explicit stack).

I think the C++ version (since the v2 API) has this. The C++ version has a parser and a create_object_visitor; they used to be combined as the unpacker, but they are now separated.
If a user wants a more advanced object-creation method, the user can write their own visitor that allocates ARRAY/MAP objects incrementally.
That is, the default create_object_visitor allocates ARRAY/MAP objects all at once, but the user can provide a customized visitor.
If an incrementally allocating ARRAY/MAP visitor is common enough, it could become part of the library.

For C, there is no such mechanism so far. I think adding that functionality is a reasonable way forward (a standalone illustration of the idea follows the links below).
I guess it is pretty difficult, though, because the C implementation is a little (?) complicated.

https://github.com/msgpack/msgpack-c/blob/c_master/include/msgpack/unpack.h
https://github.com/msgpack/msgpack-c/blob/c_master/include/msgpack/unpack_define.h
https://github.com/msgpack/msgpack-c/blob/c_master/include/msgpack/unpack_template.h
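
For illustration only, here is a small standalone sketch (not msgpack-c code; the helpers are invented for this example) of what caller-driven unpacking primitives would enable: the client reads the array32 header itself and can reject a declared element count that cannot possibly fit in the remaining input, before anything is allocated.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Read a big-endian 32-bit value. */
    static uint32_t load_be32(const unsigned char* buf) {
        return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
               ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    }

    /* Pull an array32 header (type byte 0xdd); returns 0 on short or
     * non-array32 input. Only this one MessagePack type is handled here. */
    static int pull_array32_header(const unsigned char* buf, size_t len,
                                   size_t* off, uint32_t* count) {
        if (len < *off + 5 || buf[*off] != 0xdd) return 0;
        *count = load_be32(buf + *off + 1);
        *off += 5;
        return 1;
    }

    int main(void) {
        /* The 5-byte input from the report: array32 claiming 0xff000000 elements. */
        const unsigned char data[] = {0xdd, 0xff, 0x00, 0x00, 0x00};
        size_t off = 0;
        uint32_t count = 0;

        if (pull_array32_header(data, sizeof(data), &off, &count) &&
            (size_t)count > sizeof(data) - off) {
            /* Every element needs at least one byte of input, so this count is
             * provably bogus; reject it before allocating anything. */
            printf("rejecting array that claims %u elements with %zu bytes left\n",
                   (unsigned)count, sizeof(data) - off);
            return 1;
        }
        return 0;
    }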


  • Allocate increasingly large arrays instead of everything at once. This has some issues with the zone stuff in msgpack, so you'd probably want to determine whether that strategy will be chosen, and in that case use malloc() and free() for all but the last allocation to avoid O(n log n) space complexity of a parse.

If the current implementation is preserved as the default behavior and the new incremental allocation strategy is provided as an option, that is good.
I guess one of the reasons for introducing the zone (memory pool) is to reduce the number of malloc/free calls.
(For C++, keeping the object's destructor trivial is a more important reason: a non-trivial, recursively called destructor cannot be inlined, and that is a big performance penalty. But that is a C++ story.)


Am I understanding your two options correctly?

I think separating parsing and object creation is better, but I guess it requires many code changes.
