Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory.cc: fix cross-shard shrinking realloc #2217

Closed

Conversation

travisdowns
Copy link
Contributor

Cross-shard shrinking realloc crashes because we assert that the incoming pointer is shard local inside the shrink method but there is no particular reason to assume this is the case with a realloc: the initial allocation may have been made on another shard.

Fix this by falling through to the free/malloc/memcpy path. This also means that realloc using the same size is a way to "rehome" a possibly foreign pointer: this does nothing if the pointer is already local but it will allocate a local pointer and copy the allocation contents if not.

Fixes #2202.

@@ -2286,8 +2286,6 @@ void* realloc(void* ptr, size_t size) {
abort();
}
// if we're here, it's a non-null seastar memory ptr
// or original functions aren't available.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the comment was obsolete after some changes to the logic above which means we don't reach here in the "original functions not available" case.

do_xshard_realloc(cross_shard, 0, 1);
do_xshard_realloc(cross_shard, 1, 0);
do_xshard_realloc(cross_shard, 50, 100);
do_xshard_realloc(cross_shard, 100, 50);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case (among others) failed with an assertion failure before the fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: In such cases I like to add a code comment (not a github comment) saying something like:

// reproduces issue #2202:

It can be useful to know why a test exists, and also if it fails on some old branch or something, you will know which fix you forgot to include.

@tchaikov tchaikov self-requested a review May 5, 2024 12:50
Copy link
Contributor

@xemul xemul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'd appreciate one (or more) more pair of eyes on it

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments.

@@ -19,11 +19,13 @@
* Copyright (C) 2015 Cloudius Systems, Ltd.
*/

#include "seastar/core/shard_id.hh"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be more consistent, could you please use brackets here and probably order it alphabetically. so it like:

#include <seastar/core/memory.hh>
#include <seastar/core/shard_id.hh>
#include <seastar/core/smp.hh>
#include <seastar/core/temporary_buffer.hh>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time for clang-format in seastar, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in b7e6a4d. In my defense this was auto-added by my IDE which I guess doesn't understand such concerns :).

Copy link
Contributor

@tchaikov tchaikov May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time for clang-format in seastar, WDYT?

yeah, actually, i am all for it. (just not sure about the rigid yet merciful maintainers). if every goes well, i would like to use clang-tidy to order the #includes , and clang-include-cleaner to keep the #include in a good shape.

// Tests that realloc seems to do the right thing with various sizes of
// buffer, including cases where the initial allocation is on another
// shard.
// Needs at least 2 shards to usefull test the cross-shard aspect but
Copy link
Contributor

@tchaikov tchaikov May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you help me understand what "usefull test" means? is it a typo? or it's just me who have trouble understanding this. if "usefull" is removed, i would be able to comprehend this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be usefully, LMK if it makes sense and I will fix it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed to usefully in b7e6a4d, let me know if it's clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

// shard.
// Needs at least 2 shards to usefull test the cross-shard aspect but
// still passes when only 1 shard is used.
auto do_xshard_realloc = [=](bool cross_shard, size_t initial_size, size_t realloc_size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to capture with =. actually, nothing gets captured.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed in push b7e6a4d

@@ -155,6 +157,59 @@ SEASTAR_TEST_CASE(test_memory_diagnostics) {
return make_ready_future<>();
}

SEASTAR_THREAD_TEST_CASE(test_cross_thread_realloc) {
Copy link
Contributor

@tchaikov tchaikov May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the purpose of this test is to verify the behavior of seastar allocator, not the one provided by libc, so probably we should guard it like

#ifndef SEASTAR_DEFAULT_ALLOCATOR

SEASTAR_THREAD_TEST_CASE(test_cross_thread_realloc) {
   // ...
}
#endif

i.e., to move the #ifndef SEASTAR_DEFAULT_ALLOCATOR at the end of this test up before it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deliberately left in both modes, because it works in both modes. What's the harm? Leaving it in both modes has some advantages two major ones which are:

  • the test will run in ASAN etc which is only available in the non-ss-allocator modes, to catch bugs in it
  • be able to catch invalid assumptions even in debug mode (which is nice for dev for several reasons)

I have been mostly under the assumption that we should gate tests with SEASTAR_DEFAULT_ALLOCATOR only if they would fail or fail to compile in the default allocator mode.

Copy link
Contributor

@tchaikov tchaikov May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Makes sense. As we also have tests testing std::string when testing seastar::sstring for the similar reasons. So I don't insist.

Cross-shard shrinking realloc crashes because we assert that the
incoming pointer is shard local inside the shrink method but there is
no particular reason to assume this is the case with a realloc: the
initial allocation may have been made on another shard.

Fix this by falling through to the free/malloc/memcpy path. This also
means that realloc using the same size is a way to "rehome" a possibly
foreign pointer: this does nothing if the pointer is already local but
it will allocate a local pointer and copy the allocation contents if
not.

Fixes scylladb#2202.
Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@tchaikov
Copy link
Contributor

@scylladb/seastar-maint hello maintainers, could you help merge this change?

do_xshard_realloc(cross_shard, 0, 1);
do_xshard_realloc(cross_shard, 1, 0);
do_xshard_realloc(cross_shard, 50, 100);
do_xshard_realloc(cross_shard, 100, 50);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: In such cases I like to add a code comment (not a github comment) saying something like:

// reproduces issue #2202:

It can be useful to know why a test exists, and also if it fails on some old branch or something, you will know which fix you forgot to include.

@nyh nyh closed this in 2bc4f22 Jun 2, 2024
@travisdowns
Copy link
Contributor Author

@nyh wrote:

Nitpick: In such cases I like to add a code comment (not a github comment) saying something like:

Good point, thanks for the feedback and I'll try to incorporate this in the future!

@travisdowns travisdowns deleted the td-scylla-xshard-realloc branch June 3, 2024 02:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cross-shard realloc shrink crashes
4 participants