-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rgw: Close datalog write hole #56499
base: main
Are you sure you want to change the base?
rgw: Close datalog write hole #56499
Conversation
fd2f1bc
to
ade3a6f
Compare
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
eb6c991
to
b1570cb
Compare
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
87c1016
to
4ea1c77
Compare
4ea1c77
to
4deb05c
Compare
600c530
to
95455d5
Compare
95455d5
to
2710fb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet, but sharing initial review up to commit 'common/async: Non-blocking condition variable'
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more review, haven't gotten to the recovery bits yet
876025d
to
c8b3e9f
Compare
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Since reading CLS values in a friendly way requires calling out to RADOS then decoding the returned structure, make a function to help with that. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
We should not be using std::list everywhere, and this is an excellent time to switch. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
This stores a collection of key-based semaphores, intended to serve as a backend to the project to close the datalog write-hole. Signed-off-by: Adam Emerson <aemerson@redhat.com>
Signed-off-by: Adam Emerson <aemerson@redhat.com>
Signed-off-by: Adam Emerson <aemerson@redhat.com>
Since we're plumbing the driver around and it's where our neorados handle lives. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Rewrite all of `RGWDataChangesLog` and supporting classes to use non-blocking, C++20 coroutines. Make interfaces for `optional_yield` and `librados::AioCompletion`. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam Emerson <aemerson@redhat.com>
Except in one particular case. This is necessary to close the data replication write hole, so that if we fail to write to the semaphore set, the client will receive an error. Signed-off-by: Adam Emerson <aemerson@redhat.com>
So we can reconstitute bucket shards from listings during recovery. Signed-off-by: Adam Emerson <aemerson@redhat.com>
Increment in add_entry, decrement in renew_entry, and recover on startup. Signed-off-by: Adam Emerson <aemerson@redhat.com>
Signed-off-by: Adam Emerson <aemerson@redhat.com>
Signed-off-by: Adam Emerson <aemerson@redhat.com>
Also tests renew_entry, backend, and list. Signed-off-by: Adam Emerson <aemerson@redhat.com>
Since, outside of testing, it's only called from stackful coroutines, for now. Signed-off-by: Adam Emerson <aemerson@redhat.com>
This works fine on GCC11, which is the actual target we use for builds on Shaman. Signed-off-by: Adam Emerson <aemerson@redhat.com>
Process shards piecewise. Send each set of shards in a window to the other RGWs and have them acknowledge having it or not. Signed-off-by: Adam Emerson <aemerson@redhat.com>
6b3a956
to
28ae31c
Compare
add_executable(unittest_async_max_concurrent_for_each test_async_max_concurrent_for_each.cc) | ||
add_ceph_unittest(unittest_async_max_concurrent_for_each) | ||
target_link_libraries(unittest_async_max_concurrent_for_each ceph-common Boost::system Boost::context) | ||
|
||
if(NOT WIN32) | ||
add_executable(unittest_async_co_throttle test_async_co_throttle.cc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned that we may hit the seg faults again later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cbodley Could you take a look at this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we prevent windows from running this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch already skips the test. My concern was that we may see the segmentation faults on Windows if this async coroutine throttling code gets used by librados, librbd or libcephfs.
I'm wondering if it's caused by the llvm version that we're using.
jenkins test make check |
jenkins test make check arm64 |
jenkins test api |
A specter is haunting RGW.
The specter of objects not syncing if RGW crashes between adding them to to the set of bucketshards to be refreshed and actually writing the entry into FIFO.
This PR will close that write hole.
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
x
between the brackets:[x]
. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e