Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mds: set the correct WRLOCK flag always in wrlock_force() #57085

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lxbsz
Copy link
Member

@lxbsz lxbsz commented Apr 25, 2024

The wrlock is not like the xlock, which needs to be acquired in
the CInode's auth always, and it is based on the CDir's auths instead.

When a remote_wrlock is acquired and the local MDS will add a lock
item and marks it as REMOTE_WRLOCK. And later when the local MDS try
to force wrlock in the emplace_lock() will just return the existing
lock item without updating the WRLOCK flag. So when cleaning the
requests later it will just release the remote locks and then removes
lock items directly, which will miss releasing the local wrlock
reference.

Fixes: https://tracker.ceph.com/issues/65630

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@lxbsz lxbsz requested a review from a team April 25, 2024 04:07
@github-actions github-actions bot added the cephfs Ceph File System label Apr 25, 2024
@@ -1832,7 +1832,8 @@ void Locker::wrlock_force(SimpleLock *lock, MutationRef& mut)
dout(7) << "wrlock_force on " << *lock
<< " on " << *lock->get_parent() << dendl;
lock->get_wrlock(true);
mut->emplace_lock(lock, MutationImpl::LockOp::WRLOCK);
auto it = mut->emplace_lock(lock, MutationImpl::LockOp::WRLOCK);
it->flags |= MutationImpl::LockOp::WRLOCK; // may already remote_wrlocked
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at Server::rdlock_two_paths_xlock_destdn and found it strange that this method is grabbing a remote_wrlock at all on an inode the MDS may be auth for. It's checking if the directory auth is different (as in the case in the bug) but then grabs a remote_wrlock on the inode's nestlock -- but it is auth for the inode. The check seems odd but maybe that's just how it's supposed to work; it's been that way for a while. I don't get it though.

Copy link
Member Author

@lxbsz lxbsz Apr 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory is fragmented and the fragmented CDirs' auths could be different with the CInode's. In Server::rdlock_two_paths_xlock_destdn() it will try to grab remote_wrlock for the CDir's in the remote auth MDS. While the remote MDS isn't the auth of the CInode.

While the local wrlock is for the CInode's auth in local MDS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory is fragmented and the fragmented CDirs' auths could be different with the CInode's.

Correct.

In Server::rdlock_two_paths_xlock_destdn() it will try to grab remote_wrlock for the CDir's in the remote auth MDS. While the remote MDS isn't the auth of the CInode.

That's the the thing though: a CDir has no locks. Those "remote wrlocks" are actually for the inode which the MDS may be auth for. So why would it request a remote_wrlock on a lock with a parent (CInode) that it is auth for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is beyond my knowledge about this. What I get from the code before is that the remote_wrlock will always in the CInode's auth. I need to check the history of the CDir case to get why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was first introduced by:

commit d72bdab7d7c74148cbb119a0e7c1eb14a0432d46
Author: Sage Weil <sweil@redhat.com>
Date:   Fri Jul 8 09:32:04 2011 -0700

    mds: take a remote_wrlock on srcdir for cross-mds rename
    
    This ensures that we hold a wrlock on the srcdn auth when the slave
    makes it's changes to the src directory, and prevents us from corrupting
    the scatterlock state.
    
    Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

And by going through the code the wrlock is not like the xlock, which need to do it in the CInode's auth always. While the wrlock is based on the CDir's auths instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good. Please update the code with a comment explanation Xiubo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@lxbsz
Copy link
Member Author

lxbsz commented Apr 30, 2024

jenkins retest this please

The wrlock is not like the xlock, which needs to be acquired in
the CInode's auth always, and it is based on the CDir's auths instead.

When a remote_wrlock is acquired and the local MDS will add a lock
item and marks it as REMOTE_WRLOCK. And later when the local MDS try
to force wrlock in the emplace_lock() will just return the existing
lock item without updating the WRLOCK flag. So when cleaning the
requests later it will just release the remote locks and then removes
lock items directly, which will miss releasing the local wrlock
reference.

Fixes: https://tracker.ceph.com/issues/65630
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cephfs Ceph File System
Projects
None yet
3 participants