Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools/cephfs: recover alternate_name of dentries from journal #55792

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

batrick
Copy link
Member

@batrick batrick commented Feb 27, 2024

Right now the cephfs-journal-tool always uses the legacy encoding for dentries which will drop the alternate_name if ever set.

Fixes: https://tracker.ceph.com/issues/64602

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@batrick
Copy link
Member Author

batrick commented Feb 27, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

@vshankar vshankar requested a review from a team February 28, 2024 13:47
@vshankar
Copy link
Contributor

Right now the cephfs-journal-tool always uses the legacy encoding for dentries which will drop the alternate_name if ever set.

Good catch!

@batrick
Copy link
Member Author

batrick commented Feb 28, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

@vshankar @lxbsz is the fs:fscrypt suite stable enough for adding a test?

encode('I', dentry_bl);
encode('i', dentry_bl);
ENCODE_START(2, 1, dentry_bl);
encode(fb.alternate_name, dentry_bl);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really the encode order, where Inode has alternate_name at the front and Link has it at the back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both cases alternate_name is a piece of dentry metadata. As a primary link, ino/d_type are inferred from the inode that's embedded in the dentry. It doesn't really matter in any case (as inode encoding was also made versioned [1] where before it was not [2]) but made sense to everyone at the time.

[1] https://github.com/ceph/ceph/pull/37297/files#diff-c0b626494569135ea1406cfad7ab9929a4df0a1f7d880405b5d2e734db812a03R2207-R2241
[2] https://github.com/ceph/ceph/pull/37297/files#diff-c0b626494569135ea1406cfad7ab9929a4df0a1f7d880405b5d2e734db812a03L2285-L2309

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct.

@chrisphoffman
Copy link
Contributor

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

I think asok could work in this case. How much encoding validation do we currently do?

Per this here [1] , this field is currently used for supporting long file names on clients who don't know about fscrypt. I proposed a custom client to validate security posture in [2]. Perhaps we add additional tests to this custom client, to further test when that becomes available.

1: https://lwn.net/Articles/843302/
2: #54725 (comment)

@batrick
Copy link
Member Author

batrick commented Mar 15, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

I think asok could work in this case. How much encoding validation do we currently do?

I'd really rather not create an asok command just for synthetic manipulation of alternate_name of a dentry, if possible.

Per this here [1] , this field is currently used for supporting long file names on clients who don't know about fscrypt. I proposed a custom client to validate security posture in [2]. Perhaps we add additional tests to this custom client, to further test when that becomes available.

1: https://lwn.net/Articles/843302/ 2: #54725 (comment)

No actually alternate_name is only used for the encrypted file name if it's too long for a regular file name. It has a very narrow use-case (atm, samba is thinking about using it for caps-insensitive file names).

@lxbsz
Copy link
Member

lxbsz commented Mar 18, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

@vshankar @lxbsz is the fs:fscrypt suite stable enough for adding a test?

@batrick Sorry for late reply and I think missed this.

Yeah it is. The #54725 is the last piece of the puzzle. But won't affect this tests IMO.

@batrick batrick force-pushed the i64602 branch 2 times, most recently from 5c49e33 to 5aa7484 Compare May 15, 2024 02:39
@github-actions github-actions bot added the tests label May 15, 2024
@batrick batrick force-pushed the i64602 branch 7 times, most recently from 69a8de6 to 4d9eb97 Compare May 16, 2024 22:05
@batrick
Copy link
Member Author

batrick commented May 17, 2024

Example failure with the new test:

2024-05-16T22:34:21.781 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
...
[
  {
    "accounted_rstat": {
      "rbytes": 4096,
      "rctime": "2024-05-16T22:34:15.251864+0000",
      "rfiles": 1,
      "rsnaps": 0,
      "rsubdirs": 1,
      "version": 0
    },
    "atime": "2024-05-16T22:34:14.498880+0000",
    "auth_pins": 0,
    "auth_state": {
      "replicas": {}
    },
    "authlock": {},
    "backtrace_version": 14,
    "btime": "2024-05-16T22:34:14.498880+0000",
    "change_attr": 3,
    "client_caps": [
      {
        "client_id": 5184,
        "issued": "pAsLsXsFsx",
        "last_sent": 4,
        "pending": "pAsLsXsFsx",
        "wanted": "pAsLsXsFsxcral"
      }
    ],
    "client_ranges": [],
    "ctime": "2024-05-16T22:34:15.249864+0000",
    "damage_flags": 0,
    "dir_layout": {
      "dir_hash": 2,
      "unused1": 0,
      "unused2": 0,
      "unused3": 0
    },
    "dirfrags": [
      {
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "committed_version": "0",
        "committing_version": "0",
        "dentries": [
          {
            "alternate_name": "bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5/YV7DtpeNPZnorcTqxPM/hExtWHSS4P+S+Dpwj62hMyh/77sGhiW1Filvv1gQjV+sN/GozPNwHgfleadkUs1OkRkYtgWrCjbKP0MayRtiOLrVTRuYyOp/Qt3+XCIyiS87B9bUcOFjWratF+yR0kpJ0RYriix7NKVkBJ0kGWYSCY+PYjiLeMYJBMQcCxW/nwfVku+m6fgFJvb6pjEFxIk9zT5cunSImsjr",
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "inode": 1099511628283,
            "is_auth": true,
            "is_freezing": false,
            "is_frozen": false,
            "is_new": false,
            "is_null": false,
            "is_primary": true,
            "is_remote": false,
            "lock": {},
            "nref": 2,
            "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",
...
# fail + journal recovery
2024-05-16T22:35:31.077 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
...
[
  {
    "accounted_rstat": {
      "rbytes": 4096,
      "rctime": "2024-05-16T22:34:15.251864+0000",
      "rfiles": 1,
      "rsnaps": 0,
      "rsubdirs": 1,
      "version": 0
    },
    "atime": "2024-05-16T22:34:14.498880+0000",
    "auth_pins": 0,
    "auth_state": {
      "replicas": {}
    },
    "authlock": {},
    "backtrace_version": 14,
    "btime": "2024-05-16T22:34:14.498880+0000",
    "change_attr": 3,
    "client_caps": [],
    "client_ranges": [],
    "ctime": "2024-05-16T22:34:15.249864+0000",
    "damage_flags": 0,
    "dir_layout": {
      "dir_hash": 2,
      "unused1": 0,
      "unused2": 0,
      "unused3": 0
    },
    "dirfrags": [
      {
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "committed_version": "5",
        "committing_version": "5",
        "dentries": [
          {
            "alternate_name": "",
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "inode": 1099511628283,
            "is_auth": true,
            "is_freezing": false,
            "is_frozen": false,
            "is_new": false,
            "is_null": false,
            "is_primary": true,
            "is_remote": false,
            "lock": {},
            "nref": 2,
            "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",

/teuthology/pdonnell-2024-05-16_22:05:38-fs:fscrypt-wip-pdonnell-testing-20240515.023933-debug-distro-default-smithi/7709888/teuthology.log

I'll polish this and cleanup the commits.

@batrick batrick force-pushed the i64602 branch 2 times, most recently from 816c2ae to 7cc6bdc Compare May 17, 2024 17:00
@batrick batrick requested a review from vshankar May 17, 2024 17:01
@batrick
Copy link
Member Author

batrick commented May 17, 2024

@lxbsz @vshankar PTAL

@batrick
Copy link
Member Author

batrick commented May 17, 2024

src/tools/cephfs/JournalTool.cc Show resolved Hide resolved

self.fs.fail()

self.fs.journal_tool(['event', 'recover_dentries', 'list'], 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running this patch on my local dev machine. It gets to this line and fails during decode(). Have you run this on teuthology? Can you share a log for me to look at?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the example without the fix here:

#55792 (comment)

but I don't have a link on hand with this fix.

Are you running an old version of the journal-tool perhaps?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've ran it via vstart_runner within branch: i64602.

Also ran the command manually:
./bin/cephfs-journal-tool --debug-mds=20 --debug-ms=1 --debug-objecter=1 --rank cephfs:0 event recover_dentries list

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll run this through QA and see what we find.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yes, there is a decode error: https://pulpito.ceph.com/pdonnell-2024-05-21_20:54:38-fs:fscrypt-wip-pdonnell-testing-20240521.172753-debug-distro-default-smithi/

It doesn't make much sense to me. I'll reproduce locally and gdb it.

@batrick batrick force-pushed the i64602 branch 2 times, most recently from 0b35eab to 85a19d4 Compare May 21, 2024 13:05
@batrick
Copy link
Member Author

batrick commented May 21, 2024

jenkins test make check arm64

@batrick
Copy link
Member Author

batrick commented May 21, 2024

This PR is under test in https://tracker.ceph.com/issues/66167.

rpm:
- fscrypt
deb:
- fscrypt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I remembered in some old distros the fscrypt package couldn't be found and installed. But for the latest distros we use in qa should be fine.

Let's just keep this in mind.

@batrick batrick marked this pull request as draft May 22, 2024 15:26
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Test without the fix:

    2024-05-16T22:34:21.781 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
    ...
    [
      {
        "accounted_rstat": {
          "rbytes": 4096,
          "rctime": "2024-05-16T22:34:15.251864+0000",
          "rfiles": 1,
          "rsnaps": 0,
          "rsubdirs": 1,
          "version": 0
        },
        "atime": "2024-05-16T22:34:14.498880+0000",
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "authlock": {},
        "backtrace_version": 14,
        "btime": "2024-05-16T22:34:14.498880+0000",
        "change_attr": 3,
        "client_caps": [
          {
            "client_id": 5184,
            "issued": "pAsLsXsFsx",
            "last_sent": 4,
            "pending": "pAsLsXsFsx",
            "wanted": "pAsLsXsFsxcral"
          }
        ],
        "client_ranges": [],
        "ctime": "2024-05-16T22:34:15.249864+0000",
        "damage_flags": 0,
        "dir_layout": {
          "dir_hash": 2,
          "unused1": 0,
          "unused2": 0,
          "unused3": 0
        },
        "dirfrags": [
          {
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "committed_version": "0",
            "committing_version": "0",
            "dentries": [
              {
                "alternate_name": "bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5/YV7DtpeNPZnorcTqxPM/hExtWHSS4P+S+Dpwj62hMyh/77sGhiW1Filvv1gQjV+sN/GozPNwHgfleadkUs1OkRkYtgWrCjbKP0MayRtiOLrVTRuYyOp/Qt3+XCIyiS87B9bUcOFjWratF+yR0kpJ0RYriix7NKVkBJ0kGWYSCY+PYjiLeMYJBMQcCxW/nwfVku+m6fgFJvb6pjEFxIk9zT5cunSImsjr",
                "auth_pins": 0,
                "auth_state": {
                  "replicas": {}
                },
                "inode": 1099511628283,
                "is_auth": true,
                "is_freezing": false,
                "is_frozen": false,
                "is_new": false,
                "is_null": false,
                "is_primary": true,
                "is_remote": false,
                "lock": {},
                "nref": 2,
                "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",
    ...
    # fail + journal recovery
    2024-05-16T22:35:31.077 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
    ...
    [
      {
        "accounted_rstat": {
          "rbytes": 4096,
          "rctime": "2024-05-16T22:34:15.251864+0000",
          "rfiles": 1,
          "rsnaps": 0,
          "rsubdirs": 1,
          "version": 0
        },
        "atime": "2024-05-16T22:34:14.498880+0000",
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "authlock": {},
        "backtrace_version": 14,
        "btime": "2024-05-16T22:34:14.498880+0000",
        "change_attr": 3,
        "client_caps": [],
        "client_ranges": [],
        "ctime": "2024-05-16T22:34:15.249864+0000",
        "damage_flags": 0,
        "dir_layout": {
          "dir_hash": 2,
          "unused1": 0,
          "unused2": 0,
          "unused3": 0
        },
        "dirfrags": [
          {
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "committed_version": "5",
            "committing_version": "5",
            "dentries": [
              {
                "alternate_name": "",
                "auth_pins": 0,
                "auth_state": {
                  "replicas": {}
                },
                "inode": 1099511628283,
                "is_auth": true,
                "is_freezing": false,
                "is_frozen": false,
                "is_new": false,
                "is_null": false,
                "is_primary": true,
                "is_remote": false,
                "lock": {},
                "nref": 2,
                "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Right now the cephfs-journal-tool always uses the legacy encoding for dentries
which will drop the alternate_name if ever set.

Fixes: 39f3440
Fixes: https://tracker.ceph.com/issues/64602
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cephfs Ceph File System tests tools
Projects
None yet
5 participants