ndctl keys not removed after ndctl sanitize-dimm nmem0 --overwrite #239

Open
yizhanglinux opened this issue May 4, 2023 · 18 comments

@yizhanglinux
Contributor

Hello,
After a sanitize-dimm --overwrite operation, the ndctl key still exists and is not removed. Is this by design? According to the man page, the key should be removed after the sanitize-dimm operation.

From man ndctl sanitize-dimm
Additionally, after completion of this command, the security and passphrase for the given NVDIMM will be disabled, and the passphrase and any key material will also be removed from the keyring and the ndctl keys directory at /etc/ndctl/keys
# ndctl setup-passphrase "$dev" -k user:"$masterkey"
passphrase enabled for 1 nmem.
# ndctl sanitize-dimm nmem0 --overwrite
overwrite issued for 1 nmem.
# ndctl list -Di
[
  {
    "dev":"nmem1",
    "id":"8089-a2-1833-00000510",
    "handle":257,
    "phys_id":32,
    "flag_failed_map":true,
    "security":"disabled"
  },
  {
    "dev":"nmem3",
    "id":"8089-a2-1833-00000497",
    "handle":4353,
    "phys_id":44,
    "security":"disabled"
  },
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"overwrite"
  },
  {
    "dev":"nmem2",
    "id":"8089-a2-1833-000004a9",
    "handle":4097,
    "phys_id":38,
    "security":"disabled"
  }
]
# ndctl wait-overwrite nmem0
# ls /etc/ndctl/keys/
keys.readme
nvdimm_8089-a2-1833-000004a3_intel-purley-04.khw1.lab.eng.bos.redhat.com.blob
nvdimm-master.blob
@yizhanglinux
Contributor Author

If I run sanitize-dimm without --overwrite, the key is removed.

# ndctl sanitize-dimm nmem0
sanitized 1 nmem.

# ls /etc/ndctl/keys/
keys.readme  nvdimm-master.blob
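
The check above can be scripted. This is only a minimal sketch: the path layout (nvdimm_<id>_<hostname>.blob under /etc/ndctl/keys) is inferred from the ls output in this report, and the function names key_blob_path and key_blob_present and the optional directory argument are hypothetical, not part of ndctl.

```shell
# Sketch: report whether a per-DIMM key blob is still present.
# The blob naming is inferred from the ls output in this issue;
# both helpers are hypothetical and are not ndctl commands.
key_blob_path() {
    dimm_id=$1                       # e.g. 8089-a2-1833-000004a3
    host=$2                          # hostname component of the blob name
    keys_dir=${3:-/etc/ndctl/keys}   # default keys directory from the man page
    printf '%s/nvdimm_%s_%s.blob\n' "$keys_dir" "$dimm_id" "$host"
}

key_blob_present() {
    # Succeeds (exit status 0) only if the blob file exists.
    [ -e "$(key_blob_path "$@")" ]
}
```

For example, `key_blob_present 8089-a2-1833-000004a3 "$(hostname)" && echo "key blob still present"`, assuming the hostname component of the blob name matches the output of `hostname` on the system.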

@davejiang
Collaborator

https://lore.kernel.org/nvdimm/168357518158.2750073.1393407560977941832.stgit@djiang5-mobl3/

Can you please try this fix and see if that does the job? Thanks!

@yizhanglinux
Contributor Author

Hi Dave

I tried your patch. After the ndctl sanitize-dimm nmem0 --overwrite operation [1], the overwrite was still issued to nmem0; I filed a separate issue for that [2]. The key was eventually removed, but the dimm nmem0 stays in the "unlocked" state [3] and security cannot be disabled on nmem0 [4]. It now seems I can do nothing to disable the security. :(

[1]

# ./ndctl sanitize-dimm nmem0 --overwrite
libndctl: ndctl_dimm_enable: nmem0: failed to enable
overwrite issued for 0 nmem.

[2]
#244

[3]

# ndctl  list -Di
[
  {
    "dev":"nmem1",
    "id":"8089-a2-1833-00000510",
    "handle":257,
    "phys_id":32,
    "flag_failed_map":true,
    "security":"disabled"
  },
  {
    "dev":"nmem3",
    "id":"8089-a2-1833-00000497",
    "handle":4353,
    "phys_id":44,
    "security":"disabled"
  },
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem2",
    "id":"8089-a2-1833-000004a9",
    "handle":4097,
    "phys_id":38,
    "security":"disabled"
  }
]

[4]

# ./ndctl remove-passphrase nmem0
failed to open file /etc/ndctl/keys/nvdimm_8089-a2-1833-000004a3_intel-purley-04.khw1.lab.eng.bos.redhat.com.blob: No such file or directory
Unable to load key
passphrase removed for 0 nmem.

# ./ndctl sanitize-dimm nmem0
failed to open file /etc/ndctl/keys/nvdimm_8089-a2-1833-000004a3_intel-purley-04.khw1.lab.eng.bos.redhat.com.blob: No such file or directory
Unable to load key
sanitized 0 nmem.
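
Since both commands in [4] fail only because the key blob can no longer be opened, one possible precaution is to copy the blob aside before issuing the overwrite. This is a sketch under assumptions: backup_key_blob and its arguments are hypothetical, ndctl provides no such command, and whether a restored blob is enough for remove-passphrase to succeed has not been verified here.

```shell
# Sketch: copy a key blob aside before a destructive operation.
# backup_key_blob is a hypothetical helper, not an ndctl command.
backup_key_blob() {
    blob=$1         # full path to the key blob file
    backup_dir=$2   # directory that receives the copy
    [ -f "$blob" ] || { echo "no key blob at $blob" >&2; return 1; }
    # Key material is sensitive: keep the backup directory private.
    mkdir -p "$backup_dir" && chmod 700 "$backup_dir" && cp -p "$blob" "$backup_dir/"
}
```

It would be called with the blob path from the error message above and a backup directory of the user's choosing, before running sanitize-dimm --overwrite.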

@davejiang
Collaborator

Do you have CONFIG_NVDIMM_SECURITY_TEST=y in your kernel config? I talked to Vishal and he said it works for him. The only thing I can think of right now is that you don't have that config on and it doesn't do the extra poll to update the security state when using ndtest and therefore it remains in "locked" state.

@yizhanglinux
Contributor Author

yizhanglinux commented May 9, 2023

Yes, CONFIG_NVDIMM_SECURITY_TEST is enabled.
The dimm nmem0 I used is real NVDIMM hardware. I also loaded nfit_test (modprobe nfit_test) and ran the same test on nmem4; it shows the same behavior.

# cat .config | grep CONFIG_NVDIMM_SECURITY_TEST
CONFIG_NVDIMM_SECURITY_TEST=y

# ./ndctl list -Di
[
 
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem4",
    "id":"cdab-0a-07e0-ffffffff",
    "handle":0,
    "phys_id":0,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem6",
    "id":"cdab-0a-07e0-fffeffff",
    "handle":256,
    "phys_id":2,
    "security":"disabled"
  }
]

@davejiang
Collaborator

So issue 239, where the key blob isn't removed after overwrite, is addressed, correct? The remaining issue is 244, where overwrite is issued anyway even though there's an error of some sort?

@yizhanglinux
Contributor Author

Yes, with your patch the key was eventually removed.
But security is still enabled (the state is "unlocked") and cannot be disabled now. It would be better to fix that first (disable the security), or other users may run into the same situation.

  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },

For #244, maybe we just need to fix the output. :)

@yizhanglinux
Contributor Author

BTW, since security on my dimm nmem0 is enabled (state "unlocked") and there is no key now, do you know how to disable the security without the key?

@davejiang
Collaborator

Did you call ndctl wait-overwrite nmem0 to wait for overwrite completion first before checking the state?

@yizhanglinux
Contributor Author

Yes, I already called that cmd.

# ndctl wait-overwrite nmem0
# ndctl wait-overwrite nmem4
# ndctl list -Di
[
  {
    "dev":"nmem1",
    "id":"8089-a2-1833-00000510",
    "handle":257,
    "phys_id":32,
    "flag_failed_map":true,
    "security":"disabled"
  },
  {
    "dev":"nmem3",
    "id":"8089-a2-1833-00000497",
    "handle":4353,
    "phys_id":44,
    "security":"disabled"
  },
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem2",
    "id":"8089-a2-1833-000004a9",
    "handle":4097,
    "phys_id":38,
    "security":"disabled"
  },
  {
    "dev":"nmem9",
    "id":"cdab-0a-07e0-fefffeff",
    "handle":65537,
    "phys_id":0,
    "flag_failed_map":true
  },
  {
    "dev":"nmem8",
    "id":"cdab-0a-07e0-fffffeff",
    "handle":65536,
    "phys_id":0,
    "flag_failed_save":true,
    "flag_failed_arm":true,
    "flag_failed_restore":true,
    "flag_failed_flush":true,
    "flag_smart_event":true
  },
  {
    "dev":"nmem5",
    "id":"cdab-0a-07e0-feffffff",
    "handle":1,
    "phys_id":1,
    "security":"disabled"
  },
  {
    "dev":"nmem7",
    "id":"cdab-0a-07e0-fefeffff",
    "handle":257,
    "phys_id":3,
    "security":"disabled"
  },
  {
    "dev":"nmem4",
    "id":"cdab-0a-07e0-ffffffff",
    "handle":0,
    "phys_id":0,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem6",
    "id":"cdab-0a-07e0-fffeffff",
    "handle":256,
    "phys_id":2,
    "security":"disabled"
  }
]
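
Picking one DIMM's state out of this listing can be scripted. The following is a rough sketch that assumes the one-key-per-line JSON layout shown above; dimm_security is a hypothetical helper, not an ndctl subcommand, and a real JSON parser would be more robust if one is available.

```shell
# Sketch: print the "security" value for one DIMM from `ndctl list -Di`
# output read on stdin. Assumes one JSON key per line, as in the listing above.
# dimm_security is a hypothetical helper, not part of ndctl.
dimm_security() {
    awk -v dev="$1" '
        # Remember which DIMM object we are inside.
        /"dev":/      { gsub(/[", \t]/, ""); split($0, kv, ":"); cur = kv[2] }
        # Print the security state only for the requested DIMM.
        /"security":/ { gsub(/[", \t]/, ""); split($0, kv, ":"); if (cur == dev) print kv[2] }
    '
}
```

For example, `ndctl wait-overwrite nmem0 && ndctl list -Di | dimm_security nmem0`. A DIMM without a "security" field, such as nmem9 above, produces no output.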

@davejiang
Collaborator

Looking at the DSM 1.8 spec, I'm starting to get the feeling that overwrite does not change the security state of being enabled. And that when I implemented overwrite, maybe there was a reason that the key blob was not removed. Sorry it's been a few years since I looked at this stuff. What happens if you reboot? Does it come back as locked? There may be a way to recover via BIOS reset of the DIMM. Otherwise the DIMM may be unrecoverable. :( Do you still have the Intel contact that you guys originally got the DIMM from?

https://pmem.io/documents/NVDIMM_DSM_Interface-V1.8.pdf

@davejiang
Collaborator

Also, is this a Crow Pass on Sapphire Rapids or some other DIMM on a different platform? Trying to find some help internally....

@davejiang
Collaborator

If your BIOS has the feature:
Boot to the UEFI menu and enable Secure Erase Unit for the module(s)

  1. UEFI EDKII > Socket Configuration > Memory Configuration > PMem Configuration > PMem Secure Erase Unit
  2. Reset/reboot the system

Otherwise, we may need to investigate other means.

@yizhanglinux
Contributor Author

OK, so it is expected that the key is not removed by the "overwrite" operation. After a reboot, the DIMM came back as locked.

  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":"0x1",
    "phys_id":"0x1a",
    "state":"disabled",
    "security":"locked"
  },

I will check with our hw team if they can help me reset the DIMM.

@yizhanglinux
Contributor Author

It should be Intel Purley, Wolf Pass. I checked the BIOS and there is no such option. :(

@davejiang
Collaborator

Can you open up an IPS case so Intel can track it? We can look into how to get that DIMM serviced.

@yizhanglinux
Contributor Author

I've asked our HW team to do that, thanks for the help.

@davejiang
Collaborator

Thanks! Sorry about the troubles.
