Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PAL/Linux-SGX] Re-init attestation key after AESM restarted #1881

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dimakuv
Copy link
Contributor

@dimakuv dimakuv commented May 13, 2024

Description of the changes

Gramine uses the AESM daemon to request SGX Quotes from the Quoting Enclave. AESM runs in the background and may be restarted. For some reason, after a restart AESM "forgets" the platform's attestation key and requires an additional INIT_QUOTE_REQUEST before it is able to generate SGX Quotes again. Moreover, the Quoting Enclave's Target Info may change at this point, so Gramine must update its cached QE Target Info. This behavior is applicable only to DCAP attestation and is documented in Intel SGX PSW/DCAP software. Gramine didn't implement this behavior and thus failed with "AESM service returned error 42".

This PR complies with the SGX PSW/DCAP documentation and adds the send of INIT_QUOTE_REQUEST + update of QE Target Info + retry of SGX quote retrieval.

This is a re-creation of #1877, with a better design. Closes #1877.

How to test this PR?

Change CI-Examples/python/scripts/sgx-quote.py to this:

import os
import sys
import time

while True:
    if not os.path.exists("/dev/attestation/quote"):
        print("Cannot find `/dev/attestation/quote`; "
                "are you running under SGX, with remote attestation enabled?")
        sys.exit(1)

    with open('/dev/attestation/attestation_type') as f:
        print(f"Detected attestation type: {f.read()}")

    with open("/dev/attestation/user_report_data", "wb") as f:
        f.write(b'\0'*64)

    with open("/dev/attestation/quote", "rb") as f:
        quote = f.read()

    print(f"Extracted SGX quote with size = {len(quote)} and the following fields:")
    print(f"  ATTRIBUTES.FLAGS: {quote[96:104].hex()}  [ Debug bit: {quote[96] & 2 > 0} ]")
    print(f"  ATTRIBUTES.XFRM:  {quote[104:112].hex()}")
    print(f"  MRENCLAVE:        {quote[112:144].hex()}")
    print(f"  MRSIGNER:         {quote[176:208].hex()}")
    print(f"  ISVPRODID:        {quote[304:306].hex()}")
    print(f"  ISVSVN:           {quote[306:308].hex()}")
    print(f"  REPORTDATA:       {quote[368:400].hex()}")
    print(f"                    {quote[400:432].hex()}")

    sys.stdout.flush()
    time.sleep(5)

Apply the above Python script, run it in one terminal, and in another terminal do sudo systemctl restart aesmd. Without this PR, Gramine will fail. With this PR, Gramine will continue execution of the endless loop.


This change is Reviewable

Gramine uses the AESM daemon to request SGX Quotes from the Quoting
Enclave. AESM runs in the background and may be restarted. For some
reason, after a restart AESM "forgets" the platform's attestation key
and requires an additional INIT_QUOTE_REQUEST before it is able to
generate SGX Quotes again. Moreover, the Quoting Enclave's Target Info
may change at this point, so Gramine must update its cached QE Target
Info. This behavior is applicable only to DCAP attestation and is
documented in Intel SGX PSW/DCAP software. Gramine didn't implement this
behavior and thus failed with "AESM service returned error 42".

This commit complies with the SGX PSW/DCAP documentation and adds the
send of INIT_QUOTE_REQUEST + update of QE Target Info + retry of SGX
quote retrieval.

Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Copy link
Contributor

@llly llly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)


-- commits line 2 at r1:

Suggestion:

Re-init attestation key after an AESM restart

-- commits line 15 at r1:
ditto later in this sentence

Suggestion:

sending

pal/src/host/linux-sgx/enclave_ocalls.h line 136 at r1 (raw file):

 *
 * \param      is_epid            True if EPID is used, false if DCAP/ECDSA is used.
 * \param[out] qe_targetinfo      Retrieved Quoting Enclave's target info; buffer must be submitted

I'd drop this note altogether, isn't it obvious? It's a single-level pointer, so there's no way for the function to allocate a buffer by itself and return it.

Suggestion:

buffer must be provided

pal/src/host/linux-sgx/enclave_ocalls.h line 141 at r1 (raw file):

 * \returns 0 on success, negative Linux error code otherwise.
 *
 * The obtained QE Target Info is not validated in any way (i.e., this function does not check

Suggestion:

this function does not check whether

pal/src/host/linux-sgx/enclave_ocalls.h line 144 at r1 (raw file):

 * Target Info contents make sense).
 */
int ocall_get_qe_targetinfo(bool is_epid, sgx_target_info_t* qe_targetinfo);

Double checking if I understand the security model here correctly: qe_targetinfo is completely untrusted, and thus we can't assume that we're talking directly to a real QE (we can be talking to a fake one, or be MitM-ed).
This is not a problem, because only the authentic QE has access to the keys to generate a valid quote (so, the fake QE can't), and our report targets whatever we got in qe_targetinfo and the MitMer can't re-encrypt it and then send it to a real QE (due to how EREPORT works).
Is this right?


pal/src/host/linux-sgx/enclave_ocalls.c line 2199 at r1 (raw file):

    if (!ocall_qe_ti_args) {
        sgx_reset_ustack(old_ustack);
        return -ENOMEM;

Almost all other places return -EPERM in this case.


pal/src/host/linux-sgx/enclave_platform.c line 8 at r1 (raw file):

/* this function assumes proper synchronization by callers, such that
 * g_pal_linuxsgx_state.qe_targetinfo is not racey */

This should be documented in the header, not here. Also, it's not only about calling this function multiple times concurrently, but also about anything else which could be reading g_pal_linuxsgx_state.qe_targetinfo at the same time.


pal/src/host/linux-sgx/enclave_platform.c line 8 at r1 (raw file):

/* this function assumes proper synchronization by callers, such that
 * g_pal_linuxsgx_state.qe_targetinfo is not racey */

Do we actually have that synchronization in place? Also, it's quite a surprising/risky interface. Why not just use a lock?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants