[Pal/Linux-SGX] DONTMERGE: Initial EDMM dynamic heap implementation #2190

vijaydhanraj · 2021-02-24T23:37:07Z

Description of the changes

Naive EDMM heap implementation (experimental). This is based on zhtlancer@4f2cfa0 and #234.

How to test this PR?

Add sgx.edmm_enable_heap = 1 to the mainfest file and run the simple helloworld program from regression tests.

This change is

yamahata

Reviewed 13 of 13 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @vijaydhanraj)

Pal/src/host/Linux-SGX/enclave_pages.c, line 313 at r1 (raw file):

out:
    _DkInternalUnlock(&g_heap_vma_lock);
    if (g_pal_sec.edmm_enable_heap && new_allocation && ret) {

new_ allocation doesn’t seem correct when addr is specified.
when addr isn’t specified, it’s size.

Pal/src/host/Linux-SGX/enclave_pages.c, line 401 at r1 (raw file):

    _DkInternalUnlock(&g_heap_vma_lock);
    if (ret >=0 && g_pal_sec.edmm_enable_heap) {
        if (free_edmm_page_range(addr, size) < 0)

This doesn’t seem correct as the region can be scattered.
or is it idenpotent?

Pal/src/host/Linux-SGX/sgx_main.c, line 460 at r1 (raw file):

        /* skip adding free (heap) pages to the enclave */
        if (enclave->pal_sec.edmm_enable_heap && !strcmp(areas[i].desc, "free")) {
            goto skip_add;

Please don’t use unnecessary goto. just negate if condition.

dimakuv

Reviewed 13 of 13 files at r1.
Reviewable status: all files reviewed, 48 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @vijaydhanraj and @yamahata)

a discussion (no related file):
It looks like we have a big problem with multi-threading here... I actually don't know how to enable EDMM correctly in case of multi-threaded memory allocations. @vijaydhanraj Could you research this area a bit? Surely other memory allocators found good way of dealing with it.

I can imagine two ways:

Each thread has its own sub-region of enclave memory. Looks brittle (we don't have too much enclave memory).
All threads issue requests to some helper thread that performs all allocations (I think that's what SGX driver does). But this leads to a problem of a new helper thread...

Looks like a tough problem to solve.

Documentation/manifest-syntax.rst, line 379 at r1 (raw file):

This syntax enables EDMM dynamic heap feature available as part of Intel SGX2
capable hardware. By default the feature is disabled but when enabled, EPC

No reason to state again that the feature is disabled by default. Simply remove this part and start with When enabled, EPC....

Documentation/manifest-syntax.rst, line 380 at r1 (raw file):

This syntax enables EDMM dynamic heap feature available as part of Intel SGX2
capable hardware. By default the feature is disabled but when enabled, EPC
pages are not added when creating the enclave but allocated when an unavailable

unavailable sounds weird. Maybe allocated dynamically using EACCEPT when Graphene requests more heap memory. The EACCEPT triggers....

Documentation/manifest-syntax.rst, line 382 at r1 (raw file):

pages are not added when creating the enclave but allocated when an unavailable
page is EACCEPT'ed. This triggers a page fault (#PF) which is handled by the
Intel SGX driver (legacy driver) which EAUG's the page and is then EACCEPT'ed

...which EAUG's the page and returns the control back to the in-Graphene enclave,
which in turn continues from the same EACCEPT instruction (but this time this
instruction succeeds). This does help...

Documentation/manifest-syntax.rst, line 383 at r1 (raw file):

page is EACCEPT'ed. This triggers a page fault (#PF) which is handled by the
Intel SGX driver (legacy driver) which EAUG's the page and is then EACCEPT'ed
by the enclave. This does help reduce the loading time of a large enclave

help to reduce

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 14 at r1 (raw file):

#include "sgx_attest.h"

#include "gsgx.h"

Please remove this include from everywhere. See my other comment below on this.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 121 at r1 (raw file):

                    const sgx_quote_nonce_t* nonce, char** quote, size_t* quote_len);

int ocall_trim_epc_pages(struct sgx_range* rg);

What is struct sgx_range? I guess it is defined in sgx.h of the legacy OOT driver?

I don't like the idea of using gsgx.h (that embeds sgx.h) for this. Could you copy-paste this struct inside sgx_arch.h (or some similar SGX-specific header in Graphene) and use this header?

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 121 at r1 (raw file):

                    const sgx_quote_nonce_t* nonce, char** quote, size_t* quote_len);

int ocall_trim_epc_pages(struct sgx_range* rg);

Also, what is rg? Let's use a proper name for it, e.g., enclave_pages_range or epc_pages_range.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 123 at r1 (raw file):

int ocall_trim_epc_pages(struct sgx_range* rg);

int ocall_notify_accept(struct sgx_range* rg);

Please add a newline here.

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1708 at r1 (raw file):

int ocall_trim_epc_pages(struct sgx_range* rg) {
    int retval = 0;
    struct sgx_range* ms;

I am a bit torn here. Re-using struct sgx_range is a smart optimization, but I would still prefer to follow our convention in this file to declare OCALL-specific structs. So please add a new struct ms_ocall_sgx_range_t (with the same fields as sgx_range) and use it here internally.

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1710 at r1 (raw file):

    struct sgx_range* ms;

    void *old_ustack = sgx_prepare_ustack();

* is wrongly aligned

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1730 at r1 (raw file):

int ocall_notify_accept(struct sgx_range* rg) {
    int retval = 0;
    struct sgx_range* ms;

ditto

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1732 at r1 (raw file):

    struct sgx_range* ms;

    void *old_ustack = sgx_prepare_ustack();

ditto

Pal/src/host/Linux-SGX/enclave_pages.c, line 10 at r1 (raw file):

#include "pal_security.h"

#include "gsgx.h"

remove

Pal/src/host/Linux-SGX/enclave_pages.c, line 86 at r1 (raw file):

}

#define SE_DECLSPEC_ALIGN(x) __attribute__((aligned(x)))

You seem to need this because EACCEPT requires proper (64B) alignment. But I don't like that you're declaring a macro here. Let's just re-use already defined __sgx_mem_aligned (512 alignment). It's a bit of an overkill, but it's a more readable approach. And we use it everywhere else (see e.g. enclave_platform.c).

Pal/src/host/Linux-SGX/enclave_pages.c, line 87 at r1 (raw file):

#define SE_DECLSPEC_ALIGN(x) __attribute__((aligned(x)))
static int free_edmm_page_range(void *start, size_t size)

Please add some comments to this function describing the flows. You can even draw an ASCII diagram if you want to :)

Pal/src/host/Linux-SGX/enclave_pages.c, line 88 at r1 (raw file):

#define SE_DECLSPEC_ALIGN(x) __attribute__((aligned(x)))
static int free_edmm_page_range(void *start, size_t size)
{

Please move to previous line

Pal/src/host/Linux-SGX/enclave_pages.c, line 89 at r1 (raw file):

static int free_edmm_page_range(void *start, size_t size)
{
    void *addr = (void *)((uintptr_t)start & (~(PRESET_PAGESIZE-1)));

void *addr -> void* addr
(void *) -> (void*)

We have a special macro to align to the beginning of page: ALLOC_ALIGN_DOWN_PTR

Pal/src/host/Linux-SGX/enclave_pages.c, line 90 at r1 (raw file):

{
    void *addr = (void *)((uintptr_t)start & (~(PRESET_PAGESIZE-1)));
    void *end = addr + size;

Here and everywhere: fix alignment of *

Pal/src/host/Linux-SGX/enclave_pages.c, line 99 at r1 (raw file):

    memset(&si.reserved, 0, sizeof(si.reserved));

    rg.start_addr = (unsigned long) addr;

No space between case and variable

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

    rg.start_addr = (unsigned long) addr;
    rg.nr_pages = size / PRESET_PAGESIZE;

Please don't use PRESET_PAGESIZE. We should use g_pal_state.alloc_align.

Pal/src/host/Linux-SGX/enclave_pages.c, line 102 at r1 (raw file):

    rg.nr_pages = size / PRESET_PAGESIZE;
    ret = ocall_trim_epc_pages(&rg);
    if (ret) {

Please replace all such occurrences with ret < 0

Pal/src/host/Linux-SGX/enclave_pages.c, line 104 at r1 (raw file):

    if (ret) {
        log_debug("EPC trim page on [%p, %p) failed (%d)\n", addr, end, ret);
        return -1;

Why -1? Simply return ret;

Pal/src/host/Linux-SGX/enclave_pages.c, line 111 at r1 (raw file):

        if (ret) {
            log_debug("EDMM accept page failed while trimming: %p %d\n", (void *)addr, ret);
            return -1;

Why -1? Simply return ret;

Pal/src/host/Linux-SGX/enclave_pages.c, line 118 at r1 (raw file):

    if (ret) {
        log_debug("EPC notify_accept on [%p, %p) failed (%d)\n", addr, end, ret);
        return -1;

Why -1? Simply return ret;

Pal/src/host/Linux-SGX/enclave_pages.c, line 125 at r1 (raw file):

static int get_edmm_page_range(void *start, size_t size, bool executable)
{

ditto

Pal/src/host/Linux-SGX/enclave_pages.c, line 130 at r1 (raw file):

    log_debug("%s: start = %lx, size = %lx, is_executable = %s\n", __func__, start, size,
               (executable) ? "TRUE" : "FALSE");
    SE_DECLSPEC_ALIGN(sizeof(sgx_arch_sec_info_t)) sgx_arch_sec_info_t si;

ditto

Pal/src/host/Linux-SGX/enclave_pages.c, line 136 at r1 (raw file):

    memset(&si.reserved, 0, sizeof(si.reserved));

    SE_DECLSPEC_ALIGN(sizeof(sgx_arch_sec_info_t)) sgx_arch_sec_info_t smi = si;

Please add a comment explaining why you need both si and smi.

Pal/src/host/Linux-SGX/enclave_pages.c, line 146 at r1 (raw file):

        if (ret) {
            log_debug("EDMM accept page failed: %p %d\n", (void *)addr, ret);
            return -1;

return ret

Pal/src/host/Linux-SGX/enclave_pages.c, line 149 at r1 (raw file):

        }

        /* EMODPE doesn't return any value */

This comment alone is not enough to understand what you're doing. Please expand.

Pal/src/host/Linux-SGX/enclave_pages.c, line 246 at r1 (raw file):

    assert(vma->top - vma->bottom >= (ptrdiff_t)freed);
    size_t allocated = vma->top - vma->bottom - freed;
    *new_allocation = allocated;

You seem to be using new_allocation as boolean, why did you declare it as size_t? Please make it boolean. Also, maybe add a comment why this variable is needed?

Pal/src/host/Linux-SGX/enclave_pages.c, line 313 at r1 (raw file):

out:
    _DkInternalUnlock(&g_heap_vma_lock);
    if (g_pal_sec.edmm_enable_heap && new_allocation && ret) {

Please make ret more explicit: ret != NULL (otherwise I thought initially that ret means "error code").

Pal/src/host/Linux-SGX/enclave_pages.c, line 313 at r1 (raw file):

out:
    _DkInternalUnlock(&g_heap_vma_lock);
    if (g_pal_sec.edmm_enable_heap && new_allocation && ret) {

Can you actually do it outside of the lock? Isn't it possible that while you're here, some other enclave thread decided to free this memory region, and you have a data race?

We need to think how to fix this multi-threading issue. I don't want to simply move this new logic under the g_heap_vma_lock because this logic goes out-and-in the enclave, so everything will become extremely slow. I can't come up with something good right now. Maybe @yamahata and @boryspoplawski and @mkow ?

Pal/src/host/Linux-SGX/enclave_pages.c, line 314 at r1 (raw file):

    _DkInternalUnlock(&g_heap_vma_lock);
    if (g_pal_sec.edmm_enable_heap && new_allocation && ret) {
        if (get_edmm_page_range(ret, size, 1) < 0)

I think this is wrong. It is possible that the [ret, size) memory region was partly allocated before. So you'll end up with EACCEPT on the same page. Or am I wrong?

Pal/src/host/Linux-SGX/enclave_pages.c, line 315 at r1 (raw file):

    if (g_pal_sec.edmm_enable_heap && new_allocation && ret) {
        if (get_edmm_page_range(ret, size, 1) < 0)
            ret = NULL;

Please add some comments. Also, could you end up in "half-allocated" state here?

Pal/src/host/Linux-SGX/enclave_pages.c, line 401 at r1 (raw file):

Previously, yamahata wrote…

This doesn’t seem correct as the region can be scattered.
or is it idenpotent?

Isaku is right. This is not correct.

Pal/src/host/Linux-SGX/pal_security.h, line 22 at r1 (raw file):

    sgx_measurement_t mr_signer;
    sgx_attributes_t  enclave_attributes;
    bool edmm_enable_heap;

Please move after heap_min, heap_max.

Pal/src/host/Linux-SGX/sgx_enclave.c, line 31 at r1 (raw file):

#include "sigset.h"

#include "gsgx.h"

Please remove this (and declare all the needed constants in sgx_arch.h or smth)

Pal/src/host/Linux-SGX/sgx_enclave.c, line 686 at r1 (raw file):

}

extern int g_isgx_device;

Please move inside the function, right before the statement where you use it.

Pal/src/host/Linux-SGX/sgx_enclave.c, line 688 at r1 (raw file):

extern int g_isgx_device;
static long sgx_ocall_trim_epc_pages(void *pms) {
    struct sgx_range *rg = (struct sgx_range *)pms;

All * alignments are wrong here :)

Pal/src/host/Linux-SGX/sgx_enclave.c, line 689 at r1 (raw file):

static long sgx_ocall_trim_epc_pages(void *pms) {
    struct sgx_range *rg = (struct sgx_range *)pms;
    ODEBUG(OCALL_EVENTFD, rg);

EVENTFD?

Pal/src/host/Linux-SGX/sgx_enclave.c, line 696 at r1 (raw file):

static long sgx_ocall_notify_accept(void *pms) {
    struct sgx_range *rg = (struct sgx_range *)pms;
    ODEBUG(OCALL_EVENTFD, rg);

EVENTFD?

Pal/src/host/Linux-SGX/sgx_framework.c, line 141 at r1 (raw file):

    }
#endif

Please keep this empty line

python/graphenelibos/sgx_sign.py, line 594 at r1 (raw file):

                              desc, flags)
        else:
            if edmm_enable_heap == 1 and (area.desc == "free"):

No need for parenthesis

python/graphenelibos/sgx_sign.py, line 595 at r1 (raw file):

        else:
            if edmm_enable_heap == 1 and (area.desc == "free"):
                continue

Please add a comment that you want to skip EADDing of heap ("free") pages when EDMM is enabled.

python/graphenelibos/sgx_sign.py, line 766 at r1 (raw file):

    print('    misc_select: %08x' % int.from_bytes(attr['misc_select'], byteorder='big'))
    print('    date:        %d-%02d-%02d' % (attr['year'], attr['month'], attr['day']))
    print("    edmm_heap:   %d" % (attr['edmm_enable_heap']))

Please use the same name edmm_enable_heap, to avoid confusion (and add more spaces everywhere here, for correct indentation)

mkow

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @vijaydhanraj, and @yamahata)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

It looks like we have a big problem with multi-threading here... I actually don't know how to enable EDMM correctly in case of multi-threaded memory allocations. @vijaydhanraj Could you research this area a bit? Surely other memory allocators found good way of dealing with it.

I can imagine two ways:

Each thread has its own sub-region of enclave memory. Looks brittle (we don't have too much enclave memory).

All threads issue requests to some helper thread that performs all allocations (I think that's what SGX driver does). But this leads to a problem of a new helper thread...

Looks like a tough problem to solve.

I didn't read all the code yet, but I think I get the issue (if not, please correct me).

The first solution doesn't scale, I don't think it's acceptable.
The second sounds a bit PITA.

What about having a global (thread safe) "registry" of pages which have pending EDMM changes? If a thread would want to modify something which is already waiting for a modification issued by another thread then it would need to wait on this registry and continue once this particular page is changeable again.

Documentation/manifest-syntax.rst, line 379 at r1 (raw file):

Intel SGX2 capable hardware

Please link to sgx-intro.html#term-sgx2.

Documentation/manifest-syntax.rst, line 381 at r1 (raw file):

capable hardware. By default the feature is disabled but when enabled, EPC
pages are not added when creating the enclave but allocated when an unavailable
page is EACCEPT'ed. This triggers a page fault (#PF) which is handled by the

EACCEPT'ed -> EACCEPTed or EACCEPT-ed (ditto for other places)

Documentation/manifest-syntax.rst, line 385 at r1 (raw file):

This does help reduce the loading time of a large enclave application but [...]

I think the biggest advantage of EDMM is that you pay the price only for the pages you actually used. So, you can say e.g. that your enclave will have 128 GB, even if you won't use most of it, i.e. you don't have to tailor the size precisely for each workload.

Pal/src/host/Linux-SGX/enclave_pages.c, line 136 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment explaining why you need both si and smi.

Actually, what this smi name means? Maybe you could use a bit longer, more descriptive names?

dimakuv

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @vijaydhanraj, and @yamahata)

a discussion (no related file):

What about having a global (thread safe) "registry" of pages which have pending EDMM changes? If a thread would want to modify something which is already waiting for a modification issued by another thread then it would need to wait on this registry and continue once this particular page is changeable again.

This sounds like we'll need futexes inside the enclave's memory allocator (in enclave_pages.c)? Looks like a PITA itself. Also, why would this be better than simply moving the EACCEPT logic under the g_heap_vma_lock global spinlock? (Rather than spinlock vs futex argument, but futexes in the enclave is also a bad idea).

dimakuv

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Documentation/manifest-syntax.rst, line 385 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This does help reduce the loading time of a large enclave application but [...]
I think the biggest advantage of EDMM is that you pay the price only for the pages you actually used. So, you can say e.g. that your enclave will have 128 GB, even if you won't use most of it, i.e. you don't have to tailor the size precisely for each workload.

Agree with Michal, we should also highlight this.

mkow

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

a discussion (no related file):

sounds like we'll need futexes inside the enclave's memory allocator (in enclave_pages.c)?

Why is this a problem? Our memory management requires locks anyways.

Also, why would this be better than simply moving the EACCEPT logic under the g_heap_vma_lock global spinlock? (Rather than spinlock vs futex argument, but futexes in the enclave is also a bad idea).

Sounds like a potential bottleneck for a case where a lot of threads are allocating memory?

dimakuv

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

a discussion (no related file):

Why is this a problem? Our memory management requires locks anyways.

We currently use spinlocks everywhere in the lowest level of trusted PAL. The only place where futexes are used is the Exitless feature. My point is that I'm not sure our code is ready for futexes, and it will be pretty complex to implement correctly and catch all the segfault bugs due to data races. But I don't know, probably there is no other way but futexes.

Sounds like a potential bottleneck for a case where a lot of threads are allocating memory?

Yes, this could be a perf issue. On the other hand, the current implementation of EDMM is already slow as hell. And I don't think that real-world applications (and memory managers) frequently do this "a lot of threads allocating memory" pattern. My assumption would be that apps typically pre-allocate a bunch of memory regions (one for each thread), and then draw from this pre-allocated pool.

mkow

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

a discussion (no related file):

My assumption would be that apps typically pre-allocate a bunch of memory regions (one for each thread)

Yeah, but don't they do this at the same time, usually?

vijaydhanraj

Reviewable status: all files reviewed, 51 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, and @yamahata)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

My assumption would be that apps typically pre-allocate a bunch of memory regions (one for each thread)

Yeah, but don't they do this at the same time, usually?

For initial analysis, I have moved the EACCEPT and EREMOVE logic under the g_heap_vma_lock lock temporarily. Looking into other memory allocator designs and will address this issue before the actual PR

Documentation/manifest-syntax.rst, line 379 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

No reason to state again that the feature is disabled by default. Simply remove this part and start with When enabled, EPC....

Done.

Documentation/manifest-syntax.rst, line 379 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Intel SGX2 capable hardware
Please link to sgx-intro.html#term-sgx2.

Done.

Documentation/manifest-syntax.rst, line 380 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

unavailable sounds weird. Maybe allocated dynamically using EACCEPT when Graphene requests more heap memory. The EACCEPT triggers....

Done.

Documentation/manifest-syntax.rst, line 381 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

EACCEPT'ed -> EACCEPTed or EACCEPT-ed (ditto for other places)

Reworded so this doesn't apply anymore but will follow the convention when using at other places.

Documentation/manifest-syntax.rst, line 382 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

...which EAUG's the page and returns the control back to the in-Graphene enclave,
which in turn continues from the same EACCEPT instruction (but this time this
instruction succeeds). This does help...

Done, just removed in-Graphene as we simply use enclave elsewhere in the text.

Documentation/manifest-syntax.rst, line 383 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

help to reduce

Done.

Documentation/manifest-syntax.rst, line 385 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Agree with Michal, we should also highlight this.

yes added this info as well.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 14 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please remove this include from everywhere. See my other comment below on this.

Yes removed from enclave_ocalls.h and enclave_pages.c but need need this header file in sgx_enclave.c as we call the trim/notify_accept IOCTLs here.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 121 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What is struct sgx_range? I guess it is defined in sgx.h of the legacy OOT driver?

I don't like the idea of using gsgx.h (that embeds sgx.h) for this. Could you copy-paste this struct inside sgx_arch.h (or some similar SGX-specific header in Graphene) and use this header?

struct sgx_range is a sgx2 specific struct to provide a range of addresses to trim/remove. Yes, this is defined in sgx_user.h struct of the legacy OOT driver.

Based on @dimakuv's comment, I have created an OCALL-specific struct ms_ocall_sgx_range_twhich removes the need to duplicate struct sgx_range and also can remove the gsgx.h file.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 121 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Also, what is rg? Let's use a proper name for it, e.g., enclave_pages_range or epc_pages_range.

rg was short for range. But this is removed after introducing OCALL-specific struct ms_ocall_sgx_range_t

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 123 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a newline here.

Done.

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1708 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I am a bit torn here. Re-using struct sgx_range is a smart optimization, but I would still prefer to follow our convention in this file to declare OCALL-specific structs. So please add a new struct ms_ocall_sgx_range_t (with the same fields as sgx_range) and use it here internally.

yes, added OCALL-specific struct ms_ocall_sgx_range_t

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1710 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

* is wrongly aligned

Done.

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1730 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Done.

Pal/src/host/Linux-SGX/enclave_ocalls.c, line 1732 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 10 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

remove

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 86 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You seem to need this because EACCEPT requires proper (64B) alignment. But I don't like that you're declaring a macro here. Let's just re-use already defined __sgx_mem_aligned (512 alignment). It's a bit of an overkill, but it's a more readable approach. And we use it everywhere else (see e.g. enclave_platform.c).

Removed #define SE_DECLSPEC_ALIGN(x) __attribute__((aligned(x))) and used __sgx_mem_aligned. But why do we want to use a 512B alignment when 64B is the requirement?

Pal/src/host/Linux-SGX/enclave_pages.c, line 87 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add some comments to this function describing the flows. You can even draw an ASCII diagram if you want to :)

Added a detailed explanation of the function.

Pal/src/host/Linux-SGX/enclave_pages.c, line 88 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please move to previous line

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 89 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

void *addr -> void* addr
(void *) -> (void*)

We have a special macro to align to the beginning of page: ALLOC_ALIGN_DOWN_PTR

Thanks, used this macro and also took care of the * misalignment.

Pal/src/host/Linux-SGX/enclave_pages.c, line 90 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Here and everywhere: fix alignment of *

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 99 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

No space between case and variable

This code is removed but will keep an eye out for other typecasting variables.

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please don't use PRESET_PAGESIZE. We should use g_pal_state.alloc_align.

Done, but this name g_pal_state.alloc_align is a little unclear.

Pal/src/host/Linux-SGX/enclave_pages.c, line 102 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please replace all such occurrences with ret < 0

That will not be correct as sgx_accept returns a positive value as an error code. For example, it can return SGX_PAGE_ATTRIBUTES_MISMATCH which is 19. So unified all returns to -1.

But now I have just changed ocall_* functions to return the value returned and check for ret < 0

Pal/src/host/Linux-SGX/enclave_pages.c, line 104 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why -1? Simply return ret;

please refer to the earlier comment.

Pal/src/host/Linux-SGX/enclave_pages.c, line 111 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why -1? Simply return ret;

ditto

Pal/src/host/Linux-SGX/enclave_pages.c, line 118 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why -1? Simply return ret;

ditto

Pal/src/host/Linux-SGX/enclave_pages.c, line 125 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 130 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 136 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Actually, what this smi name means? Maybe you could use a bit longer, more descriptive names?

Renamed si and smi to secinfo and secinfo_extend. Original intention was si meant secinfo struct and smi meant secinfo modified/extended :)

using two variables as we don't want to add SGX_SECINFO_FLAGS_X and clear it in the next iteration for sgx_accept. Do we want a comment here for this? I have added comments explaining other things which will make these declarations clear.

Pal/src/host/Linux-SGX/enclave_pages.c, line 146 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

return ret

return from sgx_accept is a positive number, so setting this explicitly to -1

Pal/src/host/Linux-SGX/enclave_pages.c, line 149 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This comment alone is not enough to understand what you're doing. Please expand.

yes, have explained in detail. Hope this helps.

Pal/src/host/Linux-SGX/enclave_pages.c, line 246 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You seem to be using new_allocation as boolean, why did you declare it as size_t? Please make it boolean. Also, maybe add a comment why this variable is needed?

The logic around new_allocation is modified to take into account free spaces while between VMAs while merging vma_above. Now this variable is not used.

But I have added comments here for the new logic.

Pal/src/host/Linux-SGX/enclave_pages.c, line 313 at r1 (raw file):

Previously, yamahata wrote…

new_ allocation doesn’t seem correct when addr is specified.
when addr isn’t specified, it’s size.

@yamahata the logic around new_allocation is modified to take into account free spaces between VMAs while merging vma_above. Please take a look and see if this addresses your concern.

Pal/src/host/Linux-SGX/enclave_pages.c, line 313 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please make ret more explicit: ret != NULL (otherwise I thought initially that ret means "error code").

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 313 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you actually do it outside of the lock? Isn't it possible that while you're here, some other enclave thread decided to free this memory region, and you have a data race?

We need to think how to fix this multi-threading issue. I don't want to simply move this new logic under the g_heap_vma_lock because this logic goes out-and-in the enclave, so everything will become extremely slow. I can't come up with something good right now. Maybe @yamahata and @boryspoplawski and @mkow ?

yes, I have moved the allocation and deallocation inside the g_heap_vma_lock temporarily. Looking at other memory allocator designs to see how to address the bottleneck for multi-threaded scenarios.

Pal/src/host/Linux-SGX/enclave_pages.c, line 314 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I think this is wrong. It is possible that the [ret, size) memory region was partly allocated before. So you'll end up with EACCEPT on the same page. Or am I wrong?

yes pls see the earlier comment.

Pal/src/host/Linux-SGX/enclave_pages.c, line 315 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add some comments. Also, could you end up in "half-allocated" state here?

yes, this could happen when we new addr + size space multiple VMAs having free spaces between them. I have addressed this by tracking free spaces between VMAs while walking the g_heap_vma_list and only call EACCEPT on those ranges.

Pal/src/host/Linux-SGX/enclave_pages.c, line 401 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Isaku is right. This is not correct.

At least on my test runs with entire LibOS regression tests, I didn't encounter non-continuous regions but I have implemented a simple logic that tracks the free ranges while parsing the VMAs and frees them.

Pal/src/host/Linux-SGX/pal_security.h, line 22 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please move after heap_min, heap_max.

Done

Pal/src/host/Linux-SGX/sgx_enclave.c, line 31 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please remove this (and declare all the needed constants in sgx_arch.h or smth)

Moving this to sgx_arch.h is little tricky because of redeclaration conflicts that arise from pal\src\host\Linux-SGX\generated-offsets.c file. Here we have both sgx_*.h files as part of pal_linux.h as well as gsgx.h.

The reason we have it here is because of IOCTL calls to the sgx driver very similar to how we have this header in sgx_framework.c

Pal/src/host/Linux-SGX/sgx_enclave.c, line 686 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please move inside the function, right before the statement where you use it.

Done.

Pal/src/host/Linux-SGX/sgx_enclave.c, line 688 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

All * alignments are wrong here :)

Done.

Pal/src/host/Linux-SGX/sgx_enclave.c, line 689 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

EVENTFD?

Copy-paste error, fixed it.

Pal/src/host/Linux-SGX/sgx_enclave.c, line 696 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

EVENTFD?

Done.

Pal/src/host/Linux-SGX/sgx_framework.c, line 141 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please keep this empty line

Done.

Pal/src/host/Linux-SGX/sgx_main.c, line 460 at r1 (raw file):

Previously, yamahata wrote…

Please don’t use unnecessary goto. just negate if condition.

Not sure but I don't think a simple negation would work here as we only need to skip adding pages only when EDMM is active and the pages are free. But I have removed goto by using an if and else statement.

python/graphenelibos/sgx_sign.py, line 594 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

No need for parenthesis

Done.

python/graphenelibos/sgx_sign.py, line 595 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment that you want to skip EADDing of heap ("free") pages when EDMM is enabled.

Done.

python/graphenelibos/sgx_sign.py, line 766 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please use the same name edmm_enable_heap, to avoid confusion (and add more spaces everywhere here, for correct indentation)

Done.

dimakuv

Reviewed 13 of 13 files at r2.
Reviewable status: all files reviewed, 43 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @mkow, @vijaydhanraj, and @yamahata)

a discussion (no related file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

For initial analysis, I have moved the EACCEPT and EREMOVE logic under the g_heap_vma_lock lock temporarily. Looking into other memory allocator designs and will address this issue before the actual PR

Sounds good, Vijay!

a discussion (no related file):

default default signal handler.

Typo in the commit message (two times default)

LibOS/shim/test/regression/mmap_file.c, line 1 at r2 (raw file):

#define _GNU_SOURCE

Borys would love this test :) I think we should rewrite this whole file (or burn it).

LibOS/shim/test/regression/test_libos.py, line 402 at r2 (raw file):

        'On SGX, SIGBUS isn\'t always implemented correctly, for lack '
        'of memory protection. For now, some of these cases won\'t work.')
    def test_051_mmap_sgx(self):

The name is weird now: you have _sgx in the name, but it is actually skipped on SGX. So maybe just keep the old name, _mmap?

Pal/src/host/Linux-SGX/db_main.c, line 713 at r2 (raw file):

        ocall_exit(1, true);
    }
    if (!g_pal_sec.edmm_enable_heap && preheat_enclave == 1) {

This is a good catch, but I would also print a log_warning in this case. Something like:

if (g_pal_sec.edmm_enable_heap && preheat_enclave) {
    log_warning("EDMM ('sgx.edmm_enable_heap') and preheat-enclave ('sgx.preheat_enclave') are both enabled"
           " but they cannot be combined. Graphene will use EDMM and disable preheat-enclave now.");
}

Something like this...

Pal/src/host/Linux-SGX/enclave_pages.c, line 86 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Removed #define SE_DECLSPEC_ALIGN(x) __attribute__((aligned(x))) and used __sgx_mem_aligned. But why do we want to use a 512B alignment when 64B is the requirement?

For simplicity. But sure, you can add a new macro __sgx_64byte_aligned or something like this. Feel free to add it if you feel this is important.

I just didn't like the declaration of the macro in this file.

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done, but this name g_pal_state.alloc_align is a little unclear.

@mkow knows better why we have such a name. I think this has to do something with Windows.

Pal/src/host/Linux-SGX/enclave_pages.c, line 102 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

That will not be correct as sgx_accept returns a positive value as an error code. For example, it can return SGX_PAGE_ATTRIBUTES_MISMATCH which is 19. So unified all returns to -1.

But now I have just changed ocall_* functions to return the value returned and check for ret < 0

Thanks. I didn't realize that.

Pal/src/host/Linux-SGX/enclave_pages.c, line 136 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Renamed si and smi to secinfo and secinfo_extend. Original intention was si meant secinfo struct and smi meant secinfo modified/extended :)

using two variables as we don't want to add SGX_SECINFO_FLAGS_X and clear it in the next iteration for sgx_accept. Do we want a comment here for this? I have added comments explaining other things which will make these declarations clear.

Thanks, I think everything is clear now.

Pal/src/host/Linux-SGX/enclave_pages.c, line 91 at r2 (raw file):

}

/* This function trims an EPC page on enclave's request. The sequence is as below,

Why do you say an EPC page? You can submit several pages here. Please fix the description.

Pal/src/host/Linux-SGX/enclave_pages.c, line 93 at r2 (raw file):

/* This function trims an EPC page on enclave's request. The sequence is as below,
 * 1. Enclave calls SGX driver IOCTL to change the page's type to PT_TRIM.
 * 2. In turn driver invokes ETRACK to track page's address on all processors and issues IPI to flush

In turn is not needed here, just remove it.

Also, I would prefer to replace processors with CPUs

Pal/src/host/Linux-SGX/enclave_pages.c, line 96 at r2 (raw file):

 * stale TLB entries.
 * 3. Enclave issues an EACCEPT to accept changes to the EPC page.
 * 4. Notifies the driver to remove EPC page which issues EREMOVE inst to complete the request. */

Enclave notifies...

Pal/src/host/Linux-SGX/enclave_pages.c, line 96 at r2 (raw file):

 * stale TLB entries.
 * 3. Enclave issues an EACCEPT to accept changes to the EPC page.
 * 4. Notifies the driver to remove EPC page which issues EREMOVE inst to complete the request. */

I would separate driver's EREMOVE in a separate item:

4. Enclave notifies the driver to remove the EPC page (using an IOCTL).
5. Driver issues EREMOVE to complete the request (maybe lazily).

Pal/src/host/Linux-SGX/enclave_pages.c, line 107 at r2 (raw file):

    memset(&secinfo.reserved, 0, sizeof(secinfo.reserved));

    unsigned int nr_pages = size / g_pal_state.alloc_align;

Do we guarantee that size is a multiple of alloc_align? For example, could this function receive size = 8? Then we'll have nr_pages = 0 and the logic breaks...

I'm fine with an assert(size % g_pal_state.alloc_align == 0). But maybe you want to ALLOC_ALIGN_UP(size). I don't know which one is a better fit here.

Pal/src/host/Linux-SGX/enclave_pages.c, line 114 at r2 (raw file):

    }

    for (void* page_addr = addr; page_addr < end; page_addr += g_pal_state.alloc_align) {

void* -> char* (it is technically disallowed to do pointer arithmetic on void* so we should always use char* for "modifiable" pointers)

Pal/src/host/Linux-SGX/enclave_pages.c, line 124 at r2 (raw file):

    ret = ocall_notify_accept(addr, nr_pages);
    if (ret < 0) {
        log_debug("EPC notify_accept on [%p, %p), %d pages failed (%d)\n", addr, end, nr_pages, ret);

Please unify the message format -- in trim_epc_pages above you don't have this %d pages.

Pal/src/host/Linux-SGX/enclave_pages.c, line 131 at r2 (raw file):

}

/* This function allocates a new page at an address in ELRANGE of an enclave. If the page contains

Why a new page? We may submit several pages to this function. Please fix the description.

Pal/src/host/Linux-SGX/enclave_pages.c, line 133 at r2 (raw file):

/* This function allocates a new page at an address in ELRANGE of an enclave. If the page contains
 * executable code, the page permissions are extended once the page is in a valid state. The
 * allocation sequence is described below

Add : at the end

Pal/src/host/Linux-SGX/enclave_pages.c, line 134 at r2 (raw file):

 * executable code, the page permissions are extended once the page is in a valid state. The
 * allocation sequence is described below
 * 1. Enclave invokes EACCEPT on the new page request which triggers a page fault(#PF) as the page

page fault(#PF) -> page fault (#PF)

Pal/src/host/Linux-SGX/enclave_pages.c, line 136 at r2 (raw file):

 * 1. Enclave invokes EACCEPT on the new page request which triggers a page fault(#PF) as the page
 * is not available yet.
 * 2. Driver catches this #PF and issues EAUG for the page. The control returns back to enclave.

EAUG for the page (at this point the page becomes VALID and may be used by the enclave).

Pal/src/host/Linux-SGX/enclave_pages.c, line 140 at r2 (raw file):

static int get_edmm_page_range(void* start, size_t size, bool executable) {
    uintptr_t lo = (uintptr_t)start;
    uintptr_t addr = lo + size;

Do we guarantee that size is a multiple of alloc_align? For example, could this function receive size = 8? Then we'll have weird arithmetic onaddr and the logic breaks...

I'm fine with an assert(size % g_pal_state.alloc_align == 0). But maybe you want to ALLOC_ALIGN_UP(size). I don't know which one is a better fit here.

Actually, the same for start.

Pal/src/host/Linux-SGX/enclave_pages.c, line 140 at r2 (raw file):

static int get_edmm_page_range(void* start, size_t size, bool executable) {
    uintptr_t lo = (uintptr_t)start;
    uintptr_t addr = lo + size;

Why do you use uintptr_t in this function, but you used void*/char* in the other function? I think you should use void*/char* in here.

Pal/src/host/Linux-SGX/enclave_pages.c, line 163 at r2 (raw file):

         * permissions will have no effect.
         * Note: Page becomes valid only after EUG which will be done as part of previous sgx_accept
         * call. */

Given that you have a detailed explanation of the flows in the function top-comment, this feels redundant. I suggest to keep only the first sentence and remove all the rest of this comment:

        /* All new pages have RW permissions initially, so after EAUG/EACCEPT, extend
         * permissions of a VALID enclave page (if needed). */

Pal/src/host/Linux-SGX/enclave_pages.c, line 177 at r2 (raw file):

static void* __create_vma_and_merge(void* addr, size_t size, bool is_pal_internal,
                                    struct heap_vma* vma_above,
                                    struct edmm_heap_range* unallocated_heap) {

Please add a comment to this function that unallocated_heap is an array (of size 64 entries) that tracks the freed memory regions, and that this function updates this variable with addr, size entries.

Pal/src/host/Linux-SGX/enclave_pages.c, line 177 at r2 (raw file):

static void* __create_vma_and_merge(void* addr, size_t size, bool is_pal_internal,
                                    struct heap_vma* vma_above,
                                    struct edmm_heap_range* unallocated_heap) {

Also, the name is weird. I guess you mean something like "enclave memory regions which were not yet allocated", so maybe heap_ranges_to_alloc?

Pal/src/host/Linux-SGX/enclave_pages.c, line 240 at r2 (raw file):

         * the condition `vma_below->top >= vma->bottom` */
        if (g_pal_sec.edmm_enable_heap && vma_current && addr < vma_current->top) {
            int64_t free_size = vma_above->bottom - vma_current->top;

Why int64_t? It should be size_t. To check for overflow, just move the assert before assignment:

assert(vma_above->bottom <= vma_current->top);
size_t free_size = vma_above->bottom - vma_current->top;

Pal/src/host/Linux-SGX/enclave_pages.c, line 245 at r2 (raw file):

                unallocated_heap[free_cnt].size = free_size;
                unallocated_heap[free_cnt].addr = vma_current->top;
                free_cnt++;

Please add assert(free_cnt <= EDMM_HEAP_RANGE_CNT).

Pal/src/host/Linux-SGX/enclave_pages.c, line 249 at r2 (raw file):

                          __func__, vma_current->top, free_size);
            }
        }

Why do you need to keep track of vma_current?

Isn't it enough to use vma_above->top and vma_above->bottom here?

Pal/src/host/Linux-SGX/enclave_pages.c, line 285 at r2 (raw file):

    assert(vma->top - vma->bottom >= (ptrdiff_t)freed);
    size_t allocated = vma->top - vma->bottom - freed;
    /* No free space between VMAs found */

Is this comment correct? I think what really happens is there were some VMA merges, and this may have led to overlaps with previously-allocated pages, so these overlapping pages were marked as "freed". And you want to track those page ranges that were marked as "freed". And if there were no "freed" pages, it means that there were no overlaps (and thus no previously-allocated pages), and all VMA pages must be EACCEPTed.

Pal/src/host/Linux-SGX/enclave_pages.c, line 286 at r2 (raw file):

    size_t allocated = vma->top - vma->bottom - freed;
    /* No free space between VMAs found */
    if (g_pal_sec.edmm_enable_heap && free_cnt == 0 && allocated > 0) {

Why do you care about free_cnt and allocated? Shouldn't you only check for freed == 0?

Pal/src/host/Linux-SGX/enclave_pages.c, line 289 at r2 (raw file):

        unallocated_heap[0].size = size;
        unallocated_heap[0].addr = addr;
    }

This whole logic with unallocated_heap is confusing to me. Please explain the logic, and we'll see how to do re-write it in a more readable way.

Pal/src/host/Linux-SGX/enclave_pages.c, line 304 at r2 (raw file):

    void* ret = NULL;
    /* TODO: Should we introduce a compiler switch for EDMM? */
    struct edmm_heap_range unallocated_heap[EDMM_HEAP_RANGE_CNT] = {0};

Actually, why do you want to collect the yet-not-allocated page ranges first and then perform get_edmm_page_range() on these ranges? Why not just embed get_edmm_page_range() inside __create_vma_and_merge()?

We currently perform everything under the same global lock anyway. So there won't be any difference in performance, but it will significantly simplify the existing code. And in the future, we may change that (but let's not overcomplicate it now).

Pal/src/host/Linux-SGX/enclave_pages.c, line 365 at r2 (raw file):

                       unallocated_heap[cnt].addr, unallocated_heap[cnt].size);
            if (unallocated_heap[cnt].size > 0 &&
                get_edmm_page_range(unallocated_heap[cnt].addr, unallocated_heap[cnt].size, 1) < 0) {

Please don't do this pattern of "check and execute inside the IF statement". Extract get_edmm_page_range() on its own line.

Pal/src/host/Linux-SGX/enclave_pages.c, line 438 at r2 (raw file):

                edmm_free_cnt++;
            }
        }

I suggest to embed free_edmm_page_range() directly in here, instead of tracking.

Pal/src/host/Linux-SGX/ocall_types.h, line 316 at r2 (raw file):

typedef struct {
	unsigned long start_addr;

Why not void*?

Pal/src/host/Linux-SGX/ocall_types.h, line 317 at r2 (raw file):

typedef struct {
	unsigned long start_addr;
	unsigned int nr_pages;

Please spaces instead of tabs.

Pal/src/host/Linux-SGX/sgx_main.c, line 458 at r2 (raw file):

        }

        /* skip adding free (heap) pages to the enclave */

... if EDMM is enabled

Pal/src/host/Linux-SGX/sgx_main.c, line 468 at r2 (raw file):

                if (areas[i].prot & PROT_EXEC)
                    p[2] = 'X';
            }

Now we have this code copy-pasted in several places. Can you extract this code in a separate func prot_flags_to_permissions_str()?

python/graphenelibos/sgx_sign.py, line 594 at r2 (raw file):

                              desc, flags)
        else:
            # Skip EADDing of heap("free") pages when EDMM is enabled.

heap ("free")

mkow

Reviewed 6 of 13 files at r2.
Reviewable status: all files reviewed, 45 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @vijaydhanraj, and @yamahata)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Sounds good, Vijay!

ack, good luck then!

Documentation/manifest-syntax.rst, line 384 at r2 (raw file):

but this time this instruction succeeds

This should sound better: "but this time the instruction succeeds"

Pal/src/host/Linux-SGX/db_main.c, line 713 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This is a good catch, but I would also print a log_warning in this case. Something like:

if (g_pal_sec.edmm_enable_heap && preheat_enclave) {
    log_warning("EDMM ('sgx.edmm_enable_heap') and preheat-enclave ('sgx.preheat_enclave') are both enabled"
           " but they cannot be combined. Graphene will use EDMM and disable preheat-enclave now.");
}

Something like this...

+1, modulo alignment, in the second line " should be aligned to the previous " ;)

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 119 at r2 (raw file):

                    const sgx_quote_nonce_t* nonce, char** quote, size_t* quote_len);

int ocall_trim_epc_pages(void* addr, unsigned int nr_pages);

should be size_t

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 121 at r2 (raw file):

int ocall_trim_epc_pages(void* addr, unsigned int nr_pages);

int ocall_notify_accept(void* addr, unsigned int nr_pages);

ditto

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

@mkow knows better why we have such a name. I think this has to do something with Windows.

We're inside Linux-specific PAL, so we can't just use PAGE_SIZE I think.

Pal/src/host/Linux-SGX/enclave_pages.c, line 96 at r2 (raw file):

 * stale TLB entries.
 * 3. Enclave issues an EACCEPT to accept changes to the EPC page.
 * 4. Notifies the driver to remove EPC page which issues EREMOVE inst to complete the request. */

inst
please use either insn or instr, there are many words which start with inst (e.g. "instance" or "installation")

Pal/src/host/Linux-SGX/sgx_api.h, line 57 at r2 (raw file):

 * \brief Low-level wrapper around EACCEPT instruction leaf.
 *
 * Caller is responsible for parameter alignment: 64B for `si` and 4KB(page size) for `addr`.

space before (, ditto for the whole PR

mkow

Reviewable status: all files reviewed, 46 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/db_main.c, line 713 at r2 (raw file):

        ocall_exit(1, true);
    }
    if (!g_pal_sec.edmm_enable_heap && preheat_enclave == 1) {

Actually, now that I think of this, I think we should still have this option with EDMM, but slightly different - something like "sgx.preallocation_size", which would preallocate and fault a specified amount of heap at the beginning (and we could default it to e.g. 32MB), and only after you cross this amount the EDMM triggers.
What do you think? This way you could benefit from both EDMM and preheating.

yamahata

Reviewable status: all files reviewed, 45 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/db_main.c, line 713 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Actually, now that I think of this, I think we should still have this option with EDMM, but slightly different - something like "sgx.preallocation_size", which would preallocate and fault a specified amount of heap at the beginning (and we could default it to e.g. 32MB), and only after you cross this amount the EDMM triggers.
What do you think? This way you could benefit from both EDMM and preheating.

sounds good idea.

vijaydhanraj

Reviewable status: all files reviewed, 45 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

default default signal handler.

Typo in the commit message (two times default)

yes fixed it.

Documentation/manifest-syntax.rst, line 384 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

but this time this instruction succeeds
This should sound better: "but this time the instruction succeeds"

Done.

LibOS/shim/test/regression/test_libos.py, line 402 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

The name is weird now: you have _sgx in the name, but it is actually skipped on SGX. So maybe just keep the old name, _mmap?

yes agree, have removed _sgx from the test name.

Pal/src/host/Linux-SGX/db_main.c, line 713 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

+1, modulo alignment, in the second line " should be aligned to the previous " ;)

Modified the message a little bit as we want to try combining both preheat_enclave as well as EDMM.

        log_warning("EDMM ('sgx.edmm_enable_heap') and preheat-enclave ('sgx.preheat_enclave') are"
                    " both enabled. Graphene will use EDMM only for the region excluded by"
                    " preheat-enclave size.");

Pal/src/host/Linux-SGX/db_main.c, line 713 at r2 (raw file):

Previously, yamahata wrote…

sounds good idea.

yes, that is a good idea @mkow. I plan to try this optimization. The tricky part will be figuring out the default preheat_enclave size to balance between optimal enclave loading time and the no. of AEXs due to page faults.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 119 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

should be size_t

SGX driver expects unsigned int for nr_pages and so I have used it. Using size_t might expand the size to 8 bytes on x64 systems.

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 121 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

ditto

pls see above.

Pal/src/host/Linux-SGX/enclave_pages.c, line 86 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

For simplicity. But sure, you can add a new macro __sgx_64byte_aligned or something like this. Feel free to add it if you feel this is important.

I just didn't like the declaration of the macro in this file.

Ok, is it fine to have the generic version #define _DECLSPEC_ALIGN(x) __attribute__((aligned(x))) in sgx_arch.h

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

We're inside Linux-specific PAL, so we can't just use PAGE_SIZE I think.

Ok, thanks for the clarification.

Pal/src/host/Linux-SGX/enclave_pages.c, line 91 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you say an EPC page? You can submit several pages here. Please fix the description.

I have reworded to trims EPC pages on enclave's request

Pal/src/host/Linux-SGX/enclave_pages.c, line 93 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

In turn is not needed here, just remove it.

Also, I would prefer to replace processors with CPUs

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 96 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Enclave notifies...

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 96 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would separate driver's EREMOVE in a separate item:

4. Enclave notifies the driver to remove the EPC page (using an IOCTL).
5. Driver issues EREMOVE to complete the request (maybe lazily).

Ok, done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 96 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

inst
please use either insn or instr, there are many words which start with inst (e.g. "instance" or "installation")

I have removed inst itself here as we don't use it for other instructions. But will keep the abbreviation in mind for future usage.

Pal/src/host/Linux-SGX/enclave_pages.c, line 107 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do we guarantee that size is a multiple of alloc_align? For example, could this function receive size = 8? Then we'll have nr_pages = 0 and the logic breaks...

I'm fine with an assert(size % g_pal_state.alloc_align == 0). But maybe you want to ALLOC_ALIGN_UP(size). I don't know which one is a better fit here.

yes, it is taken care in free_enclave_pages() which calls free_edmm_page_range()

Pal/src/host/Linux-SGX/enclave_pages.c, line 114 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

void* -> char* (it is technically disallowed to do pointer arithmetic on void* so we should always use char* for "modifiable" pointers)

gcc didn't seem to complain but looks like other compilers like MSVC might have an issue. I now typecast to char* for arithmetic operation and then recast to void*.

Pal/src/host/Linux-SGX/enclave_pages.c, line 124 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please unify the message format -- in trim_epc_pages above you don't have this %d pages.

yes have reworded the text, please let me know if this helps.

Pal/src/host/Linux-SGX/enclave_pages.c, line 131 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why a new page? We may submit several pages to this function. Please fix the description.

Although this function takes a range (addr, size) internally we allocate at page granularity so the text. But I see your point and have reworded it.

Pal/src/host/Linux-SGX/enclave_pages.c, line 133 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Add : at the end

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 134 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

page fault(#PF) -> page fault (#PF)

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 136 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

EAUG for the page (at this point the page becomes VALID and may be used by the enclave).

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 140 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do we guarantee that size is a multiple of alloc_align? For example, could this function receive size = 8? Then we'll have weird arithmetic onaddr and the logic breaks...

I'm fine with an assert(size % g_pal_state.alloc_align == 0). But maybe you want to ALLOC_ALIGN_UP(size). I don't know which one is a better fit here.

Actually, the same for start.

yes, both addr and size are multiple of alloc_align. This is taken care get_enclave_pages(). So arithmetic between these two variables should result in a multiple of alloc_align.

Pal/src/host/Linux-SGX/enclave_pages.c, line 140 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you use uintptr_t in this function, but you used void*/char* in the other function? I think you should use void*/char* in here.

we use uintptr_t to typecast void* in defines like ALIGN_DOWN_PTR (most defines in api.h) so was little confused and used it here. But will revert to using void*/char*

Pal/src/host/Linux-SGX/enclave_pages.c, line 163 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Given that you have a detailed explanation of the flows in the function top-comment, this feels redundant. I suggest to keep only the first sentence and remove all the rest of this comment:
        /* All new pages have RW permissions initially, so after EAUG/EACCEPT, extend
         * permissions of a VALID enclave page (if needed). */

yes removed.
note: we only add executable permission here, no other flags are being added, so better to state as extend permission

Pal/src/host/Linux-SGX/enclave_pages.c, line 177 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Also, the name is weird. I guess you mean something like "enclave memory regions which were not yet allocated", so maybe heap_ranges_to_alloc?

Ok, have renamed it to heap_ranges_to_alloc

Pal/src/host/Linux-SGX/enclave_pages.c, line 177 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment to this function that unallocated_heap is an array (of size 64 entries) that tracks the freed memory regions, and that this function updates this variable with addr, size entries.

Isn't this explicit when reading get_enclave_pages()? This function is only called from it so not sure why a comment is needed here.

Pal/src/host/Linux-SGX/enclave_pages.c, line 240 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why int64_t? It should be size_t. To check for overflow, just move the assert before assignment:
assert(vma_above->bottom <= vma_current->top);
size_t free_size = vma_above->bottom - vma_current->top;

Modified the logic a bit which removes the need for assert. Using size_t now instead of int64_t

Pal/src/host/Linux-SGX/enclave_pages.c, line 245 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add assert(free_cnt <= EDMM_HEAP_RANGE_CNT).

yes good point, done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 249 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you need to keep track of vma_current?

Isn't it enough to use vma_above->top and vma_above->bottom here?

No, it isn't. Let us take the below example where the request overlaps with the existing allocation.

          +           +
          +-----------+ top
          | VMA_ABOVE_ABOVE
+---------------------+-------+   ADDR + SIZE, VMA->top
          |           |
          +-----------+ bottom
          |   FREE    |
          +-----------+ top
          | VMA_ABOVE |
          |           |
          +-----------+ bottom
          |   FREE    |
          +-----------+ top
          |           |
+-----------------------------+   ADDR, VMA->bottom
          | VMA_BELOW |
          +-----------+ bottom
          +           +

So the idea here is to identify the free spaces (unallocated regions) and only do EACCEPT on these ranges as the other ranges are already allocated. If we use vma_above->top and vma_above->bottom the free space between vma_below->top and vma_above->bottomis missed. In order to track this region, I usevma_current`.

With new changes, I have renamed vma_current to current_top and is initialized to void* current_top = (vma_below) ? MAX(vma_below->top, vma->bottom) : vma->bottom;. Before merging the vma_above I do store vma_above->top to current_top to find free space between vma_above_above and vma_above. This continues as we go upwards merging VMAs.

Pal/src/host/Linux-SGX/enclave_pages.c, line 285 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Is this comment correct? I think what really happens is there were some VMA merges, and this may have led to overlaps with previously-allocated pages, so these overlapping pages were marked as "freed". And you want to track those page ranges that were marked as "freed". And if there were no "freed" pages, it means that there were no overlaps (and thus no previously-allocated pages), and all VMA pages must be EACCEPTed.

yes, I think. When we hit this condition, it means the allocation is somewhere after the lowest vma->bottom and heap bottom. If there were free spaces between VMAs we would have taken the vma_above path (vma_above->bottom <= vma->top)

Pal/src/host/Linux-SGX/enclave_pages.c, line 286 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you care about free_cnt and allocated? Shouldn't you only check for freed == 0?

Consider the below example where the allocation request starts from a free region which also happens to be vma_below->top and ends before extending into vma_above.

         +          +
         |          |
         +----------+ top                                                     
         | VMA_ABOVE|                                                         
         |          |                                                         
         +----------+ bottom                                        
+-----------------------------+ ADDR+SIZE, VMA->top                          
         |   FREE   |                                                         
+-----------------------------+ top, ADDR, VMA->bottom                             
         |          |                                                 
         | VMA_BELOW|                                                         
         |          |                                                         
         +----------+ bottom                                                 
         |          |                                                         
         +          +

Here if we just consider freed == 0, the if condition will fail and we miss allocating the free space, that is (addr, addr+size]. So I use allocated to check this free space and free_cnt to ensure that we don't have any other free regions which should be the case (sanity check).

Pal/src/host/Linux-SGX/enclave_pages.c, line 289 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This whole logic with unallocated_heap is confusing to me. Please explain the logic, and we'll see how to do re-write it in a more readable way.

I have added a detailed explanation above. Please let me know if more details are required.

Pal/src/host/Linux-SGX/enclave_pages.c, line 304 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, why do you want to collect the yet-not-allocated page ranges first and then perform get_edmm_page_range() on these ranges? Why not just embed get_edmm_page_range() inside __create_vma_and_merge()?

We currently perform everything under the same global lock anyway. So there won't be any difference in performance, but it will significantly simplify the existing code. And in the future, we may change that (but let's not overcomplicate it now).

Yes we can embed get_edmm_page_range() inside __create_vma_and_merge(). I did consolidation to create a framework for EAUG batch optimization. Why do you think this is complicated? We simply pass an array and store the address ranges. The complicated part lies in identifying these free ranges.

I would prefer it this way until we finalize the EAUG batch optimization as performance won't be any different.

Pal/src/host/Linux-SGX/enclave_pages.c, line 365 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please don't do this pattern of "check and execute inside the IF statement". Extract get_edmm_page_range() on its own line.

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 438 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I suggest to embed free_edmm_page_range() directly in here, instead of tracking.

free ranges can be contiguous and it is better to consolidate and just make a single free_edmm_page_range(). Please see this case, https://github.com/oscarlab/graphene/pull/2190/files#diff-c2e0a370259cd766336e18435809201836adfbe09dc2884127323a4a3e3429f8R428

Pal/src/host/Linux-SGX/ocall_types.h, line 316 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why not void*?

Because the SGX drive IOCTL expects unsigned long and wanted to avoid unnecessary typecasting. Any issues with this or why void* would be preferred?

Pal/src/host/Linux-SGX/ocall_types.h, line 317 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please spaces instead of tabs.

oh sorry, fixed it.

Pal/src/host/Linux-SGX/sgx_api.h, line 57 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

space before (, ditto for the whole PR

I think I have addressed all.

Pal/src/host/Linux-SGX/sgx_main.c, line 458 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

... if EDMM is enabled

I have added it but felt this was explicit.

Pal/src/host/Linux-SGX/sgx_main.c, line 468 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Now we have this code copy-pasted in several places. Can you extract this code in a separate func prot_flags_to_permissions_str()?

yes, done.

python/graphenelibos/sgx_sign.py, line 594 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

heap ("free")

Done.

vijaydhanraj · 2021-03-12T13:58:27Z

Needed to change the commit message and along with it rebased to the latest master. Sorry about it.

mkow

Reviewed 1 of 13 files at r2, 5 of 11 files at r3.
Reviewable status: 10 of 16 files reviewed, 40 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" and "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_ocalls.h, line 119 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

SGX driver expects unsigned int for nr_pages and so I have used it. Using size_t might expand the size to 8 bytes on x64 systems.

Uh, seriously? That's bad... I guess we don't have a choice then. I hope the in-kernel driver will use proper types.

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Ok, thanks for the clarification.

Oops, I meant "we can" -.-
Sorry for confusion.

Pal/src/host/Linux-SGX/enclave_pages.c, line 114 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

gcc didn't seem to complain but looks like other compilers like MSVC might have an issue. I now typecast to char* for arithmetic operation and then recast to void*.

Yeah, I think it complains only with -pedantic.

Pal/src/host/Linux-SGX/ocall_types.h, line 316 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Because the SGX drive IOCTL expects unsigned long and wanted to avoid unnecessary typecasting. Any issues with this or why void* would be preferred?

I'd prefer to push such uglinesses as down as possible and keep proper types in Graphene interfaces.

vijaydhanraj

Reviewable status: 10 of 16 files reviewed, 40 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" and "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Oops, I meant "we can" -.-
Sorry for confusion.

It looks like PAL uses this API _DkGetAllocationAlignment() to update g_pal_state.alloc_align which inturn simply returns PRESET_PAGESIZE.

I am fine with both but if PRESET_PAGESIZE is more readable, I can change it when creating the PR.

Pal/src/host/Linux-SGX/enclave_pages.c, line 114 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Yeah, I think it complains only with -pedantic.

got it, thanks.

Pal/src/host/Linux-SGX/ocall_types.h, line 316 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I'd prefer to push such uglinesses as down as possible and keep proper types in Graphene interfaces.

Actually, we use this ms_ocall_sgx_range_t struct directly in the IOCTL. So if we want to avoid this, I can create a local struct that is used for the driver IOCTL and typecast payload that is part of ms_ocall_sgx_range_t. Is this preferred?

Something like below,

typedef struct {
    unsigned long start_addr;
    unsigned int nr_pages;
} sgx_range_t ;========> local to `static long sgx_ocall_trim_epc_pages` and `sgx_ocall_notify_accept`

typedef struct {
    void* start_addr;
    size_t nr_pages;
} ms_ocall_sgx_range_t ;=> used elsewhere in graphene like `ocall_trim_epc_pages` and `ocall_notify_accept`

mkow

Reviewable status: 10 of 16 files reviewed, 39 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" and "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

It looks like PAL uses this API _DkGetAllocationAlignment() to update g_pal_state.alloc_align which inturn simply returns PRESET_PAGESIZE.

I am fine with both but if PRESET_PAGESIZE is more readable, I can change it when creating the PR.

Tbh it's quite a mess, we need to clean this up and have only one intended way to get this value. For now whichever you prefer is fine.

Pal/src/host/Linux-SGX/ocall_types.h, line 316 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Actually, we use this ms_ocall_sgx_range_t struct directly in the IOCTL. So if we want to avoid this, I can create a local struct that is used for the driver IOCTL and typecast payload that is part of ms_ocall_sgx_range_t. Is this preferred?

Something like below,
typedef struct {
    unsigned long start_addr;
    unsigned int nr_pages;
} sgx_range_t ;========> local to `static long sgx_ocall_trim_epc_pages` and `sgx_ocall_notify_accept`

typedef struct {
    void* start_addr;
    size_t nr_pages;
} ms_ocall_sgx_range_t ;=> used elsewhere in graphene like `ocall_trim_epc_pages` and `ocall_notify_accept`

Yup, I prefer it this way.

vijaydhanraj

Reviewable status: 6 of 16 files reviewed, 39 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 100 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Tbh it's quite a mess, we need to clean this up and have only one intended way to get this value. For now whichever you prefer is fine.

I will leave this for now. Maybe for the final PR, we can have PRESET_PAGESIZE

Pal/src/host/Linux-SGX/ocall_types.h, line 316 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Yup, I prefer it this way.

Done.

dimakuv

Reviewed 6 of 11 files at r3, 5 of 5 files at r4.
Reviewable status: all files reviewed, 14 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

LibOS/shim/test/regression/mmap_file.c, line 1 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Borys would love this test :) I think we should rewrite this whole file (or burn it).

@boryspoplawski What do you think we should do with this file?

Pal/src/host/Linux-SGX/enclave_pages.c, line 86 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Ok, is it fine to have the generic version #define _DECLSPEC_ALIGN(x) __attribute__((aligned(x))) in sgx_arch.h

Why not just use alignas(64) here? The _DECLSPEC_ALIGN name is ugly.

Pal/src/host/Linux-SGX/enclave_pages.c, line 111 at r1 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

ditto

Shouldn't you then use -EFAULT or something similar, not just -1?

Pal/src/host/Linux-SGX/enclave_pages.c, line 304 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Yes we can embed get_edmm_page_range() inside __create_vma_and_merge(). I did consolidation to create a framework for EAUG batch optimization. Why do you think this is complicated? We simply pass an array and store the address ranges. The complicated part lies in identifying these free ranges.

I would prefer it this way until we finalize the EAUG batch optimization as performance won't be any different.

OK, fair enough.

Pal/src/host/Linux-SGX/enclave_pages.c, line 365 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

Not done?

Pal/src/host/Linux-SGX/ocall_types.h, line 318 at r4 (raw file):

    void* start_addr;
    size_t nr_pages;
} ms_ocall_sgx_range_t ;

Redundant space before ;

boryspoplawski

Reviewable status: all files reviewed, 15 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

LibOS/shim/test/regression/mmap_file.c, line 1 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

@boryspoplawski What do you think we should do with this file?

I would just leave it for now, unless it's breaking in some way. We need to rewrite most of the tests anyway (especially Pal/regression)

LibOS/shim/test/regression/mmap_file.c, line 92 at r4 (raw file):

    a[page_size] = 0xff;

    if (signal(SIGBUS, SIG_DFL) == SIG_ERR) {

Why remove it?

vijaydhanraj

Reviewable status: 14 of 16 files reviewed, 15 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

LibOS/shim/test/regression/mmap_file.c, line 92 at r4 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

Why remove it?

Not sure why this was needed as we have already set the SIGBUS handler before accessing the end of the mapped file. I tested this both with native as well as linux-pal and it worked as expected. The only issue was with linuxSGX-pal which we have commented out as part of this PR. So removed this code.

Pal/src/host/Linux-SGX/enclave_pages.c, line 86 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why not just use alignas(64) here? The _DECLSPEC_ALIGN name is ugly.

yes used alignas(64). This is better rather than introducing a new macro.

Pal/src/host/Linux-SGX/enclave_pages.c, line 111 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't you then use -EFAULT or something similar, not just -1?

yes, used -EFAULT

Pal/src/host/Linux-SGX/enclave_pages.c, line 365 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Not done?

oh my bad, I misunderstood your comment. I have separated the execution and check into different lines.

Pal/src/host/Linux-SGX/ocall_types.h, line 318 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Redundant space before ;

Done.

boryspoplawski

Reviewable status: 14 of 16 files reviewed, 15 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

LibOS/shim/test/regression/mmap_file.c, line 92 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Not sure why this was needed as we have already set the SIGBUS handler before accessing the end of the mapped file. I tested this both with native as well as linux-pal and it worked as expected. The only issue was with linuxSGX-pal which we have commented out as part of this PR. So removed this code.

It is not needed, but I don't see a reason to touch this file in this PR, especially since you disabled it in SGX tests. This would rather need a proper rewrite than removing random parts of it.

vijaydhanraj

Reviewable status: 14 of 16 files reviewed, 15 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

LibOS/shim/test/regression/mmap_file.c, line 92 at r4 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

It is not needed, but I don't see a reason to touch this file in this PR, especially since you disabled it in SGX tests. This would rather need a proper rewrite than removing random parts of it.

Ok, I will revert the changes and just exclude the test from sgx regression test.

dimakuv

Reviewed 1 of 11 files at r3, 3 of 3 files at r5.
Reviewable status: all files reviewed, 22 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 249 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

No, it isn't. Let us take the below example where the request overlaps with the existing allocation.
          +           +
          +-----------+ top
          | VMA_ABOVE_ABOVE
+---------------------+-------+   ADDR + SIZE, VMA->top
          |           |
          +-----------+ bottom
          |   FREE    |
          +-----------+ top
          | VMA_ABOVE |
          |           |
          +-----------+ bottom
          |   FREE    |
          +-----------+ top
          |           |
+-----------------------------+   ADDR, VMA->bottom
          | VMA_BELOW |
          +-----------+ bottom
          +           +
So the idea here is to identify the free spaces (unallocated regions) and only do EACCEPT on these ranges as the other ranges are already allocated. If we use vma_above->top and vma_above->bottom the free space between vma_below->top and vma_above->bottomis missed. In order to track this region, I usevma_current`.

With new changes, I have renamed vma_current to current_top and is initialized to void* current_top = (vma_below) ? MAX(vma_below->top, vma->bottom) : vma->bottom;. Before merging the vma_above I do store vma_above->top to current_top to find free space between vma_above_above and vma_above. This continues as we go upwards merging VMAs.

Ok, thanks for explanation. I think I get it now.

I don't like the names you chose though. We have here two different concepts:

already-existing logic of "free" memory regions: this means "already allocated but will be zeroed out" (pages exist but their data is erased)
your newly-added logic of "unallocated" memory regions: this means "not yet allocated"

But you're using free term for both these cases. Let's separate them. I suggest to rename your new variables:

current_top -> unallocated_region_addr
free_size -> unallocated_region_size
free_cnt- > heap_ranges_to_alloc_cnt

Pal/src/host/Linux-SGX/enclave_pages.c, line 285 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

yes, I think. When we hit this condition, it means the allocation is somewhere after the lowest vma->bottom and heap bottom. If there were free spaces between VMAs we would have taken the vma_above path (vma_above->bottom <= vma->top)

This comment will need to be fixed, but let's first resolve my other comments (and then decide what is the correct text here).

Pal/src/host/Linux-SGX/enclave_pages.c, line 143 at r5 (raw file):

 * may be used by the enclave). The control returns back to enclave.
 * 3. Enclave continues the same EACCEPT and the instruction succeeds this time. */
static int get_edmm_page_range(void* start, size_t size, bool executable) {

Why get? I think alloc_edmm_page_range is a better name.

Pal/src/host/Linux-SGX/enclave_pages.c, line 235 at r5 (raw file):

        struct heap_vma* vma_above_above = LISTP_PREV_ENTRY(vma_above, &g_heap_vma_list, list);

        /* Track free space between VMAs while merging `vma_above`. */

free space- > unallocated memory regions

Pal/src/host/Linux-SGX/enclave_pages.c, line 238 at r5 (raw file):

        if (g_pal_sec.edmm_enable_heap && vma_above->bottom > current_top) {
            size_t free_size = vma_above->bottom - current_top;
            if (free_size) {

This IF is redundant, free_size cannot be zero.

Pal/src/host/Linux-SGX/enclave_pages.c, line 253 at r5 (raw file):

        /* Store vma_above->top to check for any free region between vma_above->top and
         * vma_above_above->bottom. */

This comment is redundant once you change the names. I suggest to simply remove it.

Pal/src/host/Linux-SGX/enclave_pages.c, line 272 at r5 (raw file):

        __free_vma(vma_below);
        vma_below = vma_below_below;

Do I understand correctly that you don't need to add similar logic to this vma_below merging because:

We already covered the bottom part of the to-allocate memory region (starting with addr) during vma_above merging above.
It is impossible to have an unallocated space between vma_below and vma_below_below that needs to be allocated for [addr, size) region.

Pal/src/host/Linux-SGX/enclave_pages.c, line 289 at r5 (raw file):

    if (g_pal_sec.edmm_enable_heap && free_cnt == 0 && allocated > 0) {
        heap_ranges_to_alloc[0].size = allocated;
        heap_ranges_to_alloc[0].addr = addr;

Something is wrong here... What if we have this scenario:

          +           +
+---------------------+-------+   ADDR + SIZE, VMA->top
          |   FREE    |
          |   FREE    |
          +-----------+ top
          |           |
+-----------------------------+   ADDR, VMA->bottom
          | VMA_BELOW |
          +-----------+ bottom
          +           +

Here, the memory region [vma_below->bottom, vma_below->top) is already allocated (and this overlaps with the requested [addr, size)). And the unallocated memory region is [vma_below->top, allocated).

So this line in your code:

heap_ranges_to_alloc[0].addr = addr;

looks incorrect.

Pal/src/host/Linux-SGX/enclave_pages.c, line 363 at r5 (raw file):

     * aren't accepted yet (unallocated heap) and call EACCEPT only on those EPC pages. */
    if (g_pal_sec.edmm_enable_heap && ret != NULL) {
        for (int cnt = 0; cnt < EDMM_HEAP_RANGE_CNT; cnt++) {

Please rename cnt -> i

Pal/src/host/Linux-SGX/enclave_pages.c, line 367 at r5 (raw file):

                break;
            int retval = get_edmm_page_range(heap_ranges_to_alloc[cnt].addr,
                                             heap_ranges_to_alloc[cnt].size, 1);

Please replace the last argument with proper boolean: 1 -> /*executable=*/true.

Also, are you always setting "executable" permission because in the current code, we don't have actual permission changes? So you're jut setting the maximum permissions possible?

Pal/src/host/Linux-SGX/enclave_pages.c, line 381 at r5 (raw file):

    int ret = 0;
    /* TODO: Should we introduce a compiler switch for EDMM? */
    struct edmm_heap_range edmm_free_heap[EDMM_HEAP_RANGE_CNT] = {0};

Please change edmm_free_heap -> heap_ranges_to_free.

Pal/src/host/Linux-SGX/enclave_pages.c, line 382 at r5 (raw file):

    /* TODO: Should we introduce a compiler switch for EDMM? */
    struct edmm_heap_range edmm_free_heap[EDMM_HEAP_RANGE_CNT] = {0};
    int edmm_free_cnt = 0;

Please rename to heap_ranges_to_free_cnt

Pal/src/host/Linux-SGX/enclave_pages.c, line 427 at r5 (raw file):

        }

        freed += MIN(vma->top, addr + size) - MAX(vma->bottom, addr);

Please introduce additional variables for all these and use these additional variables in your new logic below. Currently, you have repetition which is prone to future errors.

Pal/src/host/Linux-SGX/enclave_pages.c, line 432 at r5 (raw file):

            size_t range =  MIN(vma->top, addr + size) - MAX(vma->bottom, addr);

            /* if range is contiguous with previous entry, update addr, size accordinlgy*/

Why do you need this "optimization"? To me it looks like an unnecessary complexity. Why would it be worse even if we have two contiguous ranges in two edmm_free_heap entries? This saves us like 5 hardware instructions, who cares.

Pal/src/host/Linux-SGX/enclave_pages.c, line 438 at r5 (raw file):

            } else {
                assert(edmm_free_cnt < EDMM_HEAP_RANGE_CNT);
                /* found a new non-contiguous range */

This comment will be useless after you remove the "optimization of using the previous entry".

Pal/src/host/Linux-SGX/enclave_pages.c, line 478 at r5 (raw file):

out:
    if (ret >=0 && g_pal_sec.edmm_enable_heap) {
        for (int free_cnt = 0; free_cnt < edmm_free_cnt; free_cnt++) {

free_cnt -> i

vijaydhanraj

Reviewable status: 14 of 15 files reviewed, 22 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 249 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Ok, thanks for explanation. I think I get it now.

I don't like the names you chose though. We have here two different concepts:

already-existing logic of "free" memory regions: this means "already allocated but will be zeroed out" (pages exist but their data is erased)

your newly-added logic of "unallocated" memory regions: this means "not yet allocated"

But you're using free term for both these cases. Let's separate them. I suggest to rename your new variables:

current_top -> unallocated_region_addr

free_size -> unallocated_region_size

free_cnt- > heap_ranges_to_alloc_cnt

makes sense but the suggested names seem little long to me. How about?

current_top -> unallocated_region_addr->unallocated_start_addr
free_size->unallocated_region_size-> Have removed this variable and simply used original variables for the size arithmetic. It is only used once. I will remove the debug log for the final PR.
free_cnt->heap_ranges_to_alloc_cnt->unallocated_cnt

Pal/src/host/Linux-SGX/enclave_pages.c, line 143 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why get? I think alloc_edmm_page_range is a better name.

alloc_* could mean similar to SGX1 usage of manipulating already allocated but zeroed out pages. So preferred get as we get these pages from the driver.

I open to changing this but let us see what others think.

Pal/src/host/Linux-SGX/enclave_pages.c, line 235 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

free space- > unallocated memory regions

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 238 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This IF is redundant, free_size cannot be zero.

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 253 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This comment is redundant once you change the names. I suggest to simply remove it.

I felt this needed to be called out as we switch from using vma_below->top to vma_above->top. But fine with removing this.

Pal/src/host/Linux-SGX/enclave_pages.c, line 272 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Do I understand correctly that you don't need to add similar logic to this vma_below merging because:

We already covered the bottom part of the to-allocate memory region (starting with addr) during vma_above merging above.

It is impossible to have an unallocated space between vma_below and vma_below_below that needs to be allocated for [addr, size) region.

yes, correct for both cases.

Pal/src/host/Linux-SGX/enclave_pages.c, line 289 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Something is wrong here... What if we have this scenario:
          +           +
+---------------------+-------+   ADDR + SIZE, VMA->top
          |   FREE    |
          |   FREE    |
          +-----------+ top
          |           |
+-----------------------------+   ADDR, VMA->bottom
          | VMA_BELOW |
          +-----------+ bottom
          +           +
Here, the memory region [vma_below->bottom, vma_below->top) is already allocated (and this overlaps with the requested [addr, size)). And the unallocated memory region is [vma_below->top, allocated).

So this line in your code:
heap_ranges_to_alloc[0].addr = addr;
looks incorrect.

good catch. This should have been MAX(vma_below->top, addr)

Pal/src/host/Linux-SGX/enclave_pages.c, line 363 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please rename cnt -> i

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 367 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please replace the last argument with proper boolean: 1 -> /*executable=*/true.

Also, are you always setting "executable" permission because in the current code, we don't have actual permission changes? So you're jut setting the maximum permissions possible?

yes, currently we always set "executable permission but I want to keep the function prototype so that we can use it in the future.

Pal/src/host/Linux-SGX/enclave_pages.c, line 381 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please change edmm_free_heap -> heap_ranges_to_free.

Done

Pal/src/host/Linux-SGX/enclave_pages.c, line 382 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please rename to heap_ranges_to_free_cnt

This is only used for indexing heap_ranges_to_free[heap_ranges_to_free_cnt] and it makes the variable too long and somewhat repetitive. How about just free_cnt or simply cnt?

Pal/src/host/Linux-SGX/enclave_pages.c, line 427 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please introduce additional variables for all these and use these additional variables in your new logic below. Currently, you have repetition which is prone to future errors.

Done.

Pal/src/host/Linux-SGX/enclave_pages.c, line 432 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you need this "optimization"? To me it looks like an unnecessary complexity. Why would it be worse even if we have two contiguous ranges in two edmm_free_heap entries? This saves us like 5 hardware instructions, who cares.

each new entry in edmm_free_heap corresponds to a range to be freed. Now to free the range, I call free_edmm_page_range() which has the cost of 2 ocalls + 2 IOCTLS. So it is better to consolidate as much as possible before invoking free_edmm_page_range()

Pal/src/host/Linux-SGX/enclave_pages.c, line 478 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

free_cnt -> i

Done.

dimakuv

Reviewed 1 of 1 files at r6.
Reviewable status: all files reviewed, 13 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 249 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

makes sense but the suggested names seem little long to me. How about?

current_top -> unallocated_region_addr->unallocated_start_addr

free_size->unallocated_region_size-> Have removed this variable and simply used original variables for the size arithmetic. It is only used once. I will remove the debug log for the final PR.

free_cnt->heap_ranges_to_alloc_cnt->unallocated_cnt

OK, looks much better now!

Pal/src/host/Linux-SGX/enclave_pages.c, line 253 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I felt this needed to be called out as we switch from using vma_below->top to vma_above->top. But fine with removing this.

? What do you mean by "from using vma_below->top..."? There is no vma_below used here.

But I'm fine with keeping the comment.

Pal/src/host/Linux-SGX/enclave_pages.c, line 432 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

each new entry in edmm_free_heap corresponds to a range to be freed. Now to free the range, I call free_edmm_page_range() which has the cost of 2 ocalls + 2 IOCTLS. So it is better to consolidate as much as possible before invoking free_edmm_page_range()

Good point. But still, does it ever happen? Can you actually have two ranges-to-free being contiguous? I still think that this optimization is premature and doesn't happen in reality (or happens very rarely).

Pal/src/host/Linux-SGX/enclave_pages.c, line 426 at r6 (raw file):

        void* free_heap_top = MIN(vma->top, addr + size);
        void* free_heap_bottom = MAX(vma->bottom, addr);
        size_t range = free_heap_top - free_heap_bottom;

To be honest, this range variable seems a bit too much :) I would prefer to remove it and just use the subtraction, but I'm not blocking on this.

Pal/src/host/Linux-SGX/enclave_pages.c, line 431 at r6 (raw file):

            /* if range is contiguous with previous entry, update addr, size accordinlgy*/
            if (free_cnt > 0 &&
                (free_heap_bottom + range) == heap_ranges_to_free[free_cnt-1].addr) {

free_heap_bottom + range -- isn't it just free_heap_top?

vijaydhanraj

Reviewable status: 14 of 15 files reviewed, 13 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 253 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

? What do you mean by "from using vma_below->top..."? There is no vma_below used here.

But I'm fine with keeping the comment.

Initially before entering the while loop we set to unallocated_start_addr = (vma_below) ? MAX(vma_below->top, vma->bottom) : vma->bottom; and as we go up we set it to unallocated_start_addr = vma_above->top; So thought of calling it out. If you have suggestions to reword it, please let me know.

Pal/src/host/Linux-SGX/enclave_pages.c, line 432 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Good point. But still, does it ever happen? Can you actually have two ranges-to-free being contiguous? I still think that this optimization is premature and doesn't happen in reality (or happens very rarely).

It can happen but agree it is a little rare. I haven't seen it with our regression tests but we haven't tested real workloads extensively to rule out such a case. Given the cost of enclave exits, I would prefer to have this optimization until we have more results.
I have added assert(0) just to catch this case.

Pal/src/host/Linux-SGX/enclave_pages.c, line 426 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

To be honest, this range variable seems a bit too much :) I would prefer to remove it and just use the subtraction, but I'm not blocking on this.

I thought we preferred new variables to avoid repetition. Here we reuse range at 3 places.

Pal/src/host/Linux-SGX/enclave_pages.c, line 431 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

free_heap_bottom + range -- isn't it just free_heap_top?

yes, we can just use, free_heap_top.

dimakuv

Reviewed 1 of 11 files at r3, 1 of 1 files at r7.
Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " and "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @mkow, @vijaydhanraj, and @yamahata)

Pal/src/host/Linux-SGX/enclave_pages.c, line 143 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

alloc_* could mean similar to SGX1 usage of manipulating already allocated but zeroed out pages. So preferred get as we get these pages from the driver.

I open to changing this but let us see what others think.

Ok, I'm fine with this name.

Pal/src/host/Linux-SGX/enclave_pages.c, line 253 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Initially before entering the while loop we set to unallocated_start_addr = (vma_below) ? MAX(vma_below->top, vma->bottom) : vma->bottom; and as we go up we set it to unallocated_start_addr = vma_above->top; So thought of calling it out. If you have suggestions to reword it, please let me know.

OK, understood now. Let's keep the comment as is, I'm fine.

Pal/src/host/Linux-SGX/enclave_pages.c, line 432 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

It can happen but agree it is a little rare. I haven't seen it with our regression tests but we haven't tested real workloads extensively to rule out such a case. Given the cost of enclave exits, I would prefer to have this optimization until we have more results.
I have added assert(0) just to catch this case.

No, let's not add an assert(0). I don't say that this is an impossible code path. I'm just saying that it is rare (but possible).

I suggest to remove this assert and modify the comment like this:

/* if range is contiguous with previous entry, update addr and size accordingly; this case may be rare
 * but the below optimization still saves us 2 OCALLs and 2 IOCTLs, so should be worth it */

vijaydhanraj

Reviewable status: 1 of 26 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "DONTMERGE" found in commit messages' one-liners (waiting on @boryspoplawski, @dimakuv, @mkow, and @yamahata)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

ack, good luck then!

Added optimizations to naive EDMM implementation as part of this PR. Also, added a mmap multi-threaded regression test to assess g_heap_vma_lock overhead. Based on my assessment, Hybrid + Lazy free optimization provides much value in the absence of Intel SGX OOT driver optimizations (batch allocation).

Pal/src/host/Linux-SGX/enclave_pages.c, line 432 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

No, let's not add an assert(0). I don't say that this is an impossible code path. I'm just saying that it is rare (but possible).

I suggest to remove this assert and modify the comment like this:
/* if range is contiguous with previous entry, update addr and size accordingly; this case may be rare
 * but the below optimization still saves us 2 OCALLs and 2 IOCTLs, so should be worth it */

Done.

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

This test case passes without EDMM as all the enclave pages are already EADDed but with EDMM, a SIGBUS is generated since pages are only added when needed. Since SIGBUS isn't implemented correctly for lack of memory protection, skipping the mmap test case. Also, updated the mmap test case to remove default signal handler. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

To reduce the enclave exits, pre-allocate minimal heap from top as required by application and allocate the remaining dynamically using EDMM. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

SGX driver allocates EPC pages dynamically by faulting in pages one at a time. This incurs a huge overhead due to enclave exit for each page. To overcome this issue, a new IOCTL has been introduced in the SGX driver which can take the requested range and EAUG all the pages in one shot. This patch updates graphene memory allocator to use the newly added SGX driver IOCTL to pass range of EPC pages to be allocated and on return simply EACCEPTs the EAUG'ed pages. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

This optimization keeps track of free EPC pages and lazily frees them when the amount of EPC that is not freed exceeds a certain threshold that is set from the manifest file `sgx.edmm_lazyfree_th`. This optimization reduces enclave exists and at the same time doesn't hog EPC pages that aren't required anymore. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

yamahata suggested changes Feb 25, 2021

View reviewed changes

dimakuv suggested changes Feb 25, 2021

View reviewed changes

mkow suggested changes Feb 26, 2021

View reviewed changes

dimakuv suggested changes Feb 26, 2021

View reviewed changes

dimakuv reviewed Feb 26, 2021

View reviewed changes

mkow reviewed Feb 26, 2021

View reviewed changes

dimakuv suggested changes Mar 1, 2021

View reviewed changes

mkow reviewed Mar 1, 2021

View reviewed changes

vijaydhanraj commented Mar 5, 2021

View reviewed changes

dimakuv suggested changes Mar 8, 2021

View reviewed changes

mkow suggested changes Mar 8, 2021

View reviewed changes

mkow reviewed Mar 10, 2021

View reviewed changes

yamahata reviewed Mar 10, 2021

View reviewed changes

vijaydhanraj commented Mar 12, 2021

View reviewed changes

vijaydhanraj force-pushed the vdhanraj/DONTMERGE_edmm_dynamic_heap_support_v1 branch from dcb0a9d to 43fd00f Compare March 12, 2021 13:56

mkow suggested changes Mar 13, 2021

View reviewed changes

vijaydhanraj commented Mar 15, 2021

View reviewed changes

mkow reviewed Mar 15, 2021

View reviewed changes

vijaydhanraj commented Mar 16, 2021

View reviewed changes

dimakuv suggested changes Mar 24, 2021

View reviewed changes

boryspoplawski suggested changes Mar 24, 2021

View reviewed changes

vijaydhanraj commented Mar 25, 2021

View reviewed changes

boryspoplawski suggested changes Mar 25, 2021

View reviewed changes

vijaydhanraj commented Mar 25, 2021

View reviewed changes

dimakuv suggested changes Mar 25, 2021

View reviewed changes

vijaydhanraj commented Mar 26, 2021

View reviewed changes

dimakuv suggested changes Mar 29, 2021

View reviewed changes

vijaydhanraj commented Apr 1, 2021

View reviewed changes

dimakuv suggested changes Apr 12, 2021

View reviewed changes

mkow mentioned this pull request Apr 27, 2021

Adding dynamic memory management features (EDMM) with SGX2 supports #234

Closed

vijaydhanraj force-pushed the vdhanraj/DONTMERGE_edmm_dynamic_heap_support_v1 branch from 8ac1a78 to a4f2f63 Compare June 29, 2021 03:06

vijaydhanraj commented Jun 29, 2021

View reviewed changes

dimakuv mentioned this pull request Jul 14, 2021

Memory protection is silently disabled on SGX #504

Closed

vijaydhanraj added 6 commits October 4, 2021 12:32

[Pal/Linux-SGX] DONTMERGE: Initial EDMM dynamic heap implementation

a8e15d5

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

[Pal/Linux-SGX] EDMM hybrid allocation

bace6f8

To reduce the enclave exits, pre-allocate minimal heap from top as required by application and allocate the remaining dynamically using EDMM. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

[LibOS] regression: EDMM heap multi-thread mmap performance test

589f963

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

vijaydhanraj force-pushed the vdhanraj/DONTMERGE_edmm_dynamic_heap_support_v1 branch from a4f2f63 to 589f963 Compare October 4, 2021 22:38

vijaydhanraj mentioned this pull request Jun 24, 2022

RFC: SGX2 support in Gramine (Phase 1) gramineproject/gramine#683

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Pal/Linux-SGX] DONTMERGE: Initial EDMM dynamic heap implementation #2190

[Pal/Linux-SGX] DONTMERGE: Initial EDMM dynamic heap implementation #2190

vijaydhanraj commented Feb 24, 2021 •

edited by dimakuv

yamahata left a comment

dimakuv left a comment

mkow left a comment

dimakuv left a comment

dimakuv left a comment

mkow left a comment

dimakuv left a comment

mkow left a comment

vijaydhanraj left a comment

dimakuv left a comment

mkow left a comment

mkow left a comment

yamahata left a comment

vijaydhanraj left a comment

vijaydhanraj commented Mar 12, 2021

mkow left a comment

vijaydhanraj left a comment

mkow left a comment

vijaydhanraj left a comment

dimakuv left a comment

boryspoplawski left a comment

vijaydhanraj left a comment

boryspoplawski left a comment

vijaydhanraj left a comment •

edited

dimakuv left a comment

vijaydhanraj left a comment

dimakuv left a comment

vijaydhanraj left a comment

dimakuv left a comment

vijaydhanraj left a comment

[Pal/Linux-SGX] DONTMERGE: Initial EDMM dynamic heap implementation #2190

Are you sure you want to change the base?

[Pal/Linux-SGX] DONTMERGE: Initial EDMM dynamic heap implementation #2190

Conversation

vijaydhanraj commented Feb 24, 2021 • edited by dimakuv

Description of the changes

How to test this PR?

yamahata left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

yamahata left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj commented Mar 12, 2021

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

boryspoplawski left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

boryspoplawski left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment • edited

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj commented Feb 24, 2021 •

edited by dimakuv

vijaydhanraj left a comment •

edited