Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pal sgx fast clock gettimeofday #1834

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jonathan-sha
Copy link

@jonathan-sha jonathan-sha commented Apr 2, 2024

Description of the changes

This is a re-implementation of _palSystemTimeQuery in a way that doesn't require steady TSC feature and works on previously unsupported hypervisors.

See full explanation and discussion in #1799

Fixes #1799

How to test this PR?


This change is Reviewable

@jonathan-sha jonathan-sha force-pushed the pal-sgx-fast-clock-gettimeofday branch from 8e76328 to 9679f96 Compare April 2, 2024 15:31
return &fast_clock->time_points[timepoint_index(descriptor)];
}

static bool is_rdtsc_available(void) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this method is needed - can we rely on the RDTSC emulation feature?

Maybe we can do the following - have a global g_is_rdtsc_emulated which is set under the FIRST_TIME() guard in the emulator. This method will simply call get_tsc() then check the global. We can then completely remove this logic from pal_sgx_main.

extern fast_clock_t g_fast_clock;

int fast_clock__get_time(fast_clock_t* fast_clock, uint64_t* time_micros, bool force_new_timepoint);
void fast_clock__get_timezone(const fast_clock_t* fast_clock, int* tz_minutewest, int* tz_dsttime);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing implementation

void fast_clock__disable(fast_clock_t* fast_clock);

#ifdef __cplusplus
} /* extern int */
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo - extern "C"

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 11 files reviewed, 28 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jonathan-sha)

a discussion (no related file):
I just briefly looked at the PR and left some comments. Didn't go deep into the logic itself.

Please fix according to the comments, and then we could start the review in earnest.



pal/src/host/linux-sgx/enclave_ocalls.h line 92 at r1 (raw file):

int ocall_futex(uint32_t* uaddr, int op, int val, uint64_t* timeout_us);

int ocall_reset_time(uint64_t* microsec, uint64_t* tsc, int* tz_minuteswest, int* tz_dsttime);

Can we remove the tz_ args and handling from this PR? You don't use them here anyway, and it's harder to review.


pal/src/host/linux-sgx/enclave_ocalls.c line 1810 at r1 (raw file):

        *microsec_ptr = MAX(microsec, expected_microsec);
        if (tsc_ptr != NULL) {
            *tsc_ptr = COPY_UNTRUSTED_VALUE(&ocall_gettime_args->tsc);

Do you have some sanity checks on the returned-by-host tsc value? It must be at least non-decreasing. We have a similar check for returned-by-host microsec. This is to prevent attacks.


pal/src/host/linux-sgx/enclave_ocalls.c line 1825 at r1 (raw file):

{
    return ocall_reset_time(microsec_ptr, NULL, NULL, NULL);
}

That's a bad security practice -- you added an additional attack surface (a new OCALL) for no good reason. Just expand existing ocall_gettime to have an additional argument tsc_ptr, and remove ocall_reset_time().


pal/src/host/linux-sgx/host_ocalls.c line 608 at r1 (raw file):

    struct timeval tv;
    struct timezone tz;
    unsigned long long tsc = __rdtsc();

We have get_tsc() which does the same. Please remove x86intrin.h


pal/src/host/linux-sgx/pal_main.c line 561 at r1 (raw file):

                    "not expose corresponding CPUID leaves). This degrades performance.");
    }
}

Why did you remove print_warning_on_invariant_tsc()?


pal/src/host/linux-sgx/pal_misc.c line 319 at r1 (raw file):

    /* hypervisor-specific CPUID leaf functions (0x40000000 - 0x400000FF) start here */
    {.leaf = 0x40000000, .zero_subleaf = true, .cache = true},  /* CPUID Info */
    {.leaf = 0x40000010, .zero_subleaf = true, .cache = true},  /* VMWare-style Timing Info */

These 2 leaves won't be needed now, right? We can remove them.


pal/src/host/linux-sgx/utils/fast_clock.h line 2 at r1 (raw file):

#ifndef _FAST_CLOCK_H_
#define _FAST_CLOCK_H_

We use #pragma once


pal/src/host/linux-sgx/utils/fast_clock.h line 9 at r1 (raw file):

#ifdef __cplusplus
extern "C" {
#endif

There is nothing C++ in our codebase, please remove these guards.


pal/src/host/linux-sgx/utils/fast_clock.h line 13 at r1 (raw file):

enum fast_clock_state_e
{

We have a different coding style (based on Google C/C++ coding style), see https://gramine.readthedocs.io/en/stable/devel/coding-style.html


pal/src/host/linux-sgx/utils/fast_clock.h line 19 at r1 (raw file):

    FC_STATE_INIT,

    FC_STATE_RDTSC_DISABLED,

I don't like the FC_, it's not immediately obvious what it is.

Hm, is this enum really needed to be in the header? It looks like it can be embedded in the source file itself. Then it won't be visible in other compilation units, and I'd be fine with FC_ since it will become local-to-single-file.


pal/src/host/linux-sgx/utils/fast_clock.h line 39 at r1 (raw file):

typedef union
{
    #pragma pack(push, 1)

Why you need this bit-compression?


pal/src/host/linux-sgx/utils/fast_clock.c line 1 at r1 (raw file):

#include <string.h>

You should use "api.h"


pal/src/host/linux-sgx/utils/fast_clock.c line 10 at r1 (raw file):

/**
 * FastClock

Please reformat to 100-chars-per-line.


pal/src/host/linux-sgx/utils/fast_clock.c line 28 at r1 (raw file):

 * *** Implementation ***
 *
 * FastClock is implemented as a state machine. This was done since we don't have a good portable way to get the cpu clock frequency.

It would be nicer if you would ASCII-draw the state machine with transitions.


pal/src/host/linux-sgx/utils/fast_clock.c line 81 at r1 (raw file):

#ifndef SEC_USEC
#define SEC_USEC                        ((uint64_t)1000000)
#endif

We have some macro for this in api.h, please check (I don't remember the name)


pal/src/host/linux-sgx/utils/fast_clock.c line 84 at r1 (raw file):

#define RDTSC_CALIBRATION_TIME          ((uint64_t)1 * SEC_USEC)
#define RDTSC_RECALIBRATION_INTERVAL    ((uint64_t)120 * SEC_USEC)

How did you decide on these values? Please add a comment.


pal/src/host/linux-sgx/utils/fast_clock.c line 139 at r1 (raw file):

static inline long reset_timepoint(fast_clock_timepoint_t* timepoint)
{
    int ret = ocall_reset_time(&timepoint->t0_usec, &timepoint->tsc0, &timepoint->tz_minutewest, &timepoint->tz_dsttime);

Need to analyze security and validate received-from-host values


pal/src/host/linux-sgx/utils/fast_clock.c line 148 at r1 (raw file):

}

static inline fast_clock_desc_t desc_atomic_load(const fast_clock_t* fast_clock, int mo)

What's the point in this helper function? It brings no benefit.


pal/src/host/linux-sgx/utils/fast_clock.c line 155 at r1 (raw file):

}

static inline void desc_atomic_store(fast_clock_t* fast_clock, fast_clock_desc_t new_desc, int mo)

What's the point in this helper function? It brings no benefit.


pal/src/host/linux-sgx/utils/fast_clock.c line 164 at r1 (raw file):

static inline int handle_state_rdtsc(fast_clock_t* fast_clock, fast_clock_desc_t descriptor, uint64_t* time_usec, bool force_new_timepoint);
static inline int handle_state_rdtsc_recalibrate(fast_clock_t* fast_clock, fast_clock_desc_t descriptor, uint64_t* time_usec);
static int handle_state_rdtsc_disabled(uint64_t* time_usec);

This is ugly, why can't you define functions in the order of dependency?


pal/src/host/linux-sgx/utils/fast_clock.c line 166 at r1 (raw file):

static int handle_state_rdtsc_disabled(uint64_t* time_usec);

int fast_clock__get_time(fast_clock_t* fast_clock, uint64_t* time_usec, bool force_new_timepoint)

We don't use the __ style. Just one _ is enough.


pal/src/host/linux-sgx/utils/fast_clock.c line 290 at r1 (raw file):

    // all callers in this state will perform an OCALL - no need to set the change_state_guard before OCALLing
    uint64_t tmp_tsc = 0;
    int ret = ocall_reset_time(time_usec, &tmp_tsc, NULL, NULL);

Need to analyze security and validate received-from-host values


pal/src/host/linux-sgx/utils/fast_clock.c line 350 at r1 (raw file):

    uint64_t tsc = 0;
    int ret = ocall_reset_time(time_usec, &tsc, NULL, NULL);

Need to analyze security and validate received-from-host values


pal/src/host/linux-sgx/utils/fast_clock.c line 362 at r1 (raw file):

    return ret;
}

Please add newlines at the end of each file.

@jonathan-sha
Copy link
Author

Thanks for the review!

pal/src/host/linux-sgx/enclave_ocalls.h line 92 at r1 (raw file):

int ocall_futex(uint32_t* uaddr, int op, int val, uint64_t* timeout_us);

int ocall_reset_time(uint64_t* microsec, uint64_t* tsc, int* tz_minuteswest, int* tz_dsttime);

Can we remove the tz_ args and handling from this PR? You don't use them here anyway, and it's harder to review.

yes, I can remove this. There's currently an "unimplemented" comment in the gettimeofday libos method, as well as issue #466 - do you prefer I fix this as well (at least partially, for gettimeofday())?

pal/src/host/linux-sgx/enclave_ocalls.c line 1825 at r1 (raw file):

{
    return ocall_reset_time(microsec_ptr, NULL, NULL, NULL);
}

That's a bad security practice -- you added an additional attack surface (a new OCALL) for no good reason. Just expand existing ocall_gettime to have an additional argument tsc_ptr, and remove ocall_reset_time().

This is what I did actually, I renamed ocall_gettime to ocall_reset_time and added the new parameters. I kept ocall_gettime for compatibility with the rest of the code. I can rename this back to ocall_gettime and fix accordingly.
Regarding caching sanity checking rdtsc - I read somewhere that a privileged user can reset the TSC counter, which makes it inherently unsafe. But it seems like I was wrong (?). I will add two get_tsc calls before and after the ocall and validate that the host-side tsc is between the two. Do you think I still need to check that it monotonically increases (this is guaranteed in x86)?

pal/src/host/linux-sgx/host_ocalls.c line 608 at r1 (raw file):

    struct timeval tv;
    struct timezone tz;
    unsigned long long tsc = __rdtsc();

We have get_tsc() which does the same. Please remove x86intrin.h

This is the host code (outside of sgx) - should I still use the enclave reimplementation of rdtsc?

pal/src/host/linux-sgx/pal_main.c line 561 at r1 (raw file):

                    "not expose corresponding CPUID leaves). This degrades performance.");
    }
}

Why did you remove print_warning_on_invariant_tsc()?

I need to rework the initialization a bit. I'll re-add this print, but I'm not sure we need it to happen in the main function.

pal/src/host/linux-sgx/utils/fast_clock.h line 39 at r1 (raw file):

typedef union
{
    #pragma pack(push, 1)

Why you need this bit-compression?

I changed this to a bitfield in my company's codebase (I'll pull in these changes). I just wanted to make sure this struct fits in a uint32_t which we can load \ store atomically.

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 11 files reviewed, 28 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jonathan-sha)


pal/src/host/linux-sgx/enclave_ocalls.h line 92 at r1 (raw file):

do you prefer I fix this as well (at least partially, for gettimeofday())?

Sorry, don't understand your proposal. Fix how?

To make it clear, I suggest the following:

  1. Remove in this PR everything about timezones.
  2. Create a follow-up PR to this one that adds timezones (and thus fixes [LibOS] time zone is incorrect in syscall gettimeofday() or glibc localtime() or bash command date #466).

pal/src/host/linux-sgx/enclave_ocalls.c line 1825 at r1 (raw file):

I kept ocall_gettime for compatibility with the rest of the code.

No reason to keep compatibility. Just change the code. (These are internal Gramine functions, so can be changed without caring about compatibility issues.)


pal/src/host/linux-sgx/host_ocalls.c line 608 at r1 (raw file):

Previously, jonathan-sha (Jonathan Shamir) wrote…

This is the host code (outside of sgx) - should I still use the enclave reimplementation of rdtsc?

We have this func in cpu.h:

static inline uint64_t get_tsc(void) {

The cpu.h header is independent from enclave/non-enclave environment. So yes, just use it.


pal/src/host/linux-sgx/utils/fast_clock.h line 39 at r1 (raw file):

Previously, jonathan-sha (Jonathan Shamir) wrote…

I changed this to a bitfield in my company's codebase (I'll pull in these changes). I just wanted to make sure this struct fits in a uint32_t which we can load \ store atomically.

I see. You can change to a bitfield here too.

Also, please add a static_assert(sizeof (fast_clock_desc_t) == sizeof(uint64_t)) to make it explicit. Plus add a comment that this struct must be accessed with atomic ops.

Signed-off-by: Jonathan Shamir <jsham92@gmail.com>
Signed-off-by: Jonathan Shamir <jsham92@gmail.com>
…ttimeofday)

Signed-off-by: Jonathan Shamir <jsham92@gmail.com>
@jonathan-sha jonathan-sha force-pushed the pal-sgx-fast-clock-gettimeofday branch 3 times, most recently from 51e6d80 to 4927cdb Compare April 8, 2024 12:06
Signed-off-by: Jonathan Shamir <jsham92@gmail.com>
Signed-off-by: Jonathan Shamir <jsham92@gmail.com>
@jonathan-sha jonathan-sha force-pushed the pal-sgx-fast-clock-gettimeofday branch from 4927cdb to 7f41646 Compare April 8, 2024 14:01
@jonathan-sha
Copy link
Author

@dimakuv I updated the MR, fixed most of your comments.

Two big open issues that I would like to get feedback on -

  1. is_rdtsc_available - since gramine has rdtsc emulation, I think we can take this logic out of fast-clock, and move this into the error handler for rdtsc emulation. I can implement a new init_gettime() method that calls get_tsc(), then checks a global to see if we're in emulation mode, and fast_clock_enable() only if we're not. We can then remove all the cpuid logic from the init state of fast-clock (and bring back the warning log I removed).
  2. should we update the gettimeofday() tests? I'm not sure they're elaborate enough. In my company's codebase we have cycle calculation, comparison with native \ ocall \ fast-clock implementation, multi-threaded stress, checking time drift with host. Currently gramine has multi-threaded stress, but it's iterations based and not time based (so fast-clock won't advance through all it's states). We should add a 5-minute stress test.

@jonathan-sha jonathan-sha marked this pull request as ready for review April 8, 2024 14:55
Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replied to 1. under the same discussion in the code.

should we update the gettimeofday() tests?

We could add/improve some, but I'm not sure if it's worth comparing to the host - I'm afraid that they may fail randomly if we get unlucky with a CI worker load and we'll end up debugging CI failures which are just flaky tests, not real bugs.

We should add a 5-minute stress test.

No, we shouldn't :) 5 minutes is really a lot of CI time spent on a single feature.

Reviewed all commit messages.
Reviewable status: 0 of 12 files reviewed, 31 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @jonathan-sha)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I just briefly looked at the PR and left some comments. Didn't go deep into the logic itself.

Please fix according to the comments, and then we could start the review in earnest.

Same. Will do a more throughout review later.



-- commits line 3 at r2:

Suggestion:

[PAL/Linux-SGX] Add ocall_reset_clock which returns tsc and timezone info

-- commits line 7 at r2:
I don't think there's a point in splitting this change into 4 commits, as far as I understand they can't/shouldn't be used separately, so they aren't really independent of each other.


-- commits line 19 at r2:
Please use fixup commits when adding fixups, see https://gramine.readthedocs.io/en/latest/devel/contributing.html#pr-life-cycle. And in general, please read that contributing guide :)


pal/src/host/linux-sgx/enclave_ocalls.h line 92 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

do you prefer I fix this as well (at least partially, for gettimeofday())?

Sorry, don't understand your proposal. Fix how?

To make it clear, I suggest the following:

  1. Remove in this PR everything about timezones.
  2. Create a follow-up PR to this one that adds timezones (and thus fixes [LibOS] time zone is incorrect in syscall gettimeofday() or glibc localtime() or bash command date #466).

+1, I'd skip the timezones problem for now and focus on merging the core functionality first.


pal/src/host/linux-sgx/utils/fast_clock.c line 226 at r1 (raw file):

Previously, jonathan-sha (Jonathan Shamir) wrote…

Not sure if this method is needed - can we rely on the RDTSC emulation feature?

Maybe we can do the following - have a global g_is_rdtsc_emulated which is set under the FIRST_TIME() guard in the emulator. This method will simply call get_tsc() then check the global. We can then completely remove this logic from pal_sgx_main.

Yeah, sounds like a much better idea IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[PAL-SGX] alternative gettimeofday() implementation for gramine
3 participants