Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS,PAL] Emulate file-backed mmap via PAL read/write APIs #1818

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

dimakuv
Copy link
Contributor

@dimakuv dimakuv commented Mar 14, 2024

Description of the changes

Previously, the chroot FS (plain host-backed files) used the PalStreamMap() PAL API for file-backed mmap. This had three problems: 1) need to implement a non-trivial map callback in PALs; 2) discrepancy between map implementations in different PALs; 3) hard to debug file-mmap bugs because they only reproduced on SGX
(gramine-sgx) PAL and not on Linux (gramine-direct) PAL.

Note that other FSes already used emulated file-backed mmap: tmpfs and encrypted emulate such mmaps via PalStreamRead() and PalStreamWrite().

This PR switches chroot FS to use emulated file-backed mmap. This way, chroot becomes similar in implementation to tmpfs and encrypted FSes. Only shm FS still uses PalStreamMap() because devices with shared memory have non-standard semantics of mmaps. Corresponding file_map() functions in PAL are removed.

This switch to emulated mmap uncovered several bugs:

  • Underlying file may be shorter than the requested mmap size. In this case access beyond the last file-backed page must cause SIGBUS. Previously this semantics worked only on gramine-direct and wasn't implemented on gramine-sgx (with EDMM).
  • As a consequence of the semantics above, file-growing ftruncate() on already-mmapped file must make newly extended file contents accessible. Previously it didn't work on gramine-sgx (with EDMM), now it is resolved via prot_refresh_mmaped_from_file_handle() call.
  • msync() must update file contents with the mmapped-in-process contents, but only those parts that do not exceed the file size. Previously there was a bug that msync'ed even the exceeding parts.
  • Applications expect msync(MS_ASYNC) to update file contents before the next app access to the file. Gramine instead ignored such requests, leading to accessing stale contents. We fix this bug by treating MS_ASYNC the same way as MS_SYNC. This bug was detected on LTP test msync01.

Several FS tests are enabled on SGX now. Generally, gramine-sgx now supports shared file-backed mappings, i.e. mmap(MAP_SHARED).

Extracted from #1812.

How to test this PR?

Several new tests are enabled. Maybe some new LTP tests can be enabled (didn't check).


This change is Reviewable

Previously, the `chroot` FS (plain host-backed files) used the
`PalStreamMap()` PAL API for file-backed mmap. This had three problems:
1) need to implement a non-trivial `map` callback in PALs;
2) discrepancy between `map` implementations in different PALs;
3) hard to debug file-mmap bugs because they only reproduced on SGX
   (`gramine-sgx`) PAL and not on Linux (`gramine-direct`) PAL.

Note that other FSes already used emulated file-backed mmap: `tmpfs` and
`encrypted` emulate such mmaps via `PalStreamRead()` and
`PalStreamWrite()`.

This commit switches `chroot` FS to use emulated file-backed mmap. This
way, `chroot` becomes similar in implementation to `tmpfs` and
`encrypted` FSes. Only `shm` FS still uses `PalStreamMap()` because
devices with shared memory have non-standard semantics of mmaps.
Corresponding `file_map()` functions in PAL are removed.

This switch to emulated mmap uncovered several bugs:
- Underlying file may be shorter than the requested mmap size. In this
  case access beyond the last file-backed page must cause SIGBUS.
  Previously this semantics worked only on `gramine-direct` and wasn't
  implemented on `gramine-sgx` (with EDMM).
- As a consequence of the semantics above, file-growing `ftruncate()` on
  already-mmapped file must make newly extended file contents
  accessible. Previously it didn't work on `gramine-sgx` (with EDMM),
  now it is resolved via `prot_refresh_mmaped_from_file_handle()` call.
- `msync()` must update file contents with the mmapped-in-process
  contents, but only those parts that do not exceed the file size.
  Previously there was a bug that msync'ed even the exceeding parts.
- Applications expect `msync(MS_ASYNC)` to update file contents before
  the next app access to the file. Gramine instead ignored such
  requests, leading to accessing stale contents. We fix this bug by
  treating `MS_ASYNC` the same way as `MS_SYNC`. This bug was detected
  on LTP test `msync01`.

Several FS tests are enabled on SGX now. Generally, `gramine-sgx` now
supports shared file-backed mappings, i.e. `mmap(MAP_SHARED)`.

Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Copy link
Contributor Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 19 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)


libos/test/ltp/manifest.template line 20 at r1 (raw file):


  # many LTP multi-process tests rely on shared-memory IPC via `mmap(MAP_SHARED, </dev/shm fd>)`
  { type = "untrusted_shm", path = "/dev/shm", uri = "dev:/dev/shm" },

This is taken from #1718. So I'd say that that PR is the prerequisite of this one.

This is required because LTP uses shared file-backed mmaps without explicit msyncs or munmaps (recall that Gramine requires explicit points like msync() and munmap() to synchronize mappings with file contents).


pal/include/pal/pal.h line 426 at r1 (raw file):

 * `PalStreamRead` and `PalStreamWrite`. Use `PalVirtualMemoryFree` to unmap this mapping.
 */
int PalStreamMap(PAL_HANDLE handle, void* addr, pal_prot_flags_t prot, uint64_t offset,

We may want to rename this to PalDeviceMap()


pal/src/host/linux-sgx/pal_files.c line 391 at r1 (raw file):

    }
    return ret;
}

The biggest win of this PR: removing the super-complicated file_map() callback from Linux-SGX PAL.

@dimakuv
Copy link
Contributor Author

dimakuv commented Mar 15, 2024

Jenkins, retest Jenkins-SGX-20.04-musl please. Error is in the signing test:

tests.test_sgx_sign.test_sign_from_pem_path (from pytest)

Error Message
cryptography.exceptions.InvalidSignature

Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Copy link
Contributor

@mwkmwkmwk mwkmwkmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 21 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

a discussion (no related file):
it should be noted that this PR introduces a major regression: writable MAP_SHARED mappings that are actually shared between processes will no longer work correctly



libos/src/bookkeep/libos_vma.c line 1558 at r2 (raw file):

/* This helper function is to refresh access protections on the VMA pages of a given file handle on
 * file-extend operations (currently only `ftruncate`). */
int prot_refresh_mmaped_from_file_handle(struct libos_handle* hdl) {

this iterates over all VMAs we have in memory, not just VMAs belonging to this file; once we fix write to also call this function, this is likely to cause significant performance problems


libos/src/fs/libos_fs_util.c line 255 at r2 (raw file):

}

int generic_truncate(struct libos_handle* hdl, file_off_t size) {

calling the new hook only in truncate is not enough — file length can also be extended by a write (indeed, it is the more common case)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants