Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add slice access #634

Merged
merged 3 commits into from Sep 27, 2019
Merged

Add slice access #634

merged 3 commits into from Sep 27, 2019

Conversation

willglynn
Copy link
Contributor

This PR adds slice access to certain storages. Slices allow the user to bolt on an accelerator framework like OpenCL (#553) for high throughput component processing. My thoughts behind this particular design are #553 (comment) and #553 (comment).

Checklist

  • I've added tests for all code changes and additions (where applicable)
  • I've added a demonstration of the new feature to one or more examples
  • I've updated the book to reflect my changes
  • Usage of new public items is shown in the API docs

API changes

Non-breaking, strictly additive:

  • Add DenseVecStorage<T>::as_slice() -> &[T] and as_mut_slice() -> &mut [T].
  • Add VecStorage::unsafe_slice() -> &[T] and unsafe_mut_slice() -> &mut [T].
  • Add DefaultVecStorage<T>, which works like VecStorage<T>, requires T: Default, and provides as_slice() -> &[T] and as_mut_slice() -> &mut [T].
  • Add as or unsafe [_mut]_slice directly on Storage as appropriate.

@WaDelma
Copy link
Member

WaDelma commented Aug 28, 2019

I think those unsafe_slices should return raw pointers as this is constructing partially initialized slice which I think is UB. Not 100% sure about this though.

@willglynn
Copy link
Contributor Author

The proposed changeset allows the user to call an unsafe function to obtain a slice containing uninitialized or dropped values. Whether or not that slice should be permitted to exit the caller's unsafe {} code is a good question, and one which depends on the particular component's representation in memory, but either way it doesn't seem wrong for specs to provide this kind of slice to the caller's unsafe {} code.

@WaDelma
Copy link
Member

WaDelma commented Aug 28, 2019

Well if the whole Vec is full then it would be safe yes. But it's practically useless. I would provide way of getting raw pointers and let the user construct slice out of it if they want.

@willglynn
Copy link
Contributor Author

I don't understand why this is wrong:

impl<T> UnsafeSliceAccess<T> for VecStorage<T> {
    unsafe fn unsafe_slice(&self) -> &[T] {
        self.0.as_slice()
    }
}

As I understand it, constructing a slice isn't UB regardless of its pointer or length.

Accessing such a slice may or may not be UB, but unsafe code commonly involves nonlocal concerns. Even in this one line function, Vec::as_slice() is safe code, but here it is unsafe because VecStorage uses unsafe { Vec::set_len() } and unsafe { ptr::drop_in_place() } elsewhere. You and I know that the slice is unsafe to use, but Vec::as_slice() doesn't.

unsafe_slice() gives users a slice with which they can cause UB, but… that's why it's unsafe. How would returning a pointer and a length improve this? As it stands now, unsafe_slice() returns a slice with a pointer and a length and the correct lifetime, which seems strictly safer than forcing users to construct their own slices.

@WaDelma
Copy link
Member

WaDelma commented Aug 28, 2019

If you have VecStorage<bool> that contains uninitialised values, this would create reference to slice of booleans that have different values than 0 or 1 and I think having such reference might be UB. But I cannot be certain as there is no memory model yet. What I am certain that having &bool to uninitialized bool is insta UB (and so &T of uninitialised value is easily UB). (NOTE: One could object here that bool is quite specific, but this problems is with enums and anything that contains bools or enums)

I might be bit too paranoid here, but unsafe is quite hard to reason about.

https://www.reddit.com/r/rust/comments/95vxdy/understanding_ub_with_stdmemuninitialized/

rust-lang/rust#53491

@willglynn
Copy link
Contributor Author

willglynn commented Aug 28, 2019

Right; accessing an uninitialized &bool is definitely UB. My understanding is that constructing a slice is never UB; a &[bool] is not a &bool, it's syntactic sugar for a struct with a ptr: *const bool and len: usize. Calling as_ptr() and len() is never UB, since they just return the values inside the slice. UB begins only when that pointer is used to produce a &bool, i.e. on get() and the like.

We agree that blindly accessing VecStorage<bool>::unsafe_slice() could definitely cause UB. That's why it's unsafe. However, the user could take that unsafe_slice(), check the mask() for each index, and access only the locations where the slice contains initialized data. This would be safe, and it's basically how VecStorage works right now.

Moreover, this isn't VecStorage<bool>, it's VecStorage<T>. T could be a type for which every bit pattern is valid, in which case uninitialized data is not a safety problem. This is the motivating use case – a VecStorage containing a #[repr(C)] struct of f32s intended for computation in an OpenCL kernel. If every bit pattern of T is a T, there is no UB even for uninitialized data, and thus no need to consult the mask().

@willglynn
Copy link
Contributor Author

I wonder: is this whole discussion because VecStorage should be storing a Vec<MaybeUninit<T>> instead of a Vec<T>? I'd be totally happy with returning a &[MaybeUninit<T>] instead, and that would make as_slice() safe to boot.

@willglynn
Copy link
Contributor Author

I pushed a separate branch with 9c2d800 showing what VecStorage+MaybeUninit would look like. fn as_slice(&self) -> &[MaybeUninit<T>] feels a lot better than unsafe fn unsafe_slice(&self) -> &[T], and it feels better for VecStorage's internals to work in terms of MaybeUninit. There's no API breakage with that change, but a MaybeUninit dependency would bump the minimum Rust from 1.34.0 to 1.36.0.

If this is the right direction but we don't want to increase the version requirement right now, I would be happy to keep this new unified SliceAccess design and split all the MaybeUninit and impl SliceAccess for VecStorage into a separate PR. DefaultVecStorage entirely sidesteps the initialization concern and is only marginally more expensive.

@willglynn
Copy link
Contributor Author

I rearranged my changes and force-pushed the branch for this PR. The first commit adds as_slice()/as_mut_slice() to DenseVecStorage and adds DefaultVecStorage. This is sufficient for my use (and likely others' use) per #553.

The second commit reworks VecStorage to use MaybeUninit per my previous comment and makes it expose as_slice() -> &[MaybeUninit<T>]. These changes are related (it would be nice to slice VecStorage), but they are separable, and I'm happy to move that commit to a new PR if desired.

@WaDelma
Copy link
Member

WaDelma commented Aug 29, 2019

Note that for f32 every bit pattern isn't allowed: Signaling NaN is a UB.
EDIT: Ah actually it isn't: https://stackoverflow.com/questions/43812361/is-transmuting-bytes-to-a-float-safe-or-might-it-produce-undefined-behavior I read somewhere before that it is, but apparently it has changed or I just read some misinformation.

I am not sure if slice is just syntactic sugar from compilers POV. But yeah probably constructing slice isn't UB.

I think having MaybeUninit is good direction to go to as it gives more confidence on the correctness of this stuff. I am not against bumping the minimum rust version myself, but cannot say if that's for others.

@willglynn
Copy link
Contributor Author

willglynn commented Aug 29, 2019

I think having MaybeUninit is good direction to go to as it gives more confidence on the correctness of this stuff. I am not against bumping the minimum rust version myself, but cannot say if that's for others.

👍

@WaDelma
Copy link
Member

WaDelma commented Sep 27, 2019

Related: #646

@willglynn
Copy link
Contributor Author

I note in #646 (comment) why I now believe that slices must always refer to initialized elements in current Rust, contrary to my earlier comments here.

@WaDelma
Copy link
Member

WaDelma commented Sep 27, 2019

This PR looks otherwise good, but DenseVecStorage should probably be changed to also have MaybeUninit.

@willglynn
Copy link
Contributor Author

willglynn commented Sep 27, 2019

Ah, the Vec<T> is dense and always initialized, it's the the Vec<Index>es which are not. On it.

src/storage/storages.rs Outdated Show resolved Hide resolved
This vector contains uninitialized data, so it should use MaybeUninit.
@WaDelma
Copy link
Member

WaDelma commented Sep 27, 2019

bors r+

bors bot added a commit that referenced this pull request Sep 27, 2019
634: Add slice access r=WaDelma a=willglynn

This PR adds slice access to certain storages. Slices allow the user to bolt on an accelerator framework like OpenCL (#553) for high throughput component processing. My thoughts behind this particular design are #553 (comment) and #553 (comment).

## Checklist

* [x] I've added tests for all code changes and additions (where applicable)
* [x] I've added a demonstration of the new feature to one or more examples
* [x] I've updated the book to reflect my changes
* [x] Usage of new public items is shown in the API docs

## API changes

Non-breaking, strictly additive:

* Add `DenseVecStorage<T>::as_slice() -> &[T]` and `as_mut_slice() -> &mut [T]`.
* Add `VecStorage::unsafe_slice() -> &[T]` and `unsafe_mut_slice() -> &mut [T]`.
* Add `DefaultVecStorage<T>`, which works like `VecStorage<T>`, requires `T: Default`, and provides `as_slice() -> &[T]` and `as_mut_slice() -> &mut [T]`.
* Add `as` or `unsafe` `[_mut]_slice` directly on `Storage` as appropriate.

Co-authored-by: Will Glynn <will@willglynn.com>
@bors
Copy link
Contributor

bors bot commented Sep 27, 2019

Build succeeded

@bors bors bot merged commit d79e603 into amethyst:master Sep 27, 2019
@willglynn willglynn deleted the slice_access branch September 27, 2019 21:03
bors bot added a commit to amethyst/hibitset that referenced this pull request Feb 16, 2020
53: Add slice access r=azriel91 a=willglynn

I added slice access to `specs` (amethyst/specs#634) to support OpenCL (amethyst/specs#553). Usage looks like:

```rust
impl<'a> System<'a> for SomeSystem {
    type SystemData = (WriteStorage<'a, Something>, );

    fn run(&mut self, (mut some_storage, ): Self::SystemData) {
        // 1. get a mutable slice of the component storage
        let slice = some_storage.unprotected_storage_mut().as_mut_slice();

        // 2. process the slice with OpenCL or whatever

        // 3. there is no step 3
    }
}
```

Components can be processed using an OpenCL kernel:

```c
struct Something {
    float foo;
    float bar;
    float baz;
};

__kernel void process_something(__global struct Something* slice) {
    __global struct Something* something = &slice[get_global_id(0)];
    // do stuff with this data
}
```

There's a catch when using slices from `DefaultVecStorage<T>` or `VecStorage<T>`. The slices are sparse, which means the kernel will either uselessly process `T::default()` or perform unchecked access into a `MaybeUninit<T>` for any entity which does not have component `T`.

Exposing a slice from `hibitset::BitSet` would allow an OpenCL kernel check the storage mask itself:

```c
struct Something {
    float foo;
    float bar;
    float baz;
};

#define BITS_PER_SIZE_T (sizeof(size_t) * 8)

__kernel void process_something(__global size_t* bitset_layer0, __global struct Something* slice) {
    // check that this index contains data
    size_t mask_index = get_global_id(0) / BITS_PER_SIZE_T;
    size_t shift = get_global_id(0) % BITS_PER_SIZE_T;
    if ((bitset_layer0[mask_index] >> shift) & 1 == 0) {
        return;
    }
  
    __global struct Something* something = &slice[get_global_id(0)];
    // do stuff with this data
}
```

This PR adds `layer{0,1,2}_as_slice(&self) -> &[usize]` to `BitSet`, as well as trivial constants which make correct usage more obvious. Layer 3 is specified to be a single `usize` so I didn't see any benefit to exposing it as a slice.

Co-authored-by: Will Glynn <will@willglynn.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants