Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translate MPI thread support levels into Rust Send and Sync #12

Open
bsteinb opened this issue Oct 1, 2016 · 5 comments
Open

Translate MPI thread support levels into Rust Send and Sync #12

bsteinb opened this issue Oct 1, 2016 · 5 comments
Assignees

Comments

@bsteinb
Copy link
Collaborator

bsteinb commented Oct 1, 2016

MPI defines several levels of thread support that indicate "how thread safe" the currently initialized MPI environment is. This level is only known at run-time and could be changed by re-initializing MPI (?).

In Rust, the thread safety of an API is defined by having its types implement the Send and Sync traits, which is a compile-time contract.

These traits are automatically derived by the compiler which currently leads to wrong results for rsmpi, e.g.:

  • when building against MPICH which uses integers for its handle types, the Rust compiler always deem them safe to Send and Sync,
  • when building against OpenMPI which uses pointers for its handle types, the Rust compiler will always deem them not safe to Send and Sync.

Plan: Add markers to rsmpi types so they can have variants that statically are or are not thread safe and have sources such as world() or equivalent_type() return enum values that return one of those variants depending on the current level of thread support of the environment.

@bsteinb bsteinb self-assigned this Oct 1, 2016
@Rufflewind
Copy link
Contributor

Could perhaps add dummy types to represent thread-safety guarantees:

pub mod thread {

// !Send + !Sync
pub struct Funneled(PhantomData<*const ()>);

// Send + !Sync
pub struct Serialized(PhantomData<Cell<()>>);

// Send + Sync
pub struct Multiple(());

}

// marker trait that promises Self has correct Send/Sync semantics
pub unsafe trait Thread {
    fn raw_value() -> libc::c_int;
}

unsafe impl Thread for thread::Funneled { /* ... */ }
unsafe impl Thread for thread::Serialized { /* ... */ }
unsafe impl Thread for thread::Multiple { /* ... */ }

The rest of the API would then parametrize over these dummy types, e.g.

pub struct Universe<M> {
    buffer: Option<Vec<u8>>,
    thread: Phantom<M>,
}

pub enum InitializeError {
    AlreadyInitialized,
    InsufficientThreadSupport,
}

pub fn initialize<M: Thread>() -> Result<Universe<M>, InitializeError> {
    if is_initialized() {
        Err(AlreadyInitialized)
    } else {
        let mut provided: c_int = unsafe { mem::uninitialized() };
        let required = M::raw_value();
        unsafe {
            ffi::MPI_Init_thread(ptr::null_mut(),
                                 ptr::null_mut(),
                                 required,
                                 &mut provided);
        }
        if provided >= required {
            Ok(Universe { buffer: None, thread: PhantomData })
        } else {
            unsafe {
                ffi::MPI_Finalize();
            }
            Err(InsufficientThreadSupport)
        }
    }
}

impl<M> Universe<M> {
    fn world(&self) -> SystemCommunicator<&Self> { /* ... */ }
}

@ExpHP
Copy link

ExpHP commented Aug 8, 2018

It seems apparent to me that a properly sound way to do this will require massive restructuring of the API and careful consideration of every facet:


If we want Send to have any meaning, other threads can't just freely obtain MPI types from thin air.
This means SystemCommunicator::world needs to go away (or be restricted), and there are probably others like this.

If we want Sync to have any meaning, most trivial Clone impls must be unavailable when using Serialized.

Any available methods of producing MPI types out of thin air must be incapable of producing more than one instance wrapping the same raw value, or else it circumvents the lack of Clone. (I think this is already true for initialize? But it would also impact SystemCommunicator::world once again, if we wanted to keep it around in some limited fashion)

Utilizing the full potential of Threading::Serialized is tricky! @Rufflewind's Send + !Sync sounds reasonably correct (assuming we remove all Clone impls), but it's also pretty conservative, because once you send something to another thread, it's gone. Perhaps many of the traits could have an impl for &mut T where T: Trait, exploiting the borrow checker to validate a larger subset of possible usage (such as scoped threads).

  • On further reflection: A blanket impl on &mut T sounds weird when all the methods are &self (but a blanket impl on &T is incorrect for Threading::Serialized!). Maybe most methods could take &mut self, and then trait impls on &T could be made available to modes other than Serialized. But that creates another huge area in design space to explore.

@AndrewGaspar
Copy link
Contributor

I've been thinking about this a bit.

I think, without a doubt, addressing this strictly for Funneled and Multiple requires plumbing borrowing of the "universe" through to each handle type that can invoke MPI.

However, it seems to me like the only way to expose Serialized in a powerful/useful way to the Rust borrow checker is to break apart the "handle" types from their ownership of the "universe". This would, of course, then require that a borrow of the "universe" handle is passed in to any function that calls MPI. This would drastically change the API, which makes me skeptical of it, but perhaps there's a way to do this that only imposes the burden on Serialized users?

Additionally, I think we should provide an optional streamlined approach to a couple things.

First, it would be useful to expose a dynamically checked "Universe" implementation. i.e. a universe that appears to Send + Sync at compile time, but is runtime-checked to ensure the user doesn't send a Funneled universe to a different thread, or try to invoke a Serialized universe from two threads concurrently. This way a code can choose its desired MPI threading model at runtime without requiring a recompile.

Secondly, and perhaps could be folded into the prior feature, it would be useful if we could give users the ability to instantiate an optional, global, library-managed universe. This would make it possible to assume a 'static lifetime for the universe. The user would have to explicitly call an initialize and finalize routines. We could also offer something like an equivalent of #[tokio::main] as an alternative. Of course, having a 'static dynamically managed universe would require guards in every routine that invokes MPI to check if the universe is still alive.

@AndrewGaspar
Copy link
Contributor

Possibly could do something like this only for "handle objects" returned by a Serialized universe:

let universe = tokio::sync::Mutex::new(mpi::initialize_with_threading::<Serialized>()?);
// borrow the universe mutex for passing to tokio tasks
let universe = &universe;

// get request objects some way
let requests: Vec<mpi::serialized::Request> = ...;

// using tokio as an example - imagine async/await is supported. tokio::spawn requires a `Send` future.
// This works because mpi::serialized::Request is Send, even though universe is !Send. It no
// longer carries an implied access to the Universe.
requests.into_iter().map(|r| tokio::spawn(async move {
    // attach the universe to the request, which get us the normal Request we expect
    let universe = universe.lock().await;
    let r = r.with_universe(universe);

    // wait for the request to complete
    r.await;

    // universe released and another task can acquire it now
});

I don't love this because it causes a divergence between the Serialized universes and other universes. But I do like that it allows the Funneled and Multiple universes to have a simpler usage pattern.

Any thoughts?

@adamreichold
Copy link
Contributor

Any thoughts?

I think the type-state approach to Universe is definitly nice and Send+Sync could be modeled on top of that. I do also think that passing e.g. &universe into equivalent_datatype and similar API is worth it as a price for compile-time checked thread safety.

Concerning the Serialized threading level, I would suggest turning this into an internal runtime check: This does add some overhead to the bindings, but it is probably the same overhead that a MPI implementation supporting Multiple will have internally anyway. Via the type-state of Universe this runtime check could be a no-op for Funneled and Multiple where the compile-time guarantees via Send and Sync on Universe are sufficient. I guess, this would basically mean building the above lock into our Universe<Serialized> implementation.

(It should be possible to reduce the overhead as we do not need full locking, we just need to detect concurrent access and panic, we do not need to enforce it via synchronisation. All "dependent" handle types could also access that shared state as global data since there can only be one universe per process. This would however still mean that their type needs to be parameterised over the threading level AFAIU.)

One other wrinkle I see is the Single threading level: Can we really allow this for a safe init function as we have no control over threading? I think we would either need to make initialisation using the Single level unsafe as a way for the user to assert the necessary threading guarantee or outright not support Single at all.

Finally, the API should probably encourage making an explicit choice here, probably suggesting Funneled if possible or Multiple if required and supported and Serialised only if the implementation does not have the necessary synchronisation built in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants