Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy constructor within spawn not outlined #91

Open
ehein6 opened this issue Jul 23, 2019 · 4 comments
Open

Copy constructor within spawn not outlined #91

ehein6 opened this issue Jul 23, 2019 · 4 comments

Comments

@ehein6
Copy link

ehein6 commented Jul 23, 2019

When a cilk_spawn expression is outlined into a function, all sub-expressions are moved into the outlined function. For example, in cilk_spawn foo(x + y), the values of x and y are passed as arguments to the outlined function, which adds them together before calling foo. Nested function calls are also handled in this way.

Now consider the case where one of the arguments to the spawned function is an object, passed by value. An implicit call to the object's copy-constructor occurs here. I would have expected to see the call to the copy-constructor occur within the outlined function, but in Tapir it happens before the call to the outlined function. Is this intentional? Why the inconsistent behavior?

Example:

volatile long * marker;
#include <cilk/cilk.h>
#define noinline __attribute__((noinline))
struct object {
    long _value;
    noinline object(long value) : _value(value) {
        *marker = 0xC;
    }
    noinline object(const object& other) {
        *marker = 0xCC;
    }
};
noinline long add(long x, long y) { return x + y;}
noinline void child(long a, long b, object c)
{
    *marker += a + b + c._value;
}
noinline long parent(long x) 
{
    object obj(0xEE);
    for (long i = 0; i < 100; ++i){
        // *** LOOK AT THIS SPAWN ***
        // Note that obj is passed by value
        cilk_spawn child(x+i, add(x, i), obj);
        obj._value++;
    }
    cilk_sync;
    return *marker;
}

Compiled with -g0 -fcilkplus -std=c++14 -O1 -emit-llvm -mllvm -debug-abi-calls. Here's the demangled IR for the outlined function. Notice the evaluation of x + i and the call to add have been outlined as expected, but the copy constructor is not here.

define internal fastcc void @_Z6parentl_det.achd.cilk(i64 %x.cilk, i64 %i.07.cilk, %struct.object* nocapture readonly align 8 %agg.tmp.cilk) unnamed_addr #2 {
  %__cilkrts_sf = alloca %struct.__cilkrts_stack_frame, align 8
  call fastcc void @__cilkrts_enter_frame_fast_1(%struct.__cilkrts_stack_frame* nonnull %__cilkrts_sf)
  call fastcc void @__cilkrts_detach(%struct.__cilkrts_stack_frame* nonnull %__cilkrts_sf)
  %call.cilk = call i64 @add(long, long)(i64 %x.cilk, i64 %i.07.cilk)
  %add.cilk = add nsw i64 %i.07.cilk, %x.cilk
  call void @child(long, long, object)(i64 %add.cilk, i64 %call.cilk, %struct.object* nonnull %agg.tmp.cilk)
  call fastcc void @__cilk_parent_epilogue(%struct.__cilkrts_stack_frame* nonnull %__cilkrts_sf)
  ret void
}

Here's the call to the copy constructor at the call site:

for.body:                                         ; preds = %det.cont, %entry
  %i.07 = phi i64 [ 0, %entry ], [ %inc1, %det.cont ]
  call void @object::object(object const&)(%struct.object* nonnull %agg.tmp, %struct.object* nonnull dereferenceable(8) %obj)
  call void asm sideeffect "stmxcsr $0\0A\09fnstcw $1", "*m,*m,~{dirflag},~{fpsr},~{flags}"(i32* %1, i16* %2) #7          %9 = call i8* @llvm.frameaddress(i32 0)
  store volatile i8* %9, i8** %4, align 8
  %10 = call i8* @llvm.stacksave()
  store volatile i8* %10, i8** %5, align 8
  %11 = call i32 @llvm.eh.sjlj.setjmp(i8* %6) #9
  %12 = icmp eq i32 %11, 0
  br i1 %12, label %for.body.split, label %det.cont

for.body.split:                                   ; preds = %for.body
  call fastcc void @_Z6parentl_det.achd.cilk(i64 %x, i64 %i.07, %struct.object* nonnull %agg.tmp)
  br label %det.cont
@neboat
Copy link
Collaborator

neboat commented Jul 24, 2019

Thanks for pointing out this issue.

I was in the midst of crafting a long response about why Cilk evaluates all function arguments before the spawn and why your example contains a subtlety — compiler optimizations at -O1 — that makes the behavior appear inconsistent. But thinking about it further, I think your example gets at something deeper concerning object construction and destruction with Cilk and Tapir.

Right now is a particularly busy time for me, but I'll try to get back to you soon.

@ehein6
Copy link
Author

ehein6 commented Jul 24, 2019

Thanks for the quick response, I appreciate any time you can devote to this. Some more thoughts:

The key question is whether child is guaranteed to have a private copy of obj after the detach. I'd argue that it should, to preserve the semantics of pass-by-value during a spawn.

Here's my mental model of what should happen:

  1. Parent calls the outlined function det.achd
  2. det.achd gets ready to call child, then does a detach.
  3. After this point, other workers are free to steal parent's stack frame and continue. This is safe because any arguments passed to child live on det.achd's stack, having been initialized in step 2.

From the IR, it looks like all the detached children will share a single copy of obj created on the stack of parent. But I must be missing something, because on x86 I never observe a race condition bug here.

I've verified that the behavior for this example is the same at -O3.

@neboat
Copy link
Collaborator

neboat commented Aug 10, 2019

(Finally getting back to this question after a major deadline on my end, and before my next major deadline in a week.)

I agree with your mental model of what should happen in this case. I'm working through a fix, but it might take a while to implement and test.

@ehein6
Copy link
Author

ehein6 commented Oct 8, 2019

Here's an interesting wrinkle: If I give object a user-defined destructor, then suddenly Tapir puts the alloca, constructor call, and destructor call in the outlined function like I expect.

struct object {
    long _value;
    noinline object(long value) : _value(value) {
        *marker = 0xC;
    }
    noinline object(const object& other) {
        *marker = 0xCC;
    }
    noinline ~object() {
        *marker = 0xD;
    }
};
// ...
// The rest is unchanged
; Function Attrs: noinline nounwind uwtable
define internal fastcc void @_Z6parentl_invoke.cont.cilk(i64 %x.cilk, i64 %i.015.cilk, %struct.object* align 8 %obj.cilk) unnamed_addr #2 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
for.body.cilk:
  %__cilkrts_sf = alloca %struct.__cilkrts_stack_frame, align 8
  call fastcc void @__cilkrts_enter_frame_fast_1(%struct.__cilkrts_stack_frame* nonnull %__cilkrts_sf)
  call fastcc void @__cilkrts_detach(%struct.__cilkrts_stack_frame* nonnull %__cilkrts_sf)
  %agg.tmp.cilk = alloca %struct.object, align 8
  %0 = bitcast %struct.object* %agg.tmp.cilk to i8*
  call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0)
  %call.cilk = call i64 @add(long, long)(i64 %x.cilk, i64 %i.015.cilk)
  %add.cilk = add nsw i64 %i.015.cilk, %x.cilk
  call void @object::object(object const&)(%struct.object* nonnull %agg.tmp.cilk, %struct.object* nonnull dereferenceable(8) %obj.cilk)
  call void @child(long, long, object)(i64 %add.cilk, i64 %call.cilk, %struct.object* nonnull %agg.tmp.cilk)
  call void @object::~object()(%struct.object* nonnull %agg.tmp.cilk) #7
  call fastcc void @__cilk_parent_epilogue(%struct.__cilkrts_stack_frame* nonnull %__cilkrts_sf)
  ret void
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants