New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CIR][CIRGen][LLVMLowering] Vtable support for simple multiple inhertance without thunk #569
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need some further discussions on the typing of vtables. CIRGen currently follows the original clang CodeGen scheme which represents vtables as arrays of pointers. Although most elements in a vtable are indeed pointers, elements such as offset-to-top are not. Representing these elements as pointers requires ugly hack as shown in this commit (i.e. #cir.ptr<-8>
).
let description = [{ | ||
A pointer attribute is a literal attribute that represents an integral | ||
value of a pointer type. | ||
}]; | ||
let builders = [ | ||
AttrBuilderWithInferredContext<(ins "Type":$type, "uint64_t":$value), [{ | ||
AttrBuilderWithInferredContext<(ins "Type":$type, "int64_t":$value), [{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change can be strange. A "negative" pointer value does not make any sense in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand ConstPtrAttr
to be equivalent to llvm.inttoptr
. #cir.ptr<-8>
can be translated as ptr inttoptr (i64 -8 to ptr)
. Or should a new attribute be added to represent relative offset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand ConstPtrAttr to be equivalent to
llvm.inttoptr
Yes, they are the same on the code they eventually generate. But we should focus more on the abstraction they provide, especially when designing an IR that contains high-level semantics. #cir.const_ptr
actually models reinterpret_cast
in C++ that casts integers to pointers. It just does not make sense to cast a negative integer to a pointer, thus the argument to #cir.const_ptr
should be an unsigned integer.
As you pointed out in the comment, the offset-to-top element in a vtable is an integer of type ptrdiff_t
under Itanium C++ ABI. Thus a better design would be using #cir.int<-8> : !T
to represent the offset-to-top element in a vtable. where !T
is the CIR integer type for ptrdiff_t
. But this is impossible now because we use a pointer array to represent a vtable. This is exactly what we need to further discuss with Bruno and other CIR folks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just does not make sense to cast a negative integer to a pointer, thus the argument to #cir.const_ptr should be an unsigned integer.
Intuitivelly it doesn't make much sense, but in practice we find code like that all around, simple example: https://godbolt.org/z/aMdxfvrf3. This is how CIR handles it: https://godbolt.org/z/55fx96z1Y. Note that if we #cir.ptr
was a bit more flexible here we wouldn't need the unary op. There are many ways to represent this, for example, it's possible to keep encoding those as unsigned and make them signed when materializing as part of some use. However, I rather we use a more obvious representation.
In general, though not related to vtables, This could be used for things like tagging pointers with some sort of non-null sentinel, etc. The way to thing about this is that the constant initialization isn't related to the pointee. We should be able to write #cir.ptr<-1 : i64> : !cir.ptr<whatever>
As you pointed out in the comment, the offset-to-top element in a vtable is an integer of type ptrdiff_t under Itanium C++ ABI. Thus a better design would be using #cir.int<-8> : !T to represent the offset-to-top element in a vtable. where !T is the CIR integer type for ptrdiff_t. But this is impossible now because we use a pointer array to represent a vtable. This is exactly what we need to further discuss with Bruno and other CIR folks.
Since the offset-to-top is always a constant, there are a couple of different approaches to consider:
- Teach
#cir.ptr
to accept a MLIR integer instead of a plainuint64_t
. We should do that regardless of this PR. - Change the vtable representation from constant array of pointers to a constant struct of pointers. I need to refresh my itanium-abi-fu but if we need more entries to keep offsets, we might a way for the operation to track them from each other?
- Raise the vtable operation a bit more such that we keep offset info as a MLIR integer in some other param outside
vtable_data
. This allows us to write some nice query like attr.getOffsetFromTop or somethimg. I was expecting the multiple inheritance to force us into improving the design, this is probably one way to go about this.
I'm more inclined towards (3), but would love to see improvements in (1) at some point in case someone wants to tackle it. For (3), we need to change that in another PR, and then use it here once that lands.
Any extra thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intuitivelly it doesn't make much sense, but in practice we find code like that all around, simple example: https://godbolt.org/z/aMdxfvrf3. This is how CIR handles it: https://godbolt.org/z/55fx96z1Y. Note that if we #cir.ptr was a bit more flexible here we wouldn't need the unary op.
OK I can buy this.
I'm more inclined towards (3), ... Any extra thoughts?
The problem of (3) is that a single vtable may contain multiple offset-to-top elements in the presence of multiple inheritance. For example:
struct Left { virtual ~Left(); };
struct Right { virtual ~Right(); };
struct Derived : Left, Right {};
@vtable for Derived = linkonce_odr dso_local unnamed_addr constant { [4 x ptr], [4 x ptr] } {
[4 x ptr] [
ptr null, ;; Offset-to-top from Left to Derived
ptr @typeinfo for Derived,
ptr @Derived::~Derived(),
ptr @Derived::~Derived()
],
[4 x ptr] [
ptr inttoptr (i64 -8 to ptr), ;; Offset-to-top from Right to Derived
ptr @typeinfo for Derived,
ptr @non-virtual thunk to Derived::~Derived(),
ptr @non-virtual thunk to Derived::~Derived()
]
}, comdat, align 8
Thus if offset-to-top values are kept separate from vtable_data
we may need a way to track which base class each of them corresponds to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thus if offset-to-top values are kept separate from vtable_data we may need a way to track which base class each of them corresponds to.
As long as offset-to-top is a list of the same size as vtable_data, all should be good, shouldn't it?
But more that that, we could track multiple lanes just as easy with some other approaches. The vtable right now is still a llvm friendly layout, but it could be anything that makes more sense / easier to use - as long as it could keep lowering to the same LLVM code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this PR attempting to modify #cir.ptr to accept mlir::integer instead of plain uint64_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Laity000, but as previously mentioned, changing #cir.ptr should be another PR on it own, can you please address it outside of this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, though not related to vtables, This could be used for things like tagging pointers with some sort of non-null sentinel, etc. The way to thing about this is that the constant initialization isn't related to the pointee. We should be able to write
#cir.ptr<-1 : i64> : !cir.ptr<whatever>
Sorry for the delay of a few days. I've created a new PR #598 . Is it possible to directly use the int64_t type as the value of the #cir.ptr
? If using MLIR integer types like #cir.ptr<-1 : i64>
, I understand we should consider the address size based on XX ABI to determine the type, include #cir.ptr<null>
. Can we just not worry about the address size when we create it, and instead get it from the DataLayout
when we use the attribute? @bcardosolopes
auto llvmSrcVal = adaptor.getOperands().front(); | ||
auto llvmDstTy = | ||
getTypeConverter()->convertType(op.getBaseAddr().getType()); | ||
rewriter.replaceOpWithNewOp<mlir::LLVM::BitcastOp>(op, llvmDstTy, | ||
llvmSrcVal); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cir.base_class_addr
is not simply a bitcast. It is expected to perform some pointer arithmetic to convert an object pointer to a base subobject pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip. So, like you mentioned, BaseClassAddrOp
needs to be more abstract in its syntax. For instance, we should also include the baseoffset values for casting to base classes in this attribute. And instead of generating them directly in CIRGen, they should all be handled together during the lowering phase. That's what I've got in mind. Any other situations?
I removed this in the PR, and perhaps it can be implemented separately later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other situations?
At least 2 approaches exist that provides necessary information to cir.base_class_addr
and allows proper lowering of it:
- Attach
cir.base_class_addr
with a new attribute, e.g.#cir.base_class_addr_info
, that contains the base-to-derived offsets. During lowering prepare, the offsets are extracted from the attribute and used for lowering the operation. - Attach
cir.base_class_addr
with the AST node corresponds to the derived-to-base cast expression. During lowering prepare, execute any necessary AST queries to find the offset for lowering the operation.
An example of the first approach can be found at the implementation of the cir.dyn_cast
operation. The operation carries a #cir.dyn_cast_info
attribute that contains all necessary ABI-specific information for lowering the operation during lowering prepare. While the 2nd approach allows a clearer assembly syntax.
As a side note, cir.base_class_addr
should also handle derived-to-virtual-base casts whose offsets cannot be known statically. Thus we need at least 2 offsets to do the conversion job: 1) the address point in the vtable that corresponds to the vbase offset entry; 2) a fixed offset from the derived object (or vbase subobject) to the target base object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lancern very good explanation!
@Laity000 I suggest you start simple: only handle static offsets for now and assert when that's not the case. Please let's build this incrementally. Both (1) and (2) seems fine to me, but I'm inclined for (1) given that we already raise a bunch of vtable information and we have a use for that in mind in the future.
Thank you for your review. I've reviewed the Itanium C++ ABI document:
So, I'm not sure whether to use ptrdiff_t, ptr, or a new attr for the type of vtable? This might require a further discussion. @Lancern @bcardosolopes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see your first PR here, thanks! Comments inline
let description = [{ | ||
A pointer attribute is a literal attribute that represents an integral | ||
value of a pointer type. | ||
}]; | ||
let builders = [ | ||
AttrBuilderWithInferredContext<(ins "Type":$type, "uint64_t":$value), [{ | ||
AttrBuilderWithInferredContext<(ins "Type":$type, "int64_t":$value), [{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just does not make sense to cast a negative integer to a pointer, thus the argument to #cir.const_ptr should be an unsigned integer.
Intuitivelly it doesn't make much sense, but in practice we find code like that all around, simple example: https://godbolt.org/z/aMdxfvrf3. This is how CIR handles it: https://godbolt.org/z/55fx96z1Y. Note that if we #cir.ptr
was a bit more flexible here we wouldn't need the unary op. There are many ways to represent this, for example, it's possible to keep encoding those as unsigned and make them signed when materializing as part of some use. However, I rather we use a more obvious representation.
In general, though not related to vtables, This could be used for things like tagging pointers with some sort of non-null sentinel, etc. The way to thing about this is that the constant initialization isn't related to the pointee. We should be able to write #cir.ptr<-1 : i64> : !cir.ptr<whatever>
As you pointed out in the comment, the offset-to-top element in a vtable is an integer of type ptrdiff_t under Itanium C++ ABI. Thus a better design would be using #cir.int<-8> : !T to represent the offset-to-top element in a vtable. where !T is the CIR integer type for ptrdiff_t. But this is impossible now because we use a pointer array to represent a vtable. This is exactly what we need to further discuss with Bruno and other CIR folks.
Since the offset-to-top is always a constant, there are a couple of different approaches to consider:
- Teach
#cir.ptr
to accept a MLIR integer instead of a plainuint64_t
. We should do that regardless of this PR. - Change the vtable representation from constant array of pointers to a constant struct of pointers. I need to refresh my itanium-abi-fu but if we need more entries to keep offsets, we might a way for the operation to track them from each other?
- Raise the vtable operation a bit more such that we keep offset info as a MLIR integer in some other param outside
vtable_data
. This allows us to write some nice query like attr.getOffsetFromTop or somethimg. I was expecting the multiple inheritance to force us into improving the design, this is probably one way to go about this.
I'm more inclined towards (3), but would love to see improvements in (1) at some point in case someone wants to tackle it. For (3), we need to change that in another PR, and then use it here once that lands.
Any extra thoughts?
@lanza rebased over the weekend, sorry for the churn. Can you rebase this PR again? |
1717333
to
9a5d0e5
Compare
…tance without thunk This PR adds Vtable support for C++ multiple inheritance without thunk: 1. The `VTableAttr` supports multiple `ArrayAttr`, and the `VTableAddrPointOpLowering` supports multi-vtable lowering 2. To support the `offset-to-top` members of multi-vtable, the current implementation is to make `ConstPtrAttr`accepts a MLIR integer instead of a plain uint64_t. More further discussions by [this comment](llvm#569 (comment)) Example: ```mlir cir.global linkonce_odr @_ZTV5Child = #cir.vtable< {#cir.const_array<[ #cir.ptr<null> : #!cir.ptr<!u8i>, #cir.global_view<@_ZTI5Child> : !cir.ptr<!u8i>, #cir.global_view<@_ZN5Child9MotherFooEv> : !cir.ptr<!u8i>, #cir.global_view<@_ZN6Mother10MotherFoo2Ev> : !cir.ptr<!u8i>]> : !cir.array<!cir.ptr<!u8i> x 4>, #cir.const_array<[ #cir.ptr<-8 : i64> : !cir.ptr<!u8i>, #cir.global_view<@_ZTI5Child> : !cir.ptr<!u8i>, #cir.global_view<@_ZN6Father9FatherFooEv> : !cir.ptr<!u8i>] > : !cir.array<!cir.ptr<!u8i> x 3>}> : !ty_anon_struct3 ```
…tance without thunk This PR adds Vtable support for C++ multiple inheritance without thunk: 1. The `VTableAttr` supports multiple `ArrayAttr`, and the `VTableAddrPointOpLowering` supports multi-vtable lowering 2. To support the `offset-to-top` members of multi-vtable, the current implementation is to make `ConstPtrAttr`accepts a MLIR integer instead of a plain uint64_t. More further discussions by [this comment](llvm#569 (comment)) Example: ```mlir cir.global linkonce_odr @_ZTV5Child = #cir.vtable< {#cir.const_array<[ #cir.ptr<null> : #!cir.ptr<!u8i>, #cir.global_view<@_ZTI5Child> : !cir.ptr<!u8i>, #cir.global_view<@_ZN5Child9MotherFooEv> : !cir.ptr<!u8i>, #cir.global_view<@_ZN6Mother10MotherFoo2Ev> : !cir.ptr<!u8i>]> : !cir.array<!cir.ptr<!u8i> x 4>, #cir.const_array<[ #cir.ptr<-8 : i64> : !cir.ptr<!u8i>, #cir.global_view<@_ZTI5Child> : !cir.ptr<!u8i>, #cir.global_view<@_ZN6Father9FatherFooEv> : !cir.ptr<!u8i>] > : !cir.array<!cir.ptr<!u8i> x 3>}> : !ty_anon_struct3 ```
@@ -248,18 +248,26 @@ def FPAttr : CIR_Attr<"FP", "fp", [TypedAttrInterface]> { | |||
|
|||
def ConstPtrAttr : CIR_Attr<"ConstPtr", "ptr", [TypedAttrInterface]> { | |||
let summary = "Holds a constant pointer value"; | |||
let parameters = (ins AttributeSelfTypeParameter<"">:$type, "uint64_t":$value); | |||
let parameters = (ins AttributeSelfTypeParameter<"">:$type, "mlir::IntegerAttr":$value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it's the same change I suggested in the other PR, please incorporate this bit over there instead.
delete c; | ||
|
||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this build all the way to LLVM? Would be nice to have LLVM tests right here as well (see other test for examples) and already include the LLVM lowering support needed (if any is necessary), so we can make sure that we are matching what's expected out from C++ source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks almost good! Thanks
…tance without thunk This PR adds Vtable support for C++ multiple inheritance without thunk: 1. The `VTableAttr` supports multiple `ArrayAttr`. 2. The `VTableAddrPointOpLowering` supports multi-vtable lowering Example: ```mlir cir.global linkonce_odr @_ZTV5Child = #cir.vtable< {#cir.const_array<[ #cir.ptr<null> : #!cir.ptr<!u8i>, #cir.global_view<@_ZTI5Child> : !cir.ptr<!u8i>, #cir.global_view<@_ZN5Child9MotherFooEv> : !cir.ptr<!u8i>, #cir.global_view<@_ZN6Mother10MotherFoo2Ev> : !cir.ptr<!u8i>]> : !cir.array<!cir.ptr<!u8i> x 4>, #cir.const_array<[ #cir.ptr<-8 : i64> : !cir.ptr<!u8i>, #cir.global_view<@_ZTI5Child> : !cir.ptr<!u8i>, #cir.global_view<@_ZN6Father9FatherFooEv> : !cir.ptr<!u8i>] > : !cir.array<!cir.ptr<!u8i> x 3>}> : !ty_anon_struct3 ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for applying all the reviews. LGTM (but see minor comment)
// CIR: } | ||
|
||
// LLVM-DAG: define linkonce_odr void @_ZN6MotherC2Ev(ptr %0) | ||
// LLVM-DAG: store ptr getelementptr inbounds ({ [4 x ptr] }, ptr @_ZTV6Mother, i32 0, i32 0, i32 2), ptr %{{[0-9]+}}, align 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like in OG codegen we get store ptr getelementptr inbounds inrange(-16, 16) ({ [4 x ptr] }, ptr @_ZTV6Mother, i32 0, i32 0, i32 2), ptr %3, align 8
. After this lands, can you work on adding inrange(-16, 16)
support in another PR?
This PR adds Vtable support for C++ multiple inheritance without thunk. This change contains the CIR codegen and lowering work:
VTableAttr
should allow adding multipleArrayAttr
for multi-inheritance.VTableAddrPointOpLowering
has been fixed for the multi-vtable during the MLIR lowering phase.Example: