Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CIRGen] Support for dereferencing void pointers #595

Merged
merged 5 commits into from May 18, 2024

Conversation

PragmaTwice
Copy link
Contributor

@PragmaTwice PragmaTwice commented May 9, 2024

In this PR, we support for dereferencing void pointers as a GNU C extension. This include two modification:

  • In CIRGen, we support to build ReturnStmt with void return type.
  • In LowerToLLVM, we support to lower CIR load with void result type to LLVM.

It's a part of #579, since I would like to split it to two tasks:

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this, few comments inline

clang/lib/CIR/CodeGen/CIRGenStmt.cpp Outdated Show resolved Hide resolved
@@ -864,6 +864,13 @@ class CIRLoadLowering : public mlir::OpConversionPattern<mlir::cir::LoadOp> {
mlir::LogicalResult
matchAndRewrite(mlir::cir::LoadOp op, OpAdaptor adaptor,
mlir::ConversionPatternRewriter &rewriter) const override {
// if the result type is void, we can just remove the load operation since
// it's useless
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks too specific, what are you really trying to avoid here? The expression should be evaluated, so we should still see some behavior. Is LLVM not generating the load or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, seems load void in LLVM IR will lead to a module verification error like below:

loc("../../clang/test/CIR/CodeGen/pointer-arith-ext.c":73:36): error: 'llvm.load' op result #0 must be LLVM type with size, but got '!llvm.void'
fatal error: error in backend: The pass manager failed to lower CIR to LLVMIR dialect!

So I think maybe we can just remove the load instruction since it's not valid in LLVM IR and also seems no side effect.
But feel free to give any suggestion since I'm not familiar with CIR now : )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the expression needs to be evaluated, we need to execute the load, even though we might not use the result. I suggest you:

  • Bitcast the load address from void* to i8*
  • Build the LLVM load op returning an i8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your explanation! I will fix this part.

But just for confirmation, it seems clang didn't translate the void pointer dereferencing to any load instruction in LLVM.
I've made an example on godbolt here: https://godbolt.org/z/8hjPb5jYE

  %add.ptr = getelementptr inbounds i8, ptr %0, i64 %1
  ret void

For return *(a + n), the clang dumps just GEP followed by ret void.
I wonder if we follow such behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example might be misleading: It's possible CIRGen is creating a return slot, loading from it but not returning anything, which creates a load that doesn't exist in OG codegen. Try crafting an example where the result actually returns something, might provide more useful insights.

Anyways, if there's a load that returns void, we should make that up as an i8 thing. If we are generating worse code than LLVM, that's another matter likely not related with what you are fixing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your valuable suggestions! @bcardosolopes

I've spend some time on this issue, and found something interesting on both clang CG and CIRGen code.

Seems in clang codegen, the value of void pointer deference is ignored here:

Value *VisitUnaryDeref(const UnaryOperator *E) {
if (E->getType()->isVoidType())
return Visit(E->getSubExpr()); // the actual value should be unused
return EmitLoadOfLValue(E);
}

And also in CIRGen, the logic is same (and because of it we don't generate cir.load !void on void pointer deference):

mlir::Value VisitUnaryDeref(const UnaryOperator *E) {
if (E->getType()->isVoidType())
return Visit(E->getSubExpr()); // the actual value should be unused
return buildLoadOfLValue(E);
}

So I think maybe we can just follow the behavior of clang? (or, in CIR we can emit a cir.load !void to keep such information, but in LLVM IR we can discard it?)

Also as your suggestion, I try to build an example that utilizes the result of void pointer deference, but the only valid (which can pass the clang sema checker) example I can figure out is something like return &*void_ptr. And I've tested it and seems it works well even we don't generate LLVM IR load in matchAndRewrite(mlir::cir::LoadOp op, ...). (something like return *void_ptr, 1 that don't use the deref result but does return a value seems also work well)

Kindly ask if you have any idea or suggestion on this : )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think maybe we can just follow the behavior of clang?

Yep. We don't need to change anything in CIRGen.

I try to build an example that utilizes the result of void pointer deference

Right, it's not really a thing. I was expecting that you'd do some ptr casting before just to see if in those cases we'd also get the extra load.

Kindly ask if you have any idea or suggestion on this

We need to find out why that load doesn't show up in OG codegen, if the Load is being folded by CreatedLoad when it finds out it's loading from a void ptr, that's something we could also do in our createLoad, this would prevent us from adding a workaround in LowerToLLVM. Two ideas:

  • Run OG codegen under the debugger and find out what happends with CreateLoad, or in general why this loading is not showing up.
  • For the sake of experiment, do bitcasts in LowerToLLVM and compare the output with OG LLVM to see what's different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patient guidance!

I think I've found the solution and avoid to change any code in LowerToLLVM.

I added a i8 cast for void pointer load in buildLoadOfScalar so that it can handle this case well (as clang codegen).

Could you review it again?

@bcardosolopes bcardosolopes changed the title [CIR][CIRGen&LowerToLLVM] Support for dereferencing void pointers [CIR][CIRGen][LowerToLLVM] Support for dereferencing void pointers May 9, 2024
@PragmaTwice PragmaTwice changed the title [CIR][CIRGen][LowerToLLVM] Support for dereferencing void pointers [CIR][CIRGen] Support for dereferencing void pointers May 17, 2024
Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! LGTM, thanks!

@bcardosolopes bcardosolopes merged commit 582df2a into llvm:main May 18, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants