Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo][LoopLoadElim] Fix missing debug location updates #91839

Merged
merged 3 commits into from May 17, 2024

Conversation

Apochens
Copy link
Contributor

Fix #91836 .

For LoopLoadElim-L447 (Initial), its debug location should be dropped, because it is inserted in the loop's preheater, which is out of the loop. Since Initial has an empty debug location when being created, I just add a comment to explicitly point out this case and do nothing with its debug location.

All the three updates are checked in the new test.

@llvmbot
Copy link
Collaborator

llvmbot commented May 11, 2024

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-transforms

Author: Shan Huang (Apochens)

Changes

Fix #91836 .

For LoopLoadElim-L447 (Initial), its debug location should be dropped, because it is inserted in the loop's preheater, which is out of the loop. Since Initial has an empty debug location when being created, I just add a comment to explicitly point out this case and do nothing with its debug location.

All the three updates are checked in the new test.


Full diff: https://github.com/llvm/llvm-project/pull/91839.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp (+11-1)
  • (added) llvm/test/Transforms/LoopLoadElim/update_debugloc_store_forwarded.ll (+85)
diff --git a/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp b/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
index f611ef6b2fa2d..60274f01f747f 100644
--- a/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
@@ -448,6 +448,10 @@ class LoadEliminationForLoop {
         new LoadInst(Cand.Load->getType(), InitialPtr, "load_initial",
                      /* isVolatile */ false, Cand.Load->getAlign(),
                      PH->getTerminator()->getIterator());
+    // We don't give any debug location to Initial, because it is inserted
+    // into the loop's preheader. A debug location inside the loop will cause
+    // a misleading stepping when debugging. The test preserving-debugloc-store
+    // -forwarded.ll checks this.
 
     PHINode *PHI = PHINode::Create(Initial->getType(), 2, "store_forwarded");
     PHI->insertBefore(L->getHeader()->begin());
@@ -462,14 +466,20 @@ class LoadEliminationForLoop {
            "The type sizes should match!");
 
     Value *StoreValue = Cand.Store->getValueOperand();
-    if (LoadType != StoreType)
+    if (LoadType != StoreType) {
       StoreValue = CastInst::CreateBitOrPointerCast(StoreValue, LoadType,
                                                     "store_forward_cast",
                                                     Cand.Store->getIterator());
+      // Because it casts the old `load` value and is used by the new `phi`
+      // which replaces the old `load`, we give the `load`'s debug location
+      // to it.
+      cast<Instruction>(StoreValue)->setDebugLoc(Cand.Load->getDebugLoc());
+    }
 
     PHI->addIncoming(StoreValue, L->getLoopLatch());
 
     Cand.Load->replaceAllUsesWith(PHI);
+    PHI->setDebugLoc(Cand.Load->getDebugLoc());
   }
 
   /// Top-level driver for each loop: find store->load forwarding
diff --git a/llvm/test/Transforms/LoopLoadElim/update_debugloc_store_forwarded.ll b/llvm/test/Transforms/LoopLoadElim/update_debugloc_store_forwarded.ll
new file mode 100644
index 0000000000000..cf09e0571c0c6
--- /dev/null
+++ b/llvm/test/Transforms/LoopLoadElim/update_debugloc_store_forwarded.ll
@@ -0,0 +1,85 @@
+; RUN: opt -passes=loop-load-elim -S < %s | FileCheck %s
+
+; LoopLoadElimination's propagateStoredValueToLoadUsers() replaces the
+; `load` (`%a`) with an hoisted initial `load` and a `phi` that forwards
+; the stored value.
+; This test checks that the debug location is propagated to the new `phi`
+; from the original `load` it replaces in block `%for.body` and the debug
+; location drop of the hoisted `load` in block `%entry`.
+; Moreover, this test also checks the debug location update of the new
+; `bitcast` created when the `load` type is mismatched with the `store` type:
+;   store i32 ...
+;   %a = load float, ...
+; Because the `bitcast` casts the old `load` value, it has the same debug
+; location as the old `load` (ie., the same as the new `phi`).
+
+; If the store and the load use different types, but have the same
+; size then we should still be able to forward the value.
+;
+;   for (unsigned i = 0; i < 100; i++) {
+;     A[i+1] = B[i] + 2;
+;     C[i] = ((float*)A)[i] * 2;
+;   }
+
+define void @f(ptr noalias %A, ptr noalias %B, ptr noalias %C, i64 %N) !dbg !5 {
+; CHECK-LABEL: define void @f(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[LOAD_INITIAL:%.*]] = load float, ptr {{.*}}, align 4{{$}}
+; CHECK:       for.body:
+; CHECK-NEXT:    [[STORE_FORWARDED:%.*]] = phi float [ [[LOAD_INITIAL]], %entry ], [ [[STORE_FORWARD_CAST:%.*]], %for.body ], !dbg [[DBG9:![0-9]+]]
+; CHECK:         [[STORE_FORWARD_CAST]] = bitcast i32 {{.*}} to float, !dbg [[DBG9]]
+; CHECK:       [[DBG9]] = !DILocation(line: 11,
+;
+entry:
+  br label %for.body, !dbg !8
+
+for.body:                                         ; preds = %for.body, %entry
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ], !dbg !9
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !10
+  %Aidx_next = getelementptr inbounds i32, ptr %A, i64 %indvars.iv.next, !dbg !11
+  %Bidx = getelementptr inbounds i32, ptr %B, i64 %indvars.iv, !dbg !12
+  %Cidx = getelementptr inbounds i32, ptr %C, i64 %indvars.iv, !dbg !13
+  %Aidx = getelementptr inbounds i32, ptr %A, i64 %indvars.iv, !dbg !14
+  %b = load i32, ptr %Bidx, align 4, !dbg !15
+  %a_p1 = add i32 %b, 2, !dbg !16
+  store i32 %a_p1, ptr %Aidx_next, align 4, !dbg !17
+  %a = load float, ptr %Aidx, align 4, !dbg !18
+  %c = fmul float %a, 2.000000e+00, !dbg !19
+  %c.int = fptosi float %c to i32, !dbg !20
+  store i32 %c.int, ptr %Cidx, align 4, !dbg !21
+  %exitcond = icmp eq i64 %indvars.iv.next, %N, !dbg !22
+  br i1 %exitcond, label %for.end, label %for.body, !dbg !23
+
+for.end:                                          ; preds = %for.body
+  ret void, !dbg !24
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.debugify = !{!2, !3}
+!llvm.module.flags = !{!4}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
+!1 = !DIFile(filename: "type-mismatch.ll", directory: "/")
+!2 = !{i32 17}
+!3 = !{i32 0}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !1, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!6 = !DISubroutineType(types: !7)
+!7 = !{}
+!8 = !DILocation(line: 1, column: 1, scope: !5)
+!9 = !DILocation(line: 2, column: 1, scope: !5)
+!10 = !DILocation(line: 3, column: 1, scope: !5)
+!11 = !DILocation(line: 4, column: 1, scope: !5)
+!12 = !DILocation(line: 5, column: 1, scope: !5)
+!13 = !DILocation(line: 6, column: 1, scope: !5)
+!14 = !DILocation(line: 7, column: 1, scope: !5)
+!15 = !DILocation(line: 8, column: 1, scope: !5)
+!16 = !DILocation(line: 9, column: 1, scope: !5)
+!17 = !DILocation(line: 10, column: 1, scope: !5)
+!18 = !DILocation(line: 11, column: 1, scope: !5)
+!19 = !DILocation(line: 12, column: 1, scope: !5)
+!20 = !DILocation(line: 13, column: 1, scope: !5)
+!21 = !DILocation(line: 14, column: 1, scope: !5)
+!22 = !DILocation(line: 15, column: 1, scope: !5)
+!23 = !DILocation(line: 16, column: 1, scope: !5)
+!24 = !DILocation(line: 17, column: 1, scope: !5)

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch!

Test LGTM. There are just a couple of minor nits and a question about the phi location to address.


PHI->addIncoming(StoreValue, L->getLoopLatch());

Cand.Load->replaceAllUsesWith(PHI);
PHI->setDebugLoc(Cand.Load->getDebugLoc());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting case. When promoting store/loads I think we usually give phis a debugloc from the store (or if it's from stores, plural, it's a merged location). At least that's what we do in mem2reg. I think in this case using the load's DebugLoc makes intuitive sense - we're just eliminating the load here after all, but I'd be interested to hear what others think about this case too (cc @SLTozer , @adrian-prantl , @dwblaikie, @jryans - there's a comment at the top of the function that explains the transformation) and what you make of this @Apochens?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the load's location here does make intuitive sense to me as well. I'm less sure what we've done for such cases historically though, so I'm interested to hear what others think as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case using the load's DebugLoc makes intuitive sense - we're just eliminating the load here after all

That's what I agree with.

When promoting store/loads I think we usually give phis a debugloc from the store (or if it's from stores, plural, it's a merged location).

I'd like to know why give the store'd debug location to an instruction that replaces a load instruction. Are there some conventions or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, should we ping other guys again? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to wait a few more days first, as people may be on holiday or have a long queue to work through.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the git history, it doesn't look like there's a particular reason for using the store instead of the load - the main point was to ensure that the scope was correct. I think using the load makes more sense, so current state SGTM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like enough of us have appeared and all concluded the same thing here, so I think we can leave this part as it is.

@jryans
Copy link
Member

jryans commented May 15, 2024

Once the remaining review comments are addressed, I imagine this can be approved. 🙂

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @jryans mentions in the inline comments, it looks like we've reached a consensus that the eliminated load's debug location is indeed suitable for the phi. If someone comes back to this and strongly disagrees we can look at it again and make adjustments as necessary.

LGTM, thanks!

Copy link
Member

@jryans jryans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well! 😄

As usual, I'll wait a day in case any last minute review comments might appear, and then assuming they do not, I'll merge this.

@jryans jryans merged commit d0e2808 into llvm:main May 17, 2024
4 checks passed
@Apochens Apochens deleted the #91836_looploadelim_update_debugloc branch May 17, 2024 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DebugInfo][LoopLoadElim] Missing debug location updates
5 participants