Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo][LoopStrengthReduce] Fix missing debug location updates #97519

Merged

Conversation

Apochens
Copy link
Contributor

@Apochens Apochens commented Jul 3, 2024

Fix #97510 .

Note that, for the new phi instruction NewPH, which replaces the old phi PH and the cast ShadowUse, I choose to propagate the debug location of PH to it, because the cast is eliminated according to the optimization semantics.

BTW, after patching the debug location update of NewPH, I found that in the optimized IR, another phi instruction %lsr.iv has the same debug location as %IV.S. (NewPH). However, %lsr.iv should inherit the debug location of %iv in the unoptimized program, i.e., "line: 3" instead of "line: 2". The same thing happens to %lsr.iv.next, which has the same debug location as %exitcond. Yet, %lsr.iv.next should inherit the debug location of %iv.next, i.e., "line: 7" instead of "line: 8". The program before and after LSR are shown below.

Before optimization:

define i32 @foobar6() !dbg !5 {
entry:
  br label %loop, !dbg !8

loop:                                             ; preds = %loop, %entry
  %accum = phi i32 [ -3220, %entry ], [ %accum.next, %loop ], !dbg !9
->%iv = phi i32 [ 12, %entry ], [ %iv.next, %loop ], !dbg !10
  %tmp1 = sitofp i32 %accum to double, !dbg !11
  tail call void @foo(double %tmp1), !dbg !12
  %accum.next = add nsw i32 %accum, 9597741, !dbg !13
->%iv.next = add nuw nsw i32 %iv, 1, !dbg !14
  %exitcond = icmp ugt i32 %iv, 235, !dbg !15
  br i1 %exitcond, label %exit, label %loop, !dbg !16

exit:                                             ; preds = %loop
  ret i32 %accum.next, !dbg !17
}

...
!9 = !DILocation(line: 2, column: 1, scope: !5)
!10 = !DILocation(line: 3, column: 1, scope: !5)
...
!14 = !DILocation(line: 7, column: 1, scope: !5)
!15 = !DILocation(line: 8, column: 1, scope: !5)
...

After optimization:

define i32 @foobar6() !dbg !5 {
entry:
  br label %loop, !dbg !8

loop:                                             ; preds = %loop, %entry
->%lsr.iv = phi i32 [ %lsr.iv.next, %loop ], [ 11, %entry ], !dbg !9
  %IV.S. = phi double [ -3.220000e+03, %entry ], [ %IV.S.next., %loop ], !dbg !9
  tail call void @foo(double %IV.S.) #0, !dbg !10
  %IV.S.next. = fadd double %IV.S., 0x41624E65A0000000, !dbg !11
->%lsr.iv.next = add nuw nsw i32 %lsr.iv, 1, !dbg !12
  %exitcond = icmp ugt i32 %lsr.iv.next, 235, !dbg !12
  br i1 %exitcond, label %exit, label %loop, !dbg !13

exit:                                             ; preds = %loop
  ret i32 -2135478791, !dbg !14
}

...
!9 = !DILocation(line: 2, column: 1, scope: !5)
...
!12 = !DILocation(line: 8, column: 1, scope: !5)
...

I guess it's caused by inserting new instructions with IRBuilder, in which the next instructions are specified as the insertion points. (Yet, I haven't found the corresponding code.🧐)

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 3, 2024

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-transforms

Author: Shan Huang (Apochens)

Changes

Fix #97510 .

Note that, for the new phi instruction NewPH, which replaces the old phi PH and the cast ShadowUse, I choose to propagate the debug location of PH to it, because the cast is eliminated according to the optimization semantics.

BTW, after patching the debug location update of NewPH, I found that in the optimized IR, another phi instruction %lsr.iv has the same debug location as %IV.S. (NewPH). However, %lsr.iv should inherent the debug location of %iv in the unoptimized program, i.e., "line: 3" instead of "line: 2". The same thing happens to %lsr.iv.next, which has the same debug location as %exitcond. Yet, %lsr.iv.next should inherent the debug location of %iv.next, i.e., "line: 7" instead of "line: 8". The program before and after LSR are shown below.

Before optimization:

define i32 @<!-- -->foobar6() !dbg !5 {
entry:
  br label %loop, !dbg !8

loop:                                             ; preds = %loop, %entry
  %accum = phi i32 [ -3220, %entry ], [ %accum.next, %loop ], !dbg !9
  %iv = phi i32 [ 12, %entry ], [ %iv.next, %loop ], !dbg !10
  %tmp1 = sitofp i32 %accum to double, !dbg !11
  tail call void @<!-- -->foo(double %tmp1), !dbg !12
  %accum.next = add nsw i32 %accum, 9597741, !dbg !13
  %iv.next = add nuw nsw i32 %iv, 1, !dbg !14
  %exitcond = icmp ugt i32 %iv, 235, !dbg !15
  br i1 %exitcond, label %exit, label %loop, !dbg !16

exit:                                             ; preds = %loop
  ret i32 %accum.next, !dbg !17
}

...
!9 = !DILocation(line: 2, column: 1, scope: !5)
!10 = !DILocation(line: 3, column: 1, scope: !5)
...
!14 = !DILocation(line: 7, column: 1, scope: !5)
!15 = !DILocation(line: 8, column: 1, scope: !5)
...

After optimization:

define i32 @<!-- -->foobar6() !dbg !5 {
entry:
  br label %loop, !dbg !8

loop:                                             ; preds = %loop, %entry
-&gt;%lsr.iv = phi i32 [ %lsr.iv.next, %loop ], [ 11, %entry ], !dbg !9
  %IV.S. = phi double [ -3.220000e+03, %entry ], [ %IV.S.next., %loop ], !dbg !9
  tail call void @<!-- -->foo(double %IV.S.) #<!-- -->0, !dbg !10
  %IV.S.next. = fadd double %IV.S., 0x41624E65A0000000, !dbg !11
-&gt;%lsr.iv.next = add nuw nsw i32 %lsr.iv, 1, !dbg !12
  %exitcond = icmp ugt i32 %lsr.iv.next, 235, !dbg !12
  br i1 %exitcond, label %exit, label %loop, !dbg !13

exit:                                             ; preds = %loop
  ret i32 -2135478791, !dbg !14
}

...
!9 = !DILocation(line: 2, column: 1, scope: !5)
...
!12 = !DILocation(line: 8, column: 1, scope: !5)
...

I guess it's caused by inserting new instructions with IRBuilder, in which the next instructions are specified as the insertion points. (Yet, I haven't found the corresponding code.🧐)


Full diff: https://github.com/llvm/llvm-project/pull/97519.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp (+2)
  • (added) llvm/test/Transforms/LoopStrengthReduce/X86/preserving-debugloc-phi-binop.ll (+59)
diff --git a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
index 1c4a0d92dcde9..8ebead8b8fbd7 100644
--- a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
@@ -2404,6 +2404,7 @@ void LSRInstance::OptimizeShadowIV() {
 
     /* Add new PHINode. */
     PHINode *NewPH = PHINode::Create(DestTy, 2, "IV.S.", PH->getIterator());
+    NewPH->setDebugLoc(PH->getDebugLoc());
 
     /* create new increment. '++d' in above example. */
     Constant *CFP = ConstantFP::get(DestTy, C->getZExtValue());
@@ -2411,6 +2412,7 @@ void LSRInstance::OptimizeShadowIV() {
         Incr->getOpcode() == Instruction::Add ? Instruction::FAdd
                                               : Instruction::FSub,
         NewPH, CFP, "IV.S.next.", Incr->getIterator());
+    NewIncr->setDebugLoc(Incr->getDebugLoc());
 
     NewPH->addIncoming(NewInit, PH->getIncomingBlock(Entry));
     NewPH->addIncoming(NewIncr, PH->getIncomingBlock(Latch));
diff --git a/llvm/test/Transforms/LoopStrengthReduce/X86/preserving-debugloc-phi-binop.ll b/llvm/test/Transforms/LoopStrengthReduce/X86/preserving-debugloc-phi-binop.ll
new file mode 100644
index 0000000000000..1aecd34082a2a
--- /dev/null
+++ b/llvm/test/Transforms/LoopStrengthReduce/X86/preserving-debugloc-phi-binop.ll
@@ -0,0 +1,59 @@
+; RUN: opt -S -passes=loop-reduce -mtriple=x86_64-unknown-unknown < %s | FileCheck %s
+
+; Check that LoopStrengthReduce's OptimizeShadowIV() propagates the debug
+; locations of the old phi (`%accum`) and binop (`%accum.next`) instruction
+; to the new phi and binop instruction, respectively.
+
+target datalayout = "n8:16:32:64"
+
+define i32 @foobar6() !dbg !5 {
+; CHECK-LABEL: define i32 @foobar6(
+; CHECK:  loop:
+; CHECK:    [[IV_S_:%.*]] = phi double [ -3.220000e+03, %[[ENTRY:.*]] ], [ [[IV_S_NEXT_:%.*]], %loop ], !dbg [[DBG9:![0-9]+]]
+; CHECK:    [[IV_S_NEXT_]] = fadd double [[IV_S_]], 0x41624E65A0000000, !dbg [[DBG11:![0-9]+]]
+; CHECK:  exit:
+;
+entry:
+  br label %loop, !dbg !8
+
+loop:                                             ; preds = %loop, %entry
+  %accum = phi i32 [ -3220, %entry ], [ %accum.next, %loop ], !dbg !9
+  %iv = phi i32 [ 12, %entry ], [ %iv.next, %loop ], !dbg !10
+  %tmp1 = sitofp i32 %accum to double, !dbg !11
+  tail call void @foo(double %tmp1), !dbg !12
+  %accum.next = add nsw i32 %accum, 9597741, !dbg !13
+  %iv.next = add nuw nsw i32 %iv, 1, !dbg !14
+  %exitcond = icmp ugt i32 %iv, 235, !dbg !15
+  br i1 %exitcond, label %exit, label %loop, !dbg !16
+
+exit:                                             ; preds = %loop
+  ret i32 %accum.next, !dbg !17
+}
+
+declare void @foo(double)
+
+!llvm.dbg.cu = !{!0}
+!llvm.debugify = !{!2, !3}
+!llvm.module.flags = !{!4}
+
+; CHECK: [[DBG9]] = !DILocation(line: 2,
+; CHECK: [[DBG11]] = !DILocation(line: 6,
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
+!1 = !DIFile(filename: "main.ll", directory: "/")
+!2 = !{i32 10}
+!3 = !{i32 0}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = distinct !DISubprogram(name: "foobar6", linkageName: "foobar6", scope: null, file: !1, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!6 = !DISubroutineType(types: !7)
+!7 = !{}
+!8 = !DILocation(line: 1, column: 1, scope: !5)
+!9 = !DILocation(line: 2, column: 1, scope: !5)
+!10 = !DILocation(line: 3, column: 1, scope: !5)
+!11 = !DILocation(line: 4, column: 1, scope: !5)
+!12 = !DILocation(line: 5, column: 1, scope: !5)
+!13 = !DILocation(line: 6, column: 1, scope: !5)
+!14 = !DILocation(line: 7, column: 1, scope: !5)
+!15 = !DILocation(line: 8, column: 1, scope: !5)
+!16 = !DILocation(line: 9, column: 1, scope: !5)
+!17 = !DILocation(line: 10, column: 1, scope: !5)

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch LGTM, @OCHyams @SLTozer have reviewed more of these, are there any more concerns than out-of-place source locations and potential loss of coverage?

Instructions inheriting the adjacent source location is technically part of LLVMs design, I think the difficulty is that it works well 99.9% of the time, but occasionally you need more finesse when picking them. The current solution is just a very good approximation of ideal source location placement (as far as I'm aware).

@OCHyams
Copy link
Contributor

OCHyams commented Jul 8, 2024

Note that, for the new phi instruction NewPH, which replaces the old phi PH and the cast ShadowUse, I choose to propagate the debug location of PH to it, because the cast is eliminated according to the optimization semantics.

SGTM - we're replacing two instructions (phi and add) in order to make one redundant (cast), it makes sense that we propagate the locations of the replaced instructions.

@OCHyams @SLTozer have reviewed more of these, are there any more concerns than out-of-place source locations and potential loss of coverage?

No concern for this patch, LGTM too.

Thanks!

@Apochens Apochens merged commit d83d09f into llvm:main Jul 15, 2024
10 checks passed
@Apochens Apochens deleted the #97510_lsr_preserving_debugloc_phi_binop branch July 15, 2024 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DebugInfo][LoopStrengthReduce] Missing debug location updates for new phi and binop instructions
4 participants