Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TBAA] Extend pointer TBAA to pointers of non-builtin types. #110569

Merged
merged 3 commits into from
Oct 22, 2024

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Sep 30, 2024

Extend the logic added in 123c036
(#76612) to support pointers to non-builtin types by using the mangled name of the canonical type.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Sep 30, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Florian Hahn (fhahn)

Changes

Extend the logic added in 123c036
(#76612) to support pointers to non-builtin types by using the mangled name of the canonical type.


Full diff: https://github.com/llvm/llvm-project/pull/110569.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenTBAA.cpp (+11-5)
  • (modified) clang/test/CodeGen/tbaa-pointers.c (+27-11)
  • (modified) clang/test/CodeGen/tbaa-reference.cpp (+23-7)
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index 5b3393ec150e44..990a0ea8dafd72 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -221,21 +221,27 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
       PtrDepth++;
       Ty = Ty->getPointeeType().getTypePtr();
     } while (Ty->isPointerType());
-    // TODO: Implement C++'s type "similarity" and consider dis-"similar"
-    // pointers distinct for non-builtin types.
+
+    SmallString<256> TyName;
     if (isa<BuiltinType>(Ty)) {
       llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty);
       StringRef Name =
           cast<llvm::MDString>(
               ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0))
               ->getString();
+      TyName = Name;
+    } else if (!isa<VariableArrayType>(Ty)) {
+      // For non-builtin types use the mangled name of the canonical type.
+      llvm::raw_svector_ostream TyOut(TyName);
+      Context.createMangleContext()->mangleCanonicalTypeName(QualType(Ty, 0),
+                                                             TyOut);
+    }
+
       SmallString<256> OutName("p");
       OutName += std::to_string(PtrDepth);
       OutName += " ";
-      OutName += Name;
+      OutName += TyName;
       return createScalarTypeNode(OutName, AnyPtr, Size);
-    }
-    return AnyPtr;
   }
 
   // Accesses to arrays are accesses to objects of their element types.
diff --git a/clang/test/CodeGen/tbaa-pointers.c b/clang/test/CodeGen/tbaa-pointers.c
index 8860b7042d0a25..2b6c2f1418b50f 100644
--- a/clang/test/CodeGen/tbaa-pointers.c
+++ b/clang/test/CodeGen/tbaa-pointers.c
@@ -116,10 +116,12 @@ void p2struct(struct S1 **ptr) {
   // COMMON-LABEL: define void @p2struct(
   // COMMON-SAME:    ptr noundef [[PTR:%.+]])
   // COMMON:         [[PTR_ADDR:%.+]] = alloca ptr, align 8
-  // ENABLED-NEXT:    store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR:!.+]]
+  // ENABLED-NEXT:    store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[P2S1_TAG:!.+]]
   // DEFAULT-NEXT:    store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-  // COMMON-NEXT:    [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-  // COMMON-NEXT:    store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
+  // ENABLED-NEXT:    [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[P2S1_TAG]]
+  // DEFAULT-NEXT:    [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+  // ENABLED-NEXT:    store ptr null, ptr [[BASE]], align 8, !tbaa [[P1S1_TAG:!.+]]
+  // DEFAULT-NEXT:    store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
   // COMMON-NEXT:    ret void
   //
   *ptr = 0;
@@ -129,9 +131,10 @@ void p2struct_const(struct S1 const **ptr) {
   // COMMON-LABEL: define void @p2struct_const(
   // COMMON-SAME:    ptr noundef [[PTR:%.+]])
   // COMMON:         [[PTR_ADDR:%.+]] = alloca ptr, align 8
-  // COMMON-NEXT:    store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+  // COMMON-NEXT:    store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR:!.+]]
   // COMMON-NEXT:    [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-  // COMMON-NEXT:    store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
+  // ENABLED-NEXT:    store ptr null, ptr [[BASE]], align 8, !tbaa [[P1S1_TAG]]
+  // DEFAULT-NEXT:    store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
   // COMMON-NEXT:    ret void
   //
   *ptr = 0;
@@ -145,10 +148,14 @@ void p2struct2(struct S2 *ptr) {
   // COMMON-LABEL: define void @p2struct2(
   // COMMON-SAME:    ptr noundef [[PTR:%.+]])
   // COMMON:         [[PTR_ADDR:%.+]] = alloca ptr, align 8
-  // COMMON-NEXT:    store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-  // COMMON-NEXT:    [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
-  // COMMON-NEXT:    [[S:%.+]] = getelementptr inbounds nuw %struct.S2, ptr [[BASE]], i32 0, i32 0
-  // COMMON-NEXT:    store ptr null, ptr [[S]], align 8, !tbaa [[S2_S_TAG:!.+]]
+  // ENABLED-NEXT:   store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[P1S2_TAG:!.+]]
+  // ENABLED-NEXT:   [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[P1S2_TAG]]
+  // ENABLED-NEXT:   [[S:%.+]] = getelementptr inbounds nuw %struct.S2, ptr [[BASE]], i32 0, i32 0
+  // ENABLED-NEXT:   store ptr null, ptr [[S]], align 8, !tbaa [[S2_S_TAG:!.+]]
+  // DEFAULT-NEXT:   store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+  // DEFAULT-NEXT:   [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+  // DEFAULT-NEXT:   [[S:%.+]] = getelementptr inbounds nuw %struct.S2, ptr [[BASE]], i32 0, i32 0
+  // DEFAULT-NEXT:   store ptr null, ptr [[S]], align 8, !tbaa [[S2_S_TAG:!.+]]
   // COMMON-NEXT:    ret void
     ptr->s = 0;
 }
@@ -171,5 +178,14 @@ void p2struct2(struct S2 *ptr) {
 // ENABLED: [[P2CHAR]] = !{!"p2 omnipotent char", [[ANY_POINTER]], i64 0}
 // ENABLED: [[P1CHAR_0]] = !{[[P1CHAR:!.+]], [[P1CHAR]], i64 0}
 // ENABLED: [[P1CHAR]] = !{!"p1 omnipotent char", [[ANY_POINTER]], i64 0}
-// COMMON: [[S2_S_TAG]]  = !{[[S2_TY:!.+]], [[ANY_POINTER]], i64 0}
-// COMMON: [[S2_TY]]  = !{!"S2", [[ANY_POINTER]], i64 0}
+// ENABLED: [[P2S1_TAG]] = !{[[P2S1:!.+]], [[P2S1]], i64 0}
+// ENABLED: [[P2S1]] = !{!"p2 _ZTS2S1", [[ANY_POINTER]], i64 0}
+// ENABLED: [[P1S1_TAG:!.+]] = !{[[P1S1:!.+]], [[P1S1]], i64 0}
+// ENABLED: [[P1S1]] = !{!"p1 _ZTS2S1", [[ANY_POINTER]], i64 0}
+// ENABLED: [[P1S2_TAG]] = !{[[P1S2:!.+]], [[P1S2]], i64 0}
+// ENABLED: [[P1S2]] = !{!"p1 _ZTS2S2", [[ANY_POINTER]], i64 0}
+
+// ENABLED: [[S2_S_TAG]]  = !{[[S2_TY:!.+]], [[P1S1]], i64 0}
+// ENABLED: [[S2_TY]]  = !{!"S2", [[P1S1]], i64 0}
+// DEFAULT: [[S2_S_TAG]]  = !{[[S2_TY:!.+]], [[ANY_POINTER]], i64 0}
+// DEFAULT: [[S2_TY]]  = !{!"S2", [[ANY_POINTER]], i64 0}
diff --git a/clang/test/CodeGen/tbaa-reference.cpp b/clang/test/CodeGen/tbaa-reference.cpp
index d22cd90b43ae90..8395badf35ded5 100644
--- a/clang/test/CodeGen/tbaa-reference.cpp
+++ b/clang/test/CodeGen/tbaa-reference.cpp
@@ -1,7 +1,7 @@
 // RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -emit-llvm -o - | FileCheck %s  -check-prefixes=CHECK,OLD-PATH
-// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes -pointer-tbaa %s -emit-llvm -o - | FileCheck %s  -check-prefixes=CHECK,OLD-PATH
+// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes -pointer-tbaa %s -emit-llvm -o - | FileCheck %s  -check-prefixes=CHECK,OLD-PATH-POINTER
 // RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -emit-llvm -new-struct-path-tbaa -o - | FileCheck %s -check-prefixes=CHECK,NEW-PATH
-// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -pointer-tbaa -emit-llvm -new-struct-path-tbaa -o - | FileCheck %s -check-prefixes=CHECK,NEW-PATH
+// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -pointer-tbaa -emit-llvm -new-struct-path-tbaa -o - | FileCheck %s -check-prefixes=CHECK,NEW-PATH-POINTER
 //
 // Check that we generate correct TBAA information for reference accesses.
 
@@ -16,13 +16,13 @@ struct B {
 B::B(S &s) : s(s) {
 // CHECK-LABEL: _ZN1BC2ER1S
 // Check initialization of the reference parameter.
-// CHECK: store ptr {{.*}}, ptr {{.*}}, !tbaa [[TAG_pointer:!.*]]
+// CHECK: store ptr {{.*}}, ptr %s.addr, align 8, !tbaa [[TAG_S_PTR:!.*]]
 
 // Check loading of the reference parameter.
-// CHECK: load ptr, ptr {{.*}}, !tbaa [[TAG_pointer]]
+// CHECK: load ptr, ptr {{.*}}, !tbaa [[TAG_S_PTR:!.*]]
 
 // Check initialization of the reference member.
-// CHECK: store ptr {{.*}}, ptr {{.*}}, !tbaa [[TAG_pointer]]
+// CHECK: store ptr {{.*}}, ptr {{.*}}, !tbaa [[TAG_S_PTR]]
 }
 
 S &B::get() {
@@ -32,16 +32,32 @@ S &B::get() {
   return s;
 }
 
-// OLD-PATH-DAG: [[TAG_pointer]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0}
+// OLD-PATH-DAG: [[TAG_S_PTR]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0}
 // OLD-PATH-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_pointer]], i64 0}
 //
 // OLD-PATH-DAG: [[TYPE_B]] = !{!"_ZTS1B", [[TYPE_pointer]], i64 0}
 // OLD-PATH-DAG: [[TYPE_pointer]] = !{!"any pointer", [[TYPE_char:!.*]], i64 0}
 // OLD-PATH-DAG: [[TYPE_char]] = !{!"omnipotent char", {{!.*}}, i64 0}
 
-// NEW-PATH-DAG: [[TAG_pointer]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0, i64 8}
+// OLD-PATH-POINTER-DAG: [[TAG_S_PTR]] = !{[[TYPE_S_PTR:!.*]], [[TYPE_S_PTR]], i64 0}
+// OLD-PATH-POINTER-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_S_PTR:!.*]], i64 0}
+//
+// OLD-PATH-POINTER-DAG: [[TYPE_B]] = !{!"_ZTS1B", [[TYPE_S_PTR:!.*]], i64 0}
+// OLD-PATH-POINTER-DAG: [[TYPE_pointer:!.*]] = !{!"any pointer", [[TYPE_char:!.*]], i64 0}
+// OLD-PATH-POINTER-DAG: [[TYPE_char]] = !{!"omnipotent char", {{!.*}}, i64 0}
+// OLD-PATH-POINTER-DAG: [[TYPE_S_PTR]] = !{!"p1 _ZTS1S", [[TYPE_pointer]], i64 0}
+
+// NEW-PATH-DAG: [[TAG_S_PTR]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0, i64 8}
 // NEW-PATH-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_pointer]], i64 0, i64 8}
 //
 // NEW-PATH-DAG: [[TYPE_B]] = !{[[TYPE_char:!.*]], i64 8, !"_ZTS1B", [[TYPE_pointer]], i64 0, i64 8}
 // NEW-PATH-DAG: [[TYPE_pointer]] = !{[[TYPE_char:!.*]], i64 8, !"any pointer"}
 // NEW-PATH-DAG: [[TYPE_char]] = !{{{!.*}}, i64 1, !"omnipotent char"}
+
+// NEW-PATH-POINTER-DAG: [[TAG_S_PTR]] = !{[[TYPE_S_PTR:!.*]], [[TYPE_S_PTR]], i64 0, i64 8}
+// NEW-PATH-POINTER-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_S_PTR]], i64 0, i64 8}
+//
+// NEW-PATH-POINTER-DAG: [[TYPE_B]] = !{[[TYPE_char:!.*]], i64 8, !"_ZTS1B", [[TYPE_S_PTR]], i64 0, i64 8}
+// NEW-PATH-POINTER-DAG: [[TYPE_S_PTR]] = !{[[TYPE_pointer:!.+]], i64 8, !"p1 _ZTS1S"}
+// NEW-PATH-POINTER-DAG: [[TYPE_pointer]] = !{[[TYPE_char:!.*]], i64 8, !"any pointer"}
+// NEW-PATH-POINTER-DAG: [[TYPE_char]] = !{{{!.*}}, i64 1, !"omnipotent char"}

@fhahn
Copy link
Contributor Author

fhahn commented Oct 14, 2024

ping :)

clang/lib/CodeGen/CodeGenTBAA.cpp Show resolved Hide resolved
if (isa<BuiltinType>(Ty)) {
llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty);
StringRef Name =
cast<llvm::MDString>(
ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0))
->getString();
TyName = Name;
} else if (!isa<VariableArrayType>(Ty)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should just ignore all array structure here, whether it's variable-length or fixed-length. For one, partially honoring array structures probably violates the VLA compatibility rules — IIUC, the C standard requires l-values of type T (*)[n] to be able to alias objects of type T (*)[10] if n == 10 dynamically, but with your patch, we'd now create a node for that pointer-to-constant-array type with no relation to the node we'd use for the pointer-to-VLA type. But also, I feel like trying to honor the array structure is just pointlessly strict; we can treat a T (*)[10] object like a T * object, it's not really going to hurt anything.

You should look through the array structure before you do the builtin/non-builtin breakdown. There's a getBaseElementType operation you can use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it should be adjusted now.

} else if (!isa<VariableArrayType>(Ty)) {
// For non-builtin types use the mangled name of the canonical type.
llvm::raw_svector_ostream TyOut(TyName);
Context.createMangleContext()->mangleCanonicalTypeName(QualType(Ty, 0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Maybe we should specifically force the use of Itanium mangling for this, just so we're guaranteed a certain amount of consistency between language modes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to retrieve a pointer to the Itanium mangler once and use it here, thanks!

Copy link
Contributor

@rjmccall rjmccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM other than the memory leak.

@@ -119,6 +120,7 @@ class CodeGenTBAA {
llvm::Module &Module;
const CodeGenOptions &CodeGenOpts;
const LangOptions &Features;
MangleContext *MangleCtx;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should either be a unique_ptr or you need to delete it in the destructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks that should be fixed now!

Extend the logic added in 123c036
(llvm#76612) to support pointers to
non-builtin types by using the mangled name of the canonical type.
Copy link
Contributor

@rjmccall rjmccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fhahn fhahn merged commit 4334f31 into llvm:main Oct 22, 2024
6 of 8 checks passed
@fhahn fhahn deleted the clang-pointer-tbaa2 branch October 22, 2024 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants