Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-fno-plt: Infer nonlazybind at -O0 #97873

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Jul 6, 2024

At -O1 and above, inferNonMandatoryLibFuncAttrs of InferFunctionAttrs
infers the function attribute nonlazybind from the module flags metadata
"RtLibUseGOT". At -O0, synthesized functions do not get nonlazybind.
Backends do not check "RtLibUseGOT" for non-intrinsic calls, making
-fno-plt ineffective for some functions like __cxa_atexit.

This patch adds a O0 version of InferFunctionAttrs to infer nonlazybind.

SimplifyLibCalls.cpp may convert printf to puts and need
inferNonMandatoryLibFuncAttrs to infer nonlazybind for puts
(https://reviews.llvm.org/D45180). We cannot remove setNonlazyBind
from BuildLibCalls.cpp.

Close #92853

Created using spr 1.3.5-bogner
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 6, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Fangrui Song (MaskRay)

Changes

At -O1 and above, inferNonMandatoryLibFuncAttrs of InferFunctionAttrs
infers the function attribute nonlazybind from the module flags metadata
"RtLibUseGOT". At -O0, synthesized functions do not get nonlazybind.
Backends do not check "RtLibUseGOT" for non-intrinsic calls, making
-fno-plt ineffective for some functions like __cxa_atexit.

This patch adds a O0 version of InferFunctionAttrs to infer nonlazybind.

SimplifyLibCalls.cpp may convert printf to puts and need
inferNonMandatoryLibFuncAttrs to infer nonlazybind for puts
(https://reviews.llvm.org/D45180). We cannot remove setNonlazyBind
from BuildLibCalls.cpp.

Close #92853


Full diff: https://github.com/llvm/llvm-project/pull/97873.diff

8 Files Affected:

  • (modified) llvm/include/llvm/Transforms/IPO/InferFunctionAttrs.h (+4)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+4)
  • (modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+2)
  • (modified) llvm/lib/Passes/PassRegistry.def (+4-1)
  • (modified) llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp (+14-2)
  • (modified) llvm/test/Other/new-pass-manager.ll (+1)
  • (modified) llvm/test/Other/new-pm-O0-defaults.ll (+2)
  • (added) llvm/test/Transforms/InferFunctionAttrs/nonlazybind.ll (+57)
diff --git a/llvm/include/llvm/Transforms/IPO/InferFunctionAttrs.h b/llvm/include/llvm/Transforms/IPO/InferFunctionAttrs.h
index 8addf49fc0d81..b0ec1587b56f0 100644
--- a/llvm/include/llvm/Transforms/IPO/InferFunctionAttrs.h
+++ b/llvm/include/llvm/Transforms/IPO/InferFunctionAttrs.h
@@ -23,7 +23,11 @@ class Module;
 /// A pass which infers function attributes from the names and signatures of
 /// function declarations in a module.
 struct InferFunctionAttrsPass : PassInfoMixin<InferFunctionAttrsPass> {
+  InferFunctionAttrsPass(bool O0 = false) : O0(O0) {}
   PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
+
+private:
+  bool O0;
 };
 
 }
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 17cc156846d36..c61816579f4c7 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -1145,6 +1145,10 @@ Expected<GlobalMergeOptions> parseGlobalMergeOptions(StringRef Params) {
   return Result;
 }
 
+Expected<bool> parseInferFunctionAttrsOptions(StringRef Params) {
+  return PassBuilder::parseSinglePassOption(Params, "O0", "InferFunctionAttrs");
+}
+
 Expected<SmallVector<std::string, 0>> parseInternalizeGVs(StringRef Params) {
   SmallVector<std::string, 1> PreservedGVs;
   while (!Params.empty()) {
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 4fd5ee1946bb7..b9f97699645a9 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -2080,6 +2080,8 @@ ModulePassManager PassBuilder::buildO0DefaultPipeline(OptimizationLevel Level,
   if (PGOOpt && PGOOpt->DebugInfoForProfiling)
     MPM.addPass(createModuleToFunctionPassAdaptor(AddDiscriminatorsPass()));
 
+  MPM.addPass(InferFunctionAttrsPass(/*O0=*/true));
+
   invokePipelineEarlySimplificationEPCallbacks(MPM, Level);
 
   // Build a minimal pipeline based on the semantics required by LLVM,
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 3b92823cd283b..c411d6da2e3d0 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -68,7 +68,6 @@ MODULE_PASS("hipstdpar-interpose-alloc", HipStdParAllocationInterpositionPass())
 MODULE_PASS("hipstdpar-select-accelerator-code",
             HipStdParAcceleratorCodeSelectionPass())
 MODULE_PASS("hotcoldsplit", HotColdSplittingPass())
-MODULE_PASS("inferattrs", InferFunctionAttrsPass())
 MODULE_PASS("inliner-ml-advisor-release",
             ModuleInlinerWrapperPass(getInlineParams(), true, {},
                                      InliningAdvisorMode::Release, 0))
@@ -176,6 +175,10 @@ MODULE_PASS_WITH_PARAMS(
     "hwasan", "HWAddressSanitizerPass",
     [](HWAddressSanitizerOptions Opts) { return HWAddressSanitizerPass(Opts); },
     parseHWASanPassOptions, "kernel;recover")
+MODULE_PASS_WITH_PARAMS(
+    "inferattrs", "InferFunctionAttrsPass",
+    [](bool O0) { return InferFunctionAttrsPass(O0); },
+    parseInferFunctionAttrsOptions, "O0")
 MODULE_PASS_WITH_PARAMS(
     "internalize", "InternalizePass",
     [](SmallVector<std::string, 0> PreservedGVs) {
diff --git a/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp b/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
index 18d5911d10f12..6f26934cd6b0c 100644
--- a/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
@@ -17,8 +17,20 @@ using namespace llvm;
 #define DEBUG_TYPE "inferattrs"
 
 static bool inferAllPrototypeAttributes(
-    Module &M, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {
+    Module &M, bool O0, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {
   bool Changed = false;
+  // Infer nonlazybind if "RtLibUseGOT" (-fno-plt) is set. This is performed
+  // even at -O0.
+  if (M.getRtLibUseGOT()) {
+    for (Function &F : M.functions()) {
+      if (F.isDeclaration() && !F.hasFnAttribute(Attribute::NonLazyBind)) {
+        F.addFnAttr(Attribute::NonLazyBind);
+        Changed = true;
+      }
+    }
+  }
+  if (O0)
+    return Changed;
 
   for (Function &F : M.functions())
     // We only infer things using the prototype and the name; we don't need
@@ -43,7 +55,7 @@ PreservedAnalyses InferFunctionAttrsPass::run(Module &M,
     return FAM.getResult<TargetLibraryAnalysis>(F);
   };
 
-  if (!inferAllPrototypeAttributes(M, GetTLI))
+  if (!inferAllPrototypeAttributes(M, O0, GetTLI))
     // If we didn't infer anything, preserve all analyses.
     return PreservedAnalyses::all();
 
diff --git a/llvm/test/Other/new-pass-manager.ll b/llvm/test/Other/new-pass-manager.ll
index f0fe708806f1b..60040d73f5b98 100644
--- a/llvm/test/Other/new-pass-manager.ll
+++ b/llvm/test/Other/new-pass-manager.ll
@@ -292,6 +292,7 @@
 ; RUN:     | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=%llvmcheckext
 ; CHECK-O0: Running analysis: InnerAnalysisManagerProxy<{{.*}}>
 ; CHECK-O0-NEXT: Running pass: EntryExitInstrumenterPass
+; CHECK-O0-NEXT: Running pass: InferFunctionAttrsPass
 ; CHECK-O0-NEXT: Running pass: AlwaysInlinerPass
 ; CHECK-O0-NEXT: Running analysis: ProfileSummaryAnalysis
 ; CHECK-EXT-NEXT: Running pass: {{.*}}Bye
diff --git a/llvm/test/Other/new-pm-O0-defaults.ll b/llvm/test/Other/new-pm-O0-defaults.ll
index e8131ac7fab45..3505061a53307 100644
--- a/llvm/test/Other/new-pm-O0-defaults.ll
+++ b/llvm/test/Other/new-pm-O0-defaults.ll
@@ -32,10 +32,12 @@
 ; CHECK-DIS: Running analysis: InnerAnalysisManagerProxy
 ; CHECK-DIS-NEXT: Running pass: EntryExitInstrumenterPass
 ; CHECK-DIS-NEXT: Running pass: AddDiscriminatorsPass
+; CHECK-DIS-NEXT: Running pass: InferFunctionAttrsPass
 ; CHECK-DIS-NEXT: Running pass: AlwaysInlinerPass
 ; CHECK-DIS-NEXT: Running analysis: ProfileSummaryAnalysis
 ; CHECK-DEFAULT: Running analysis: InnerAnalysisManagerProxy
 ; CHECK-DEFAULT-NEXT: Running pass: EntryExitInstrumenterPass
+; CHECK-DEFAULT-NEXT: Running pass: InferFunctionAttrsPass
 ; CHECK-DEFAULT-NEXT: Running pass: AlwaysInlinerPass
 ; CHECK-DEFAULT-NEXT: Running analysis: ProfileSummaryAnalysis
 ; CHECK-MATRIX: Running pass: LowerMatrixIntrinsicsPass
diff --git a/llvm/test/Transforms/InferFunctionAttrs/nonlazybind.ll b/llvm/test/Transforms/InferFunctionAttrs/nonlazybind.ll
new file mode 100644
index 0000000000000..4a401ab3c05fe
--- /dev/null
+++ b/llvm/test/Transforms/InferFunctionAttrs/nonlazybind.ll
@@ -0,0 +1,57 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --check-globals all --version 5
+; RUN: opt < %s -S -passes='inferattrs<O0>' | FileCheck %s --check-prefix=O0
+; RUN: opt < %s -S -passes='inferattrs' | FileCheck %s --check-prefix=CHECK
+
+%struct.A = type { i8 }
+
+@a = dso_local global %struct.A zeroinitializer, align 1
+@__dso_handle = external hidden global i8
+@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_, ptr null }]
+
+declare void @external()
+declare void @external_optnone() noinline optnone
+declare void @_ZN1AD1Ev(ptr noundef nonnull align 1 dereferenceable(1)) unnamed_addr
+declare i32 @__cxa_atexit(ptr, ptr, ptr)
+
+;.
+; O0: @a = dso_local global %struct.A zeroinitializer, align 1
+; O0: @__dso_handle = external hidden global i8
+; O0: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_, ptr null }]
+;.
+; CHECK: @a = dso_local global %struct.A zeroinitializer, align 1
+; CHECK: @__dso_handle = external hidden global i8
+; CHECK: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_, ptr null }]
+;.
+define internal void @_GLOBAL__sub_I_() section ".text.startup" {
+; O0-LABEL: define internal void @_GLOBAL__sub_I_() section ".text.startup" {
+; O0-NEXT:    call void @external()
+; O0-NEXT:    call void @external_optnone()
+; O0-NEXT:    [[TMP0:%.*]] = tail call i32 @__cxa_atexit(ptr nonnull @_ZN1AD1Ev, ptr nonnull @a, ptr nonnull @__dso_handle)
+; O0-NEXT:    ret void
+;
+; CHECK-LABEL: define internal void @_GLOBAL__sub_I_() section ".text.startup" {
+; CHECK-NEXT:    call void @external()
+; CHECK-NEXT:    call void @external_optnone()
+; CHECK-NEXT:    [[TMP0:%.*]] = tail call i32 @__cxa_atexit(ptr nonnull @_ZN1AD1Ev, ptr nonnull @a, ptr nonnull @__dso_handle)
+; CHECK-NEXT:    ret void
+;
+  call void @external()
+  call void @external_optnone()
+  %x = tail call i32 @__cxa_atexit(ptr nonnull @_ZN1AD1Ev, ptr nonnull @a, ptr nonnull @__dso_handle)
+  ret void
+}
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 7, !"RtLibUseGOT", i32 1}
+;.
+; O0: attributes #[[ATTR0:[0-9]+]] = { nonlazybind }
+; O0: attributes #[[ATTR1:[0-9]+]] = { noinline nonlazybind optnone }
+;.
+; CHECK: attributes #[[ATTR0:[0-9]+]] = { nonlazybind }
+; CHECK: attributes #[[ATTR1:[0-9]+]] = { noinline nonlazybind optnone }
+; CHECK: attributes #[[ATTR2:[0-9]+]] = { nofree nonlazybind }
+;.
+; O0: [[META0:![0-9]+]] = !{i32 7, !"RtLibUseGOT", i32 1}
+;.
+; CHECK: [[META0:![0-9]+]] = !{i32 7, !"RtLibUseGOT", i32 1}
+;.

@efriedma-quic
Copy link
Collaborator

Why do we wait to infer this in the first place, as opposed to just making the backend emit the right thing based on the module metadata? This seems fragile.

@MaskRay
Copy link
Member Author

MaskRay commented Jul 7, 2024

Why do we wait to infer this in the first place, as opposed to just making the backend emit the right thing based on the module metadata? This seems fragile.

My understanding:

The nonlazybind function attribute has been available for a long time, possibly originally used by ObjC for Mach-O.
ELF -fno-plt reused nonlazybind since the backend behavior is identical. For intrinsics, the module flags metadata "RtLibUseGOT" with the Max behavior is used.

Keeping function attributes (nonlazybind) allows fine-grained -fno-plt and -fplt merging for LTO (and also ObjC/Mach-O's needs).
If "RtLibUseGOT" with the Max is used exclusively instead, -fplt modules will use the -fno-plt behavior as well.

I agree that this appears slightly complicated, but the complexity seems manageable.

@efriedma-quic
Copy link
Collaborator

I'm mostly concerned about expanding the set of "mandatory pre-codegen" passes. We have a good story in CodeGen itself for scheduling mandatory and non-mandatory passes, and skipping passes, and serializing/restarting the pipeline. But pre-CodeGen passes fall into a weird void: you can get pre-mandatory-pass IR, and post-mandatory-pass IR, and you can't tell which one you have at a glance.

So I'd prefer to do any kind of mandatory lowering either in the frontend, or in CodeGen.

@MaskRay
Copy link
Member Author

MaskRay commented Jul 18, 2024

Perhaps we should introduce a lazybind function attribute and
determine GOT-indirect call sequence using if ("RtLibUseGOT" && !lazybind) instead of if (nonlazybind).
The new condition will work with mcount (-pg), __cxa_atexit (clangCodeGen), etc.

nonlazybind will only be used by ObjC.

During LTO, when a non-"RtLibUseGOT" module is merged into a "RtLibUseGOT" module, the former's functions get the lazybind attribute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

__cxa_atexit@plt is used when -fno-plt is supplied
3 participants