Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HWASan] optimize AttrInfer fix for selective HWASan #108111

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

fmayer
Copy link
Contributor

@fmayer fmayer commented Sep 10, 2024

No description provided.

Created using spr 1.3.4
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 10, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Florian Mayer (fmayer)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/108111.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp (+107-44)
  • (added) llvm/test/Instrumentation/HWAddressSanitizer/mem-attr-declr.ll (+82)
diff --git a/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
index a7e7f9a570dac7..bdd0764e1e5e15 100644
--- a/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
@@ -54,6 +54,7 @@
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/MD5.h"
+#include "llvm/Support/ModRef.h"
 #include "llvm/Support/RandomNumberGenerator.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/TargetParser/Triple.h"
@@ -301,7 +302,8 @@ class HWAddressSanitizer {
     initializeModule();
   }
 
-  void sanitizeFunction(Function &F, FunctionAnalysisManager &FAM);
+  bool sanitizeFunction(Function &F, FunctionAnalysisManager &FAM);
+  void unsanitizeFunction(Function &F);
 
 private:
   struct ShadowTagCheckInfo {
@@ -316,6 +318,8 @@ class HWAddressSanitizer {
                                           FunctionAnalysisManager &FAM) const;
   void initializeModule();
   void createHwasanCtorComdat();
+  void removeFnAttributes(Function *F);
+  void restoreFnAttributes(CallInst *CI);
 
   void initializeCallbacks(Module &M);
 
@@ -450,6 +454,13 @@ class HWAddressSanitizer {
   Value *StackBaseTag = nullptr;
   Value *CachedFP = nullptr;
   GlobalValue *ThreadPtrGlobal = nullptr;
+
+  struct ChangedFn {
+    std::optional<MemoryEffects> OriginalME;
+    SmallVector<size_t, 2> RemovedArgAttrs;
+  };
+
+  SmallMapVector<const Function *, ChangedFn, 2> ChangedFns;
 };
 
 } // end anonymous namespace
@@ -466,8 +477,15 @@ PreservedAnalyses HWAddressSanitizerPass::run(Module &M,
 
   HWAddressSanitizer HWASan(M, Options.CompileKernel, Options.Recover, SSI);
   auto &FAM = MAM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
-  for (Function &F : M)
-    HWASan.sanitizeFunction(F, FAM);
+  SmallVector<Function *, 5> Unsanitize;
+  for (Function &F : M) {
+    if (!HWASan.sanitizeFunction(F, FAM))
+      Unsanitize.emplace_back(&F);
+  }
+
+  for (Function *F : Unsanitize) {
+    HWASan.unsanitizeFunction(*F);
+  }
 
   PreservedAnalyses PA = PreservedAnalyses::none();
   // DominatorTreeAnalysis, PostDominatorTreeAnalysis, and LoopAnalysis
@@ -591,47 +609,81 @@ void HWAddressSanitizer::createHwasanCtorComdat() {
   appendToCompilerUsed(M, Dummy);
 }
 
-/// Module-level initialization.
-///
-/// inserts a call to __hwasan_init to the module's constructor list.
-void HWAddressSanitizer::initializeModule() {
-  LLVM_DEBUG(dbgs() << "Init " << M.getName() << "\n");
-  TargetTriple = Triple(M.getTargetTriple());
-
-  for (auto &F : M.functions()) {
-    // Remove memory attributes that are invalid with HWASan.
-    // HWASan checks read from shadow, which invalidates memory(argmem: *)
-    // Short granule checks on function arguments read from the argument memory
-    // (last byte of the granule), which invalidates writeonly.
-    //
-    // This is not only true for sanitized functions, because AttrInfer can
-    // infer those attributes on libc functions, which is not true if those
-    // are instrumented (Android) or intercepted.
-
-    // The API is weird. `onlyReadsMemory` actually means "does not write", and
-    // `onlyWritesMemory` actually means "does not read". So we reconstruct
-    // "accesses memory" && "does not read" <=> "writes".
-    bool Changed = false;
-    if (!F.doesNotAccessMemory()) {
-      bool WritesMemory = !F.onlyReadsMemory();
-      bool ReadsMemory = !F.onlyWritesMemory();
-      if ((WritesMemory && !ReadsMemory) || F.onlyAccessesArgMemory()) {
-        F.removeFnAttr(Attribute::Memory);
-        Changed = true;
-      }
+void HWAddressSanitizer::removeFnAttributes(Function *F) {
+  if (!F || ChangedFns.contains(F))
+    return;
+  // Remove memory attributes that are invalid with HWASan.
+  // HWASan checks read from shadow, which invalidates memory(argmem: *)
+  // Short granule checks on function arguments read from the argument memory
+  // (last byte of the granule), which invalidates writeonly.
+  //
+  // This is not only true for sanitized functions, because AttrInfer can
+  // infer those attributes on libc functions, which is not true if those
+  // are instrumented (Android) or intercepted.
+  //
+  // We might want to model HWASan shadow memory more opaquely to get rid of
+  // this problem altogether, by hiding the shadow memory write in an
+  // intrinsic, essentially like in the AArch64StackTagging pass. But that's
+  // for another day.
+
+  // The API is weird. `onlyReadsMemory` actually means "does not write", and
+  // `onlyWritesMemory` actually means "does not read". So we reconstruct
+  // "accesses memory" && "does not read" <=> "writes".
+  bool Changed = false;
+  if (!F->doesNotAccessMemory()) {
+    bool WritesMemory = !F->onlyReadsMemory();
+    bool ReadsMemory = !F->onlyWritesMemory();
+    if ((WritesMemory && !ReadsMemory) || F->onlyAccessesArgMemory()) {
+      ChangedFns[F].OriginalME = F->getMemoryEffects();
+      F->removeFnAttr(Attribute::Memory);
+      Changed = true;
     }
-    for (Argument &A : F.args()) {
+  }
+  if (UseShortGranules) {
+    for (Argument &A : F->args()) {
       if (A.hasAttribute(Attribute::WriteOnly)) {
-        Changed = true;
         A.removeAttr(Attribute::WriteOnly);
+        ChangedFns[F].RemovedArgAttrs.emplace_back(A.getArgNo());
+        Changed = true;
       }
     }
-    if (Changed) {
-      // nobuiltin makes sure later passes don't restore assumptions about
-      // the function.
-      F.addFnAttr(Attribute::NoBuiltin);
-    }
   }
+  if (Changed) {
+    // nobuiltin makes sure later passes don't restore assumptions about
+    // the function.
+    F->addFnAttr(Attribute::NoBuiltin);
+  }
+}
+
+void HWAddressSanitizer::restoreFnAttributes(CallInst *CI) {
+  auto It = ChangedFns.find(CI->getCalledFunction());
+  if (It == ChangedFns.end())
+    return;
+  // The memory effects are stored as a bitmap where a bit e.g. means
+  // "may write memory". CI->getMemoryEffects is the AND (i.e. intersection)
+  // of the bits of the CallInst and the underlying function. So, we removed
+  // the original ME from the function, which means the bitmap there all 1.
+  // Now we are adding the original mask back to the CallInst, which means
+  // CI->getMemoryEffects is now OriginalME & all 1 = OriginalME.
+  //
+  // Restoring the memory access is slightly dubious, because the called
+  // function will still read the memory. But because the function does not
+  // operate on the shadow memory, there should not be any difficulties with
+  // removing the shadow write, or reordering it.
+  ChangedFn &CFn = It->second;
+  if (CFn.OriginalME.has_value())
+    CI->setMemoryEffects(*CFn.OriginalME);
+  for (size_t I : CFn.RemovedArgAttrs)
+    CI->addAttributeAtIndex(I + AttributeList::FirstArgIndex,
+                            Attribute::WriteOnly);
+}
+
+/// Module-level initialization.
+///
+/// inserts a call to __hwasan_init to the module's constructor list.
+void HWAddressSanitizer::initializeModule() {
+  LLVM_DEBUG(dbgs() << "Init " << M.getName() << "\n");
+  TargetTriple = Triple(M.getTargetTriple());
 
   // x86_64 currently has two modes:
   // - Intel LAM (default)
@@ -895,6 +947,7 @@ void HWAddressSanitizer::getInterestingMemoryOperands(
       Interesting.emplace_back(I, ArgNo, false, Ty, Align(1));
     }
     maybeMarkSanitizerLibraryCallNoBuiltin(CI, &TLI);
+    removeFnAttributes(CI->getCalledFunction());
   }
 }
 
@@ -1593,16 +1646,23 @@ bool HWAddressSanitizer::selectiveInstrumentationShouldSkip(
   return Skip;
 }
 
-void HWAddressSanitizer::sanitizeFunction(Function &F,
+void HWAddressSanitizer::unsanitizeFunction(Function &F) {
+  for (auto &Inst : instructions(F)) {
+    if (CallInst *CI = dyn_cast<CallInst>(&Inst))
+      restoreFnAttributes(CI);
+  }
+}
+
+bool HWAddressSanitizer::sanitizeFunction(Function &F,
                                           FunctionAnalysisManager &FAM) {
   if (&F == HwasanCtorFunction)
-    return;
+    return false;
 
   if (!F.hasFnAttribute(Attribute::SanitizeHWAddress))
-    return;
+    return false;
 
   if (F.empty())
-    return;
+    return false;
 
   NumTotalFuncs++;
 
@@ -1610,7 +1670,9 @@ void HWAddressSanitizer::sanitizeFunction(Function &F,
       FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);
 
   if (selectiveInstrumentationShouldSkip(F, FAM))
-    return;
+    return false;
+
+  removeFnAttributes(&F);
 
   NumInstrumentedFuncs++;
 
@@ -1653,7 +1715,7 @@ void HWAddressSanitizer::sanitizeFunction(Function &F,
 
   if (SInfo.AllocasToInstrument.empty() && OperandsToInstrument.empty() &&
       IntrinToInstrument.empty())
-    return;
+    return false;
 
   assert(!ShadowBase);
 
@@ -1702,6 +1764,7 @@ void HWAddressSanitizer::sanitizeFunction(Function &F,
   ShadowBase = nullptr;
   StackBaseTag = nullptr;
   CachedFP = nullptr;
+  return true;
 }
 
 void HWAddressSanitizer::instrumentGlobal(GlobalVariable *GV, uint8_t Tag) {
diff --git a/llvm/test/Instrumentation/HWAddressSanitizer/mem-attr-declr.ll b/llvm/test/Instrumentation/HWAddressSanitizer/mem-attr-declr.ll
new file mode 100644
index 00000000000000..d8965730b1e0b3
--- /dev/null
+++ b/llvm/test/Instrumentation/HWAddressSanitizer/mem-attr-declr.ll
@@ -0,0 +1,82 @@
+; Test that HWASan remove writeonly and memory(*) attributes from instrumented functions.
+; RUN: opt -S -passes=hwasan %s | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+target triple = "aarch64-unknown-linux-android30"
+
+; CHECK: declare dso_local void @test_argmemwriteonly(ptr nocapture noundef) local_unnamed_addr #0
+declare dso_local void @test_argmemwriteonly(ptr nocapture noundef writeonly) local_unnamed_addr #0
+
+; CHECK: declare dso_local void @test_argmemwriteonly2(ptr nocapture noundef) local_unnamed_addr #0
+declare dso_local void @test_argmemwriteonly2(ptr nocapture noundef) local_unnamed_addr #0
+
+; CHECK: declare dso_local void @test_writeonly(ptr nocapture noundef) local_unnamed_addr #0
+declare dso_local void @test_writeonly(ptr nocapture noundef) local_unnamed_addr #1
+
+; CHECK: declare dso_local void @test_writeonly2(ptr nocapture noundef) local_unnamed_addr #0
+declare dso_local void @test_writeonly2(ptr nocapture noundef writeonly) local_unnamed_addr #1
+
+; CHECK: declare dso_local void @test_argmemreadonly(ptr nocapture noundef readonly) local_unnamed_addr #0
+declare dso_local void @test_argmemreadonly(ptr nocapture noundef readonly) local_unnamed_addr #2
+
+; CHECK: declare dso_local void @test_argmemreadonly2(ptr nocapture noundef) local_unnamed_addr #0
+declare dso_local void @test_argmemreadonly2(ptr nocapture noundef) local_unnamed_addr #2
+
+; CHECK: declare dso_local void @test_readonly(ptr nocapture noundef) local_unnamed_addr #1
+declare dso_local void @test_readonly(ptr nocapture noundef) local_unnamed_addr #3
+
+; CHECK: declare dso_local void @test_readonly2(ptr nocapture noundef readonly) local_unnamed_addr #1
+declare dso_local void @test_readonly2(ptr nocapture noundef readonly) local_unnamed_addr #3
+
+; CHECK: declare dso_local void @test_writeonly_notcalled_from_sanitized(ptr nocapture noundef readonly) local_unnamed_addr #2
+declare dso_local void @test_writeonly_notcalled_from_sanitized(ptr nocapture noundef readonly) local_unnamed_addr #1
+
+define dso_local void @test_sanitized(ptr %p) sanitize_hwaddress {
+entry:
+    call void @test_argmemwriteonly(ptr %p)
+    call void @test_argmemwriteonly2(ptr %p)
+    call void @test_writeonly(ptr %p)
+    call void @test_writeonly2(ptr %p)
+    call void @test_argmemreadonly(ptr %p)
+    call void @test_argmemreadonly2(ptr %p)
+    call void @test_readonly(ptr %p)
+    call void @test_readonly2(ptr %p)
+    ret void
+}
+
+define dso_local void @test_not_sanitized(ptr %p) {
+entry:
+; CHECK: call void @test_argmemwriteonly(ptr writeonly %p) #5
+    call void @test_argmemwriteonly(ptr %p)
+; CHECK: call void @test_argmemwriteonly2(ptr %p) #5
+    call void @test_argmemwriteonly2(ptr %p)
+; CHECK: call void @test_writeonly(ptr %p) #6
+    call void @test_writeonly(ptr %p)
+; CHECK: call void @test_writeonly2(ptr writeonly %p) #6
+    call void @test_writeonly2(ptr %p)
+; CHECK: call void @test_argmemreadonly(ptr %p) #7
+    call void @test_argmemreadonly(ptr %p)
+; CHECK: call void @test_argmemreadonly2(ptr %p) #7
+    call void @test_argmemreadonly2(ptr %p)
+; CHECK: call void @test_readonly(ptr %p)
+    call void @test_readonly(ptr %p)
+; CHECK: call void @test_readonly2(ptr %p)
+    call void @test_readonly2(ptr %p)
+; CHECK: call void @test_writeonly_notcalled_from_sanitized(ptr %p)
+    call void @test_writeonly_notcalled_from_sanitized(ptr %p)
+    ret void
+}
+
+; CHECK: attributes #0 = { nobuiltin uwtable }
+; CHECK: attributes #1 = { memory(read) uwtable }
+; CHECK: attributes #2 = { memory(write) uwtable }
+; CHECK: attributes #3 = { sanitize_hwaddress }
+; CHECK: attributes #4 = { nounwind }
+; CHECK: attributes #5 = { memory(argmem: write) }
+; CHECK: attributes #6 = { memory(write) }
+; CHECK: attributes #7 = { memory(argmem: read) }
+
+attributes #0 = { memory(argmem: write) uwtable }
+attributes #1 = { memory(write) uwtable }
+attributes #2 = { memory(argmem: read) uwtable }
+attributes #3 = { memory(read) uwtable }

Created using spr 1.3.4
Created using spr 1.3.4
@vitalybuka
Copy link
Collaborator

Can we put real part of the patch on hold to the moment when we get back to actively working on SelHWASAN?
It would be nice to see a difference.

fmayer added a commit to fmayer/llvm-project that referenced this pull request Sep 13, 2024
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants