Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld-macho] Ensure all sections in __TEXT get thunks if necessary #87818

Closed
wants to merge 1 commit into from

Conversation

speednoisemovement
Copy link
Contributor

This is a proposed fix for issue #52767. It has some flaws, but I wanted to put it out for discussion.

We decide whether a section has thunk callsites by comparing its size with the branch range. This can fail in the case of sections that are themselves smaller than the branch range, but need to branch to addresses in other sections that are farther than the branch range away. We ran into this in Chrome ASAN builds with the __lcxx_override section (see https://crbug.com/326898585). Putting the section closer to __text, or before __stubs doesn't help since __text is large enough on its own to require thunks for branches into it.

Probably the right way to do this is to check if the total size of text sections that can serve as branch targets is bigger than the range, but I couldn't think of a nice way to do this. This change sets a bit on the segment if one section is found to need thunks. Sections processed afterwards in the same segment will take the thunks branch regardless of their size.

This makes the assumption that the biggest section comes first (probably true almost always) and that a single section is bigger than the branch range (probably not true at the margins).

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 5, 2024

@llvm/pr-subscribers-lld-macho

@llvm/pr-subscribers-lld

Author: Leonard Grey (speednoisemovement)

Changes

This is a proposed fix for issue #52767. It has some flaws, but I wanted to put it out for discussion.

We decide whether a section has thunk callsites by comparing its size with the branch range. This can fail in the case of sections that are themselves smaller than the branch range, but need to branch to addresses in other sections that are farther than the branch range away. We ran into this in Chrome ASAN builds with the __lcxx_override section (see https://crbug.com/326898585). Putting the section closer to __text, or before __stubs doesn't help since __text is large enough on its own to require thunks for branches into it.

Probably the right way to do this is to check if the total size of text sections that can serve as branch targets is bigger than the range, but I couldn't think of a nice way to do this. This change sets a bit on the segment if one section is found to need thunks. Sections processed afterwards in the same segment will take the thunks branch regardless of their size.

This makes the assumption that the biggest section comes first (probably true almost always) and that a single section is bigger than the branch range (probably not true at the margins).


Full diff: https://github.com/llvm/llvm-project/pull/87818.diff

3 Files Affected:

  • (modified) lld/MachO/ConcatOutputSection.cpp (+12-2)
  • (modified) lld/MachO/OutputSegment.h (+1)
  • (modified) lld/test/MachO/arm64-thunks.s (+14)
diff --git a/lld/MachO/ConcatOutputSection.cpp b/lld/MachO/ConcatOutputSection.cpp
index 279423720be9d5..cd87f9cc4c3caf 100644
--- a/lld/MachO/ConcatOutputSection.cpp
+++ b/lld/MachO/ConcatOutputSection.cpp
@@ -127,10 +127,20 @@ bool TextOutputSection::needsThunks() const {
   uint64_t isecAddr = addr;
   for (ConcatInputSection *isec : inputs)
     isecAddr = alignToPowerOf2(isecAddr, isec->align) + isec->getSize();
-  if (isecAddr - addr + in.stubs->getSize() <=
-      std::min(target->backwardBranchRange, target->forwardBranchRange))
+  // Other sections besides __text might be small enough to pass this
+  // test but nevertheless need thunks for calling into oher sections.
+  // An imperfect heuristic to use in this case is that if a section
+  // we've already processed in this segment needs thunks, so do the
+  // rest.
+  bool needsThunks = parent && parent->needsThunks;
+  if (!needsThunks &&
+      isecAddr - addr + in.stubs->getSize() <=
+          std::min(target->backwardBranchRange, target->forwardBranchRange))
     return false;
   // Yes, this program is large enough to need thunks.
+  if (parent) {
+    parent->needsThunks = true;
+  }
   for (ConcatInputSection *isec : inputs) {
     for (Reloc &r : isec->relocs) {
       if (!target->hasAttr(r.type, RelocAttrBits::BRANCH))
diff --git a/lld/MachO/OutputSegment.h b/lld/MachO/OutputSegment.h
index 7a0c4a2065a15b..5f39734ffdbd7a 100644
--- a/lld/MachO/OutputSegment.h
+++ b/lld/MachO/OutputSegment.h
@@ -57,6 +57,7 @@ class OutputSegment {
   uint32_t initProt = 0;
   uint32_t flags = 0;
   uint8_t index;
+  bool needsThunks = false;
 
   llvm::TinyPtrVector<Defined *> segmentStartSymbols;
   llvm::TinyPtrVector<Defined *> segmentEndSymbols;
diff --git a/lld/test/MachO/arm64-thunks.s b/lld/test/MachO/arm64-thunks.s
index 0e4157ece72de2..ea40effd696f59 100644
--- a/lld/test/MachO/arm64-thunks.s
+++ b/lld/test/MachO/arm64-thunks.s
@@ -7,6 +7,7 @@
 ## (3) a second thunk is created when the first one goes out of range
 ## (4) early calls to a dylib stub use a thunk, and later calls the stub
 ##     directly
+## (5) Thunks are created for all sections in the text segment with branches.
 ## Notes:
 ## 0x4000000 = 64 Mi = half the magnitude of the forward-branch range
 
@@ -168,6 +169,10 @@
 
 # CHECK: [[#%x, NAN_PAGE + NAN_OFFSET]] <__stubs>:
 
+# CHECK: Disassembly of section __TEXT,__lcxx_override:
+# CHECK: <_z>:
+# CHECK:  bl 0x[[#%x, A_THUNK_0]] <_a.thunk.0>
+
 .subsections_via_symbols
 .text
 
@@ -300,3 +305,12 @@ _main:
   bl _h
   bl ___nan
   ret
+
+.section __TEXT,__lcxx_override,regular,pure_instructions
+
+.globl _z
+.no_dead_strip _z
+.p2align 2
+_z:
+  bl _a
+  ret

@speednoisemovement
Copy link
Contributor Author

So this doesn't totally do it; Chromium ASAN browser_tests still blows up on calls from new (in __lcxx_override) to __stubs. This is since we assume that stubs comes after any calls into it (

if (funcSym->isInStubs() && callVA >= stubsInRangeVA) {
).

So, contra the original description, moving the section closer to and/or before stubs does help in this specific case; it's just not sufficient to solve the entire problem.

I think we want to refactor how we do this, maybe by ensuring all __TEXT sections with jumps are packed together. It might make sense to consider the iterative algorithm ELF uses too.

@speednoisemovement
Copy link
Contributor Author

Dropping this in favor of #99052

speednoisemovement added a commit that referenced this pull request Jul 23, 2024
This supersedes #87818 and
fixes #52767

When calculating arm64 thunks, we make a few assumptions that may not
hold when considering code sections outside of `__text`:

1. That a section needs thunks only if its size is larger than the
branch range.
2. That any calls into `__stubs` are necessarily forward jumps (that is,
the section with the jump is ordered before `__stubs`)

Sections like this exist in the wild, most prominently the
`__lcxx_overrides` section introduced in
#69498

This change:
- Ensures that if one section in `__TEXT` gets thunks, all of them do.
- Makes all code sections in `__TEXT` contiguous (and guaranteed to be
placed before `__stubs`)
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
This supersedes llvm#87818 and
fixes llvm#52767

When calculating arm64 thunks, we make a few assumptions that may not
hold when considering code sections outside of `__text`:

1. That a section needs thunks only if its size is larger than the
branch range.
2. That any calls into `__stubs` are necessarily forward jumps (that is,
the section with the jump is ordered before `__stubs`)

Sections like this exist in the wild, most prominently the
`__lcxx_overrides` section introduced in
llvm#69498

This change:
- Ensures that if one section in `__TEXT` gets thunks, all of them do.
- Makes all code sections in `__TEXT` contiguous (and guaranteed to be
placed before `__stubs`)
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
This supersedes #87818 and
fixes #52767

When calculating arm64 thunks, we make a few assumptions that may not
hold when considering code sections outside of `__text`:

1. That a section needs thunks only if its size is larger than the
branch range.
2. That any calls into `__stubs` are necessarily forward jumps (that is,
the section with the jump is ordered before `__stubs`)

Sections like this exist in the wild, most prominently the
`__lcxx_overrides` section introduced in
#69498

This change:
- Ensures that if one section in `__TEXT` gets thunks, all of them do.
- Makes all code sections in `__TEXT` contiguous (and guaranteed to be
placed before `__stubs`)

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251191
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants