Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Fix a bug for PT_TLS segments getting loaded when they shouldn't. #98432

Merged
merged 2 commits into from
Jul 11, 2024

Conversation

clayborg
Copy link
Collaborator

PT_LOAD and PT_TLS segments are top level sections in the ObjectFileELF section list. The two segments can often have the same program header p_vaddr and p_paddr values and this can cause section load list issues in LLDB if we load the PT_TLS segments. What happens is the SectionLoadList::m_addr_to_sect, when a library is loaded, will first map one of the sections named "PT_LOAD[0]" with the load address that matches the p_vaddr entry from the program header. Then the "PT_TLS[0]" would come along and try to load this section at the same address. This would cause the "PT_LOAD[0]" section to be unloaded as the SectionLoadList::m_addr_to_sect would replace the value for the matching p_vaddr with the last section to be seen. The sizes of the PT_TLS and PT_LOAD that have the same p_vaddr value don't need to have the same byte size, so this could cause lookups to fail for an addresses in the "PT_LOAD[0]" section or any of its children if the offset is greater than the offset size of the PT_TLS segment. It could also cause us to incorrectly attribute addresses from the "PT_LOAD[0]" to the "PT_TLS[0]" segment when doing lookups for offset that are less than the size of the PT_TLS segment.

This fix stops us from loading PT_TLS segments in the section load lists and will prevent the bugs that resulted from this. No addresses the the DWARF refer to TLS data with a "file address" in any way. They all have TLS DWARF location expressions to locate these variables. We also don't have any support for having actual thread specific sections and having those sections resolve to something different for each thread, so there currently is no point in loading thread specific sections. Both the ObjectFileMachO and ObjectFileCOFF both ignore thread specific sections at the moment, so this brings the ObjectFileELF to parity with those plug-ins.

I added a test into an existing test to verify that things work as expected.

Prior to this fix with a real binary, the output of "target dump section-load-list" would look like this for the old LLDB:

// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x55d46ab8c510: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x55d46ab8b0c0: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x55d46ac040f0: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x55d46ab7cef0: 0xfffffffffffffff6 container        [0x0000000000003db0-0x0000000000003db4)  r--  0x00002db0 0x00000000 0x00000000 a.out.PT_TLS[0]

And this for the fixed LLDB:

// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x105f0a9a8: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x105f0adb8: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x105f0af48: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x105f0b078: 0xfffffffffffffffa container        [0x0000000000003db0-0x0000000000004028)  rw-  0x00002db0 0x00000274 0x00000000 a.out.PT_LOAD[3]

We can see that previously the "PT_LOAD[3]" segment would be removed from the section load list, and after the fix it remains and there is on PT_TLS in the loaded sections.

PT_LOAD and PT_TLS segments are top level sections in the ObjectFileELF section list. The two segments can often have the same program header p_vaddr and p_paddr values and this can cause section load list issues in LLDB if we load the PT_TLS segments. What happens is the SectionLoadList::m_addr_to_sect, when a library is loaded, will first map one of the sections named "PT_LOAD[0]" with the load address that matches the p_vaddr entry from the program header. Then the "PT_TLS[0]" would come along and try to load this section at the same address. This would cause the "PT_LOAD[0]" section to be unloaded as the SectionLoadList::m_addr_to_sect would replace the value for the matching p_vaddr with the last section to be seen. The sizes of the PT_TLS and PT_LOAD that have the same p_vaddr value don't need to have the same byte size, so this could cause lookups to fail for an addresses in the "PT_LOAD[0]" section or any of its children if the offset is greater than the offset size of the PT_TLS segment. It could also cause us to incorrectly attribute addresses from the "PT_LOAD[0]" to the "PT_TLS[0]" segment when doing lookups for offset  that are less than the size of the PT_TLS segment.

This fix stops us from loading PT_TLS segments in the section load lists and will prevent the bugs that resulted from this. No addresses the the DWARF refer to TLS data with a "file address" in any way. They all have TLS DWARF location expressions to locate these variables. We also don't have any support for having actual thread specific sections and having those sections resolve to something different for each thread, so there currently is no point in loading thread specific sections. Both the ObjectFileMachO and ObjectFileCOFF both ignore thread specific sections at the moment, so this brings the ObjectFileELF to parity with those plug-ins.

I added a test into an existing test to verify that things work as expected.

Prior to this fix with a real binary, the output of "target dump section-load-list" would look like this for the old LLDB:
```
// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x55d46ab8c510: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x55d46ab8b0c0: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x55d46ac040f0: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x55d46ab7cef0: 0xfffffffffffffff6 container        [0x0000000000003db0-0x0000000000003db4)  r--  0x00002db0 0x00000000 0x00000000 a.out.PT_TLS[0]
```
And this for the fixed LLDB:
```
// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x105f0a9a8: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x105f0adb8: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x105f0af48: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x105f0b078: 0xfffffffffffffffa container        [0x0000000000003db0-0x0000000000004028)  rw-  0x00002db0 0x00000274 0x00000000 a.out.PT_LOAD[3]
```
We can see that previously the "PT_LOAD[3]" segment would be removed from the section load list, and after the fix it remains and there is on PT_TLS in the loaded sections.
@clayborg clayborg added the lldb label Jul 11, 2024
@clayborg clayborg requested a review from labath July 11, 2024 05:01
@clayborg clayborg self-assigned this Jul 11, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 11, 2024

@llvm/pr-subscribers-lldb

Author: Greg Clayton (clayborg)

Changes

PT_LOAD and PT_TLS segments are top level sections in the ObjectFileELF section list. The two segments can often have the same program header p_vaddr and p_paddr values and this can cause section load list issues in LLDB if we load the PT_TLS segments. What happens is the SectionLoadList::m_addr_to_sect, when a library is loaded, will first map one of the sections named "PT_LOAD[0]" with the load address that matches the p_vaddr entry from the program header. Then the "PT_TLS[0]" would come along and try to load this section at the same address. This would cause the "PT_LOAD[0]" section to be unloaded as the SectionLoadList::m_addr_to_sect would replace the value for the matching p_vaddr with the last section to be seen. The sizes of the PT_TLS and PT_LOAD that have the same p_vaddr value don't need to have the same byte size, so this could cause lookups to fail for an addresses in the "PT_LOAD[0]" section or any of its children if the offset is greater than the offset size of the PT_TLS segment. It could also cause us to incorrectly attribute addresses from the "PT_LOAD[0]" to the "PT_TLS[0]" segment when doing lookups for offset that are less than the size of the PT_TLS segment.

This fix stops us from loading PT_TLS segments in the section load lists and will prevent the bugs that resulted from this. No addresses the the DWARF refer to TLS data with a "file address" in any way. They all have TLS DWARF location expressions to locate these variables. We also don't have any support for having actual thread specific sections and having those sections resolve to something different for each thread, so there currently is no point in loading thread specific sections. Both the ObjectFileMachO and ObjectFileCOFF both ignore thread specific sections at the moment, so this brings the ObjectFileELF to parity with those plug-ins.

I added a test into an existing test to verify that things work as expected.

Prior to this fix with a real binary, the output of "target dump section-load-list" would look like this for the old LLDB:

// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x55d46ab8c510: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x55d46ab8b0c0: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x55d46ac040f0: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x55d46ab7cef0: 0xfffffffffffffff6 container        [0x0000000000003db0-0x0000000000003db4)  r--  0x00002db0 0x00000000 0x00000000 a.out.PT_TLS[0]

And this for the fixed LLDB:

// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x105f0a9a8: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x105f0adb8: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x105f0af48: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x105f0b078: 0xfffffffffffffffa container        [0x0000000000003db0-0x0000000000004028)  rw-  0x00002db0 0x00000274 0x00000000 a.out.PT_LOAD[3]

We can see that previously the "PT_LOAD[3]" segment would be removed from the section load list, and after the fix it remains and there is on PT_TLS in the loaded sections.


Full diff: https://github.com/llvm/llvm-project/pull/98432.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp (+14)
  • (modified) lldb/test/Shell/ObjectFile/ELF/PT_TLS-overlap-PT_LOAD.yaml (+24-1)
diff --git a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 5c6b475044be5..51bd34e95c77d 100644
--- a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -717,6 +717,20 @@ bool ObjectFileELF::SetLoadAddress(Target &target, lldb::addr_t value,
         // Iterate through the object file sections to find all of the sections
         // that have SHF_ALLOC in their flag bits.
         SectionSP section_sp(section_list->GetSectionAtIndex(sect_idx));
+
+        // PT_TLS segments can have the same p_vaddr and p_paddr as other
+        // PT_LOAD segments so we shouldn't load them. If we do load them, then
+        // the SectionLoadList will incorrectly fill in the instance variable
+        // SectionLoadList::m_addr_to_sect with the same address as a PT_LOAD
+        // segment and we won't be able to resolve addresses in the PT_LOAD
+        // segment whose p_vaddr entry matches that of the PT_TLS. Any variables
+        // that appear in the PT_TLS segments get resolved by the DWARF
+        // expressions. If this ever changes we will need to fix all object
+        // file plug-ins, but until then, we don't want PT_TLS segments to
+        // remove the entry from SectionLoadList::m_addr_to_sect when we call
+        // SetSectionLoadAddress() below.
+        if (section_sp->IsThreadSpecific())
+          continue;
         if (section_sp->Test(SHF_ALLOC) ||
             section_sp->GetType() == eSectionTypeContainer) {
           lldb::addr_t load_addr = section_sp->GetFileAddress();
diff --git a/lldb/test/Shell/ObjectFile/ELF/PT_TLS-overlap-PT_LOAD.yaml b/lldb/test/Shell/ObjectFile/ELF/PT_TLS-overlap-PT_LOAD.yaml
index ea711ec2197fb..0b6ca3b281740 100644
--- a/lldb/test/Shell/ObjectFile/ELF/PT_TLS-overlap-PT_LOAD.yaml
+++ b/lldb/test/Shell/ObjectFile/ELF/PT_TLS-overlap-PT_LOAD.yaml
@@ -1,8 +1,25 @@
 # Overlapping PT_LOAD and PT_TLS segments should be able to exist side by side.
 
+# When an ELF file contains both PT_LOAD and PT_TLS segmentsq where the PT_TLS
+# segment has the same p_vaddr and p_paddr as a PT_LOAD segment, this
+# was causing LLDB, when loading a ELF object file at an address, to overwrite
+# the section load address for a PT_LOAD that shares the same p_vaddr value in
+# the section load list's addr to section map for this code. This test ensures
+# that no PT_TLS segments get loaded and can't interfere with real segments we
+# need to resolved as all access to thread specific memory is only done via
+# DWARF location expressions. We also don't have any code that loads any thread
+# specific segments at a different address for different threads, so there is
+# no reason currently to try and load thread specific segments.
+
 # RUN: yaml2obj %s -o %t
 # RUN: lldb-test object-file %t | FileCheck %s
-# RUN: %lldb %t -o "image lookup -a 0x1000" -b | FileCheck --check-prefix=LOOKUP %s
+# RUN: %lldb %t -b \
+# RUN:       -o "image lookup -a 0x1000" \
+# RUN:       -o "target modules load --file %t --slide 0" \
+# RUN:       -o "image lookup -a 0x1000" \
+# RUN:       -o "target dump section-load-list" \
+# RUN:       | FileCheck --check-prefix=LOOKUP %s
+
 
 # CHECK:        Index: 0
 # CHECK-NEXT:   ID: 0xffffffffffffffff
@@ -28,6 +45,12 @@
 
 # LOOKUP-LABEL: image lookup -a 0x1000
 # LOOKUP:       Address: {{.*}}.PT_LOAD[0]..data + 0)
+# LOOKUP:       target modules load
+# LOOKUP:       image lookup -a 0x1000
+# LOOKUP:       Address: {{.*}}.PT_LOAD[0]..data + 0)
+# LOOKUP:       target dump section-load-list
+# LOOKUP:       PT_TLS-overlap-PT_LOAD.yaml.tmp.PT_LOAD[0]
+# LOOKUP-NOT:   PT_TLS-overlap-PT_LOAD.yaml.tmp.PT_TLS[0]
 
 !ELF
 FileHeader:

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Coincidentally, I was just working on a bug where this was (a part of) the problem. :)

@@ -1,8 +1,25 @@
# Overlapping PT_LOAD and PT_TLS segments should be able to exist side by side.

# When an ELF file contains both PT_LOAD and PT_TLS segmentsq where the PT_TLS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"PT_TLS segments" remove the "q" typo.

@@ -1,8 +1,25 @@
# Overlapping PT_LOAD and PT_TLS segments should be able to exist side by side.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe clarify this by adding "in an object file." on the end. So it's clear that lldb is not making use of them, just able to ignore them as the comment below explains.

As it is it could be taken as "exist side by side in lldb", which is the opposite of what we're doing here.

# RUN: -o "target modules load --file %t --slide 0" \
# RUN: -o "image lookup -a 0x1000" \
# RUN: -o "target dump section-load-list" \
# RUN: | FileCheck --check-prefix=LOOKUP %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be clearer if you put this command after the CHECK lines.

@clayborg clayborg merged commit 55b1410 into llvm:main Jul 11, 2024
4 of 5 checks passed
@clayborg clayborg deleted the tls-noload branch July 11, 2024 16:18
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
…'t. (llvm#98432)

PT_LOAD and PT_TLS segments are top level sections in the ObjectFileELF
section list. The two segments can often have the same program header
p_vaddr and p_paddr values and this can cause section load list issues
in LLDB if we load the PT_TLS segments. What happens is the
SectionLoadList::m_addr_to_sect, when a library is loaded, will first
map one of the sections named "PT_LOAD[0]" with the load address that
matches the p_vaddr entry from the program header. Then the "PT_TLS[0]"
would come along and try to load this section at the same address. This
would cause the "PT_LOAD[0]" section to be unloaded as the
SectionLoadList::m_addr_to_sect would replace the value for the matching
p_vaddr with the last section to be seen. The sizes of the PT_TLS and
PT_LOAD that have the same p_vaddr value don't need to have the same
byte size, so this could cause lookups to fail for an addresses in the
"PT_LOAD[0]" section or any of its children if the offset is greater
than the offset size of the PT_TLS segment. It could also cause us to
incorrectly attribute addresses from the "PT_LOAD[0]" to the "PT_TLS[0]"
segment when doing lookups for offset that are less than the size of the
PT_TLS segment.

This fix stops us from loading PT_TLS segments in the section load lists
and will prevent the bugs that resulted from this. No addresses the the
DWARF refer to TLS data with a "file address" in any way. They all have
TLS DWARF location expressions to locate these variables. We also don't
have any support for having actual thread specific sections and having
those sections resolve to something different for each thread, so there
currently is no point in loading thread specific sections. Both the
ObjectFileMachO and ObjectFileCOFF both ignore thread specific sections
at the moment, so this brings the ObjectFileELF to parity with those
plug-ins.

I added a test into an existing test to verify that things work as
expected.

Prior to this fix with a real binary, the output of "target dump
section-load-list" would look like this for the old LLDB:
```
// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x55d46ab8c510: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x55d46ab8b0c0: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x55d46ac040f0: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x55d46ab7cef0: 0xfffffffffffffff6 container        [0x0000000000003db0-0x0000000000003db4)  r--  0x00002db0 0x00000000 0x00000000 a.out.PT_TLS[0]
```
And this for the fixed LLDB:
```
// (lldb) target dump section-load-list
// addr = 0x0000000000000000, section = 0x105f0a9a8: 0xfffffffffffffffd container        [0x0000000000000000-0x0000000000000628)  r--  0x00000000 0x00000628 0x00000000 a.out.PT_LOAD[0]
// addr = 0x0000000000001000, section = 0x105f0adb8: 0xfffffffffffffffc container        [0x0000000000001000-0x0000000000001185)  r-x  0x00001000 0x00000185 0x00000000 a.out.PT_LOAD[1]
// addr = 0x0000000000002000, section = 0x105f0af48: 0xfffffffffffffffb container        [0x0000000000002000-0x00000000000020cc)  r--  0x00002000 0x000000cc 0x00000000 a.out.PT_LOAD[2]
// addr = 0x0000000000003db0, section = 0x105f0b078: 0xfffffffffffffffa container        [0x0000000000003db0-0x0000000000004028)  rw-  0x00002db0 0x00000274 0x00000000 a.out.PT_LOAD[3]
```
We can see that previously the "PT_LOAD[3]" segment would be removed
from the section load list, and after the fix it remains and there is on
PT_TLS in the loaded sections.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants