Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb][DWARFASTParserClang] Prevent unnamed bitfield creation in the presence of overlapping fields #108343

Merged
merged 6 commits into from
Sep 12, 2024

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Sep 12, 2024

This bug surfaced after #105865 (currently reverted, but blocked on this to be relanded).

Because Clang doesn't emit DW_TAG_members for unnamed bitfields, LLDB has to make an educated guess about whether they existed in the source. It does so by checking whether there is a gap between where the last field ended and the currently parsed field starts. In the example test case, the empty field padding was folded into the storage of data. Because the bit_offset of padding is 0x0 and its DW_AT_byte_size is 0x1, LLDB thinks the field ends at 0x1 (not quite because we first round the size to a word size, but this is an implementation detail), erroneously deducing that there's a gap between flag and padding.

This patch adds the notion of "effective field end", which accounts for fields that share storage. It is set to the end of the storage that the two fields occupy. Then we use this to check for gaps in the unnamed bitfield creation logic.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 12, 2024

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

This bug surfaced after #105865 (currently reverted, but blocked on this to be relanded).

Because Clang doesn't emit DW_TAG_members for unnamed bitfields, LLDB has to make an educated guess about whether they existed in the source. It does so by checking whether there is a gap between where the last field began and the currently parsed field starts. In the example test case, the empty field padding was folded into the storage of data. Because the bit_offset of padding is 0x0 and its DW_AT_byte_size is 0x1, LLDB thinks the field ends at 0x1 (not quite because we first round the size to a word size, but this is an implementation detail), erroneously deducing that there's a gap between flag and padding.

This patch adds the notion of "effective field end", which accounts for fields that share storage. Its value is the end of the storage that the two fields occupy. Then we use this to check for gaps in the unnamed bitfield creation logic.


Full diff: https://github.com/llvm/llvm-project/pull/108343.diff

3 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+10-4)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h (+31)
  • (modified) lldb/test/Shell/SymbolFile/DWARF/no_unique_address-with-bitfields.cpp (-6)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 5b9de6f1566b05..88511e036fd8c2 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2932,14 +2932,20 @@ void DWARFASTParserClang::ParseSingleMember(
     last_field_info = this_field_info;
     last_field_info.SetIsBitfield(true);
   } else {
-    last_field_info.bit_offset = field_bit_offset;
+    FieldInfo this_field_info{.is_bitfield = false};
+    this_field_info.bit_offset = field_bit_offset;
 
+    // TODO: we shouldn't silently ignore the bit_size if we fail
+    //       to GetByteSize.
     if (std::optional<uint64_t> clang_type_size =
             member_type->GetByteSize(nullptr)) {
-      last_field_info.bit_size = *clang_type_size * character_width;
+      this_field_info.bit_size = *clang_type_size * character_width;
     }
 
-    last_field_info.SetIsBitfield(false);
+    if (this_field_info.GetFieldEnd() < last_field_info.GetFieldEnd())
+      this_field_info.SetEffectiveFieldEnd(last_field_info.GetFieldEnd());
+
+    last_field_info = this_field_info;
   }
 
   // Don't turn artificial members such as vtable pointers into real FieldDecls
@@ -3738,7 +3744,7 @@ void DWARFASTParserClang::AddUnnamedBitfieldToRecordTypeIfNeeded(
     const FieldInfo &current_field) {
   // TODO: get this value from target
   const uint64_t word_width = 32;
-  uint64_t last_field_end = previous_field.bit_offset + previous_field.bit_size;
+  uint64_t last_field_end = previous_field.GetEffectiveFieldEnd();
 
   if (!previous_field.IsBitfield()) {
     // The last field was not a bit-field...
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
index 3809ee912cec6f..4ff464ce26c548 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
@@ -258,9 +258,27 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser {
 
 private:
   struct FieldInfo {
+    ///< Size in bits that this field occopies. Can but
+    ///< need not be the DW_AT_bit_size of the field.
     uint64_t bit_size = 0;
+
+    ///< Offset of this field in bits from the beginning
+    ///< of the containing struct. Can but need not
+    ///< be the DW_AT_data_bit_offset of the field.
     uint64_t bit_offset = 0;
+
+    ///< In case this field is folded into the storage
+    ///< of a previous member's storage (for example
+    ///< with [[no_unique_address]]), the effective field
+    ///< end is the offset in bits from the beginning of
+    ///< the containing struct where the field we were
+    ///< folded into ended.
+    std::optional<uint64_t> effective_field_end;
+
+    ///< Set to 'true' if this field is a bit-field.
     bool is_bitfield = false;
+
+    ///< Set to 'true' if this field is DW_AT_artificial.
     bool is_artificial = false;
 
     FieldInfo() = default;
@@ -276,6 +294,19 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser {
       // bit offset than any previous bitfield + size.
       return (bit_size + bit_offset) <= next_bit_offset;
     }
+
+    /// Returns the offset in bits of where the storage this field
+    /// occupies ends.
+    uint64_t GetFieldEnd() const { return bit_size + bit_offset; }
+
+    void SetEffectiveFieldEnd(uint64_t val) { effective_field_end = val; }
+
+    /// If this field was folded into storage of a previous field,
+    /// returns the offset in bits of where that storage ends. Otherwise,
+    /// returns the regular field end (see \ref GetFieldEnd).
+    uint64_t GetEffectiveFieldEnd() const {
+      return effective_field_end.value_or(GetFieldEnd());
+    }
   };
 
   /// Parsed form of all attributes that are relevant for parsing type members.
diff --git a/lldb/test/Shell/SymbolFile/DWARF/no_unique_address-with-bitfields.cpp b/lldb/test/Shell/SymbolFile/DWARF/no_unique_address-with-bitfields.cpp
index 1c9cc36a711b47..b1672a384c9fe4 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/no_unique_address-with-bitfields.cpp
+++ b/lldb/test/Shell/SymbolFile/DWARF/no_unique_address-with-bitfields.cpp
@@ -1,7 +1,3 @@
-// LLDB currently erroneously adds an unnamed bitfield
-// into the AST when an overlapping no_unique_address
-// field precedes a bitfield.
-
 // RUN: %clang --target=x86_64-apple-macosx -c -gdwarf -o %t %s
 // RUN: %lldb %t \
 // RUN:   -o "target var global" \
@@ -12,8 +8,6 @@
 // CHECK:      CXXRecordDecl {{.*}} struct Foo definition
 // CHECK:      |-FieldDecl {{.*}} data 'char[5]'
 // CHECK-NEXT: |-FieldDecl {{.*}} padding 'Empty'
-// CHECK-NEXT: |-FieldDecl {{.*}} 'int'
-// CHECK-NEXT: | `-IntegerLiteral {{.*}} 'int' 8
 // CHECK-NEXT: `-FieldDecl {{.*}} sloc> flag 'unsigned long'
 // CHECK-NEXT:   `-IntegerLiteral {{.*}} 'int' 1
 

@Michael137
Copy link
Member Author

This logic is getting very finicky, so I wouldn't be surprised if there are cases not covered by this. Will have a think

Copy link

github-actions bot commented Sep 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -258,9 +258,27 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser {

private:
struct FieldInfo {
///< Size in bits that this field occupies. Can but
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// comment
decl d;

or

decl d; ///<comment 

Copy link
Collaborator

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we're in heuristic territory I think we can only get a hold of this problem with lots of tests.

@Michael137
Copy link
Member Author

Once we're in heuristic territory I think we can only get a hold of this problem with lots of tests.

Added more test in the latest iteration (the ones marked FIXME were still misbehaving on top-of-tree, although in other ways...but we still asserted in the expression evaluator, so don't think we really regress with this patch).

There are some cases where it's pretty much impossible to differentiate between tail padding and unnamed bitfields. So as long as we don't emit those bitfields into DWARF I think we have to resort to this.

@adrian-prantl
Copy link
Collaborator

Why don't we just make the unnamed fields explicit in DWARF and solve the problem the correct way?
(We'll still needs this hack to support other/older compilers though)

@Michael137 Michael137 merged commit a6a547f into llvm:main Sep 12, 2024
7 checks passed
@pranavk
Copy link
Contributor

pranavk commented Sep 12, 2024

Hmm, this seems to be breaking LLVM build:

error: initialization of non-aggregate type 'FieldInfo' with a designated initializer list
 |     FieldInfo this_field_info{.is_bitfield = false};

@kazutakahirata
Copy link
Contributor

Hmm, this seems to be breaking LLVM build:

error: initialization of non-aggregate type 'FieldInfo' with a designated initializer list
 |     FieldInfo this_field_info{.is_bitfield = false};

FWIW, with clang-16.0.6 as the host compiler, I get:

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2935:31: error: designated initializers are a C++20 extension [-Werror,-Wc++20-designator]
    FieldInfo this_field_info{.is_bitfield = false};
                              ^
1 error generated.

@kazutakahirata
Copy link
Contributor

I've fixed the warning with 5d17293.

@Michael137
Copy link
Member Author

I've fixed the warning with 5d17293.

Thanks!

Michael137 added a commit that referenced this pull request Oct 3, 2024
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants