-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend life of variables in DiagComparison
in ExprConstant
#79522
Conversation
@llvm/pr-subscribers-clang Author: Tacet (AdvenamTacet) ChangesThis commit makes two variables static extending their life span. This patch is designed to address the issue of buildbots failing when AddressSanitizer's (ASan) short string annotations are enabled. It's esentially same as:
This patch makes Short string annotations PR (reverted): Buildbots (failure) output: While buildbots should not fail with proposed changes, we still should investigate why buildbots were failing with ASan short string annotations turned on. StringRef objects (made from those strings) can potentially change their contents unexpectedly or even (potentially) use of freed memory may happen. That interpretation is only my educated guess, I still didn't understand exactly why those buildbots are failing. Full diff: https://github.com/llvm/llvm-project/pull/79522.diff 1 Files Affected:
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index f1d07d022b25848..75cd16c0ae63d28 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -13288,9 +13288,20 @@ EvaluateComparisonBinaryOperator(EvalInfo &Info, const BinaryOperator *E,
// Reject differing bases from the normal codepath; we special-case
// comparisons to null.
if (!HasSameBase(LHSValue, RHSValue)) {
- auto DiagComparison = [&] (unsigned DiagID, bool Reversed = false) {
- std::string LHS = LHSValue.toString(Info.Ctx, E->getLHS()->getType());
- std::string RHS = RHSValue.toString(Info.Ctx, E->getRHS()->getType());
+ auto DiagComparison = [&](unsigned DiagID, bool Reversed = false) {
+ static std::string LHS, RHS;
+ // FIXME: To prevent the use of variables beyond their lifetime, we have
+ // made them static. However, this approach may not fully address the
+ // underlying issue. StringRef objects (made from those strings) can
+ // potentially change their contents unexpectedly.
+ // Or potentially use of freed memory may happen. Therefore, further
+ // investigation is required to ensure that making those variables
+ // static effectively resolves the problem.
+ // We should investigate why buildbots were failing with ASan short string
+ // annotations turned on. Related PR:
+ // https://github.com/llvm/llvm-project/pull/79049
+ LHS = LHSValue.toString(Info.Ctx, E->getLHS()->getType());
+ RHS = RHSValue.toString(Info.Ctx, E->getRHS()->getType());
Info.FFDiag(E, DiagID)
<< (Reversed ? RHS : LHS) << (Reversed ? LHS : RHS);
return false;
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
This commit makes two variables static extending their life span. This patch is designed to address the issue of buildbots failing when AddressSanitizer's (ASan) short string annotations are enabled. It's esentially same as: - llvm#79489 however, it's less likely to solve the real problem as those strings change (aren't `const`). I suspect that there may be use after end of life bug (in StringRef), but it requires confirmation. In that case, one alternative solution, which unfortunately results in memory leaks, is to always allocate new strings instead of overwriting existing (static) ones. This approach would prevent potential data corruption, but I don't suggest it in this PR. This patch makes `Clang :: SemaCXX/builtins.cpp` test pass with short string annotations (ASan). With llvm#79489 it fixes known problems with buildbots, while running with short string annotations. However, the potential issue still requires more investigation therefore FIXME comment is added in that patch. Short string annotations PR (reverted): - llvm#79049 Buildbots (failure) output: - https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio While buildbots should not fail with proposed changes, we still should investigate why buildbots were failing with ASan short string annotations turned on. StringRef objects (made from those strings) can potentially change their contents unexpectedly or even (potentially) use of freed memory may happen. That interpretation is only my educated guess, I still didn't understand exactly why those buildbots are failing.
804b9c1
to
dc25384
Compare
This commit makes two variables static. That makes two buildbot tests pass with short string annotations. I suspect that there may be use after end of life bug and it's fixed by this change, but it requires confirmation. Short string annotations PR (reverted): - #79049 Tests fixed with this PR: ``` LLVM :: Transforms/Inline/cgscc-inline-replay.ll LLVM :: Transforms/SampleProfile/inline-replay.ll ``` Buildbot output: https://lab.llvm.org/buildbot/#/builders/5/builds/40364/steps/9/logs/stdio This PR does not resolve a problem with `Clang :: SemaCXX/builtins.cpp`, related PR is: - #79522
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to understand what's going on here.
I suspect something gets tripped by the ternary.
A temporary fix might be to assign
std::string first = Reversed ? RHS : LHS;
std::string second = Reversed ? LHS : RHS;
And use that in the diag.
We should also add a std::string&&
overload to the diagnostic engine's operator <<
to avoid extra copies, although that's mostly a luxury
@cor3ntin Thx for your comment! I looked at ternary operator at the very beginning, but discarded this direction as I was unable to create a small example reproducing the error. After your comment I started looking at it again. We really should understand what is happening here. Assuming that there is no use after end of life, buffer overflows etc. by looking at shadow memory:
and backtrace:
One potential option I see are incorrect annotations. For testing, I will write code not using |
I rewrote all functions with turned off instrumentation (I removed I'm also trying to reproduce the error with stand alone code, but without luck so far. So it makes me think that the bug may be completely different and unrelated, but I don't know where to look next. |
std::string LHS = LHSValue.toString(Info.Ctx, E->getLHS()->getType()); | ||
std::string RHS = RHSValue.toString(Info.Ctx, E->getRHS()->getType()); | ||
auto DiagComparison = [&](unsigned DiagID, bool Reversed = false) { | ||
static std::string LHS, RHS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not const, how can we make them static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not const, but we update their value every time visiting the function. Assigning value is moved to the next line, which is executed every time.
It is a workaround, but I'm unsure if a good one. (However, it solved the problem with buildbots.)
Code suggested by @cor3ntin works as well for a reason I don't understand.
It's more confusing than I initially thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @cor3ntin I think we need to understand the problem better.
I think popping into a debugger would likely help here.
Agree as well, I changed PR to WIP. But I don't really know where to look now.
@shafik What is the correct way of building a single test, so we have access to a stand-alone binary? I did want to use a debugger, copy pasted many dependencies into one file, it even compiled, but it didn't reproduce the error. I'm happy to get some help here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we figured out what was the issue, this patch should not be needed
It's not needed. |
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This commit makes two variables static extending their life span. This patch is designed to address the issue of buildbots failing when AddressSanitizer's (ASan) short string annotations are enabled. It's esentially same as:
ReplayInlineAdvisor
#79489however, it's less likely to solve the real problem as those strings change (aren'tconst
). I suspect that there may be use after end of life bug (in StringRef), but it requires confirmation. In that case, one alternative solution, which unfortunately results in memory leaks, is to always allocate new strings instead of overwriting existing (static) ones. This approach would prevent potential data corruption, but I don't suggest it in this PR.This patch makes
Clang :: SemaCXX/builtins.cpp
test pass with short string annotations (ASan). With #79489 it fixes known problems with buildbots, while running with short string annotations. However, the potential issue still requires more investigation therefore FIXME comment is added in that patch.Short string annotations PR (reverted):
Buildbots (failure) output:
While buildbots should not fail with proposed changes, we still should investigate why buildbots were failing with ASan short string annotations turned on. StringRef objects (made from those strings)
can potentially change their contents unexpectedly or even (potentially) use of freed memory may happen. That interpretation is only my educated guess, I still didn't understand exactly why those buildbots are failing.Edit: I crossed parts of original description which are no longer likely based on discussion below. Reasoning in #79489 could be wrong and may be related to the same underlying issue as here.