This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM:
- llvm#79489
- llvm#79522
Additionaly annotations were updated (but it shouldn't have any impact on anything):
- llvm#79292
Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang.
Read PRs above and below for details.
---
Previous description:
Originally merged here: llvm#75882
Reverted here: llvm#78627
Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error.
Fixes are implemented in:
- llvm#79065
- llvm#79066
Problematic code from `UniqueFunctionBase` for example:
```cpp
// In debug builds, we also scribble across the rest of the storage.
memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize);
```
---
Original description:
This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case).
Originally suggested here: https://reviews.llvm.org/D147680
String annotations added here: llvm#72677
Requires to pass CI without fails:
- llvm#75845
- llvm#75858
Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations.
Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec.
This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds.
The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability.
If you have any questions, please email:
[email protected]
[email protected]