[scudo] Move the chunk update into functions #83493
Conversation
@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (ChiaHungDuan)

Changes: Memory tagging is a flag which requires a system reboot to enable. This change explicitly splits the memory-tagging and non-memory-tagging paths into different code blocks.

Full diff: https://github.com/llvm/llvm-project/pull/83493.diff

2 Files Affected:
diff --git a/compiler-rt/lib/scudo/standalone/allocator_common.h b/compiler-rt/lib/scudo/standalone/allocator_common.h
index 4d39956bf90503..2b77516ad11cae 100644
--- a/compiler-rt/lib/scudo/standalone/allocator_common.h
+++ b/compiler-rt/lib/scudo/standalone/allocator_common.h
@@ -53,7 +53,7 @@ template <class SizeClassAllocator> struct TransferBatch {
void moveNToArray(CompactPtrT *Array, u16 N) {
DCHECK_LE(N, Count);
memcpy(Array, Batch + Count - N, sizeof(Batch[0]) * N);
- Count -= N;
+ Count = static_cast<u16>(Count - N);
}
u16 getCount() const { return Count; }
bool isEmpty() const { return Count == 0U; }
diff --git a/compiler-rt/lib/scudo/standalone/combined.h b/compiler-rt/lib/scudo/standalone/combined.h
index fa6077384d9826..6d5997525063f1 100644
--- a/compiler-rt/lib/scudo/standalone/combined.h
+++ b/compiler-rt/lib/scudo/standalone/combined.h
@@ -401,133 +401,18 @@ class Allocator {
reportOutOfMemory(NeededSize);
}
- const uptr BlockUptr = reinterpret_cast<uptr>(Block);
- const uptr UnalignedUserPtr = BlockUptr + Chunk::getHeaderSize();
- const uptr UserPtr = roundUp(UnalignedUserPtr, Alignment);
-
- void *Ptr = reinterpret_cast<void *>(UserPtr);
- void *TaggedPtr = Ptr;
- if (LIKELY(ClassId)) {
- // We only need to zero or tag the contents for Primary backed
- // allocations. We only set tags for primary allocations in order to avoid
- // faulting potentially large numbers of pages for large secondary
- // allocations. We assume that guard pages are enough to protect these
- // allocations.
- //
- // FIXME: When the kernel provides a way to set the background tag of a
- // mapping, we should be able to tag secondary allocations as well.
- //
- // When memory tagging is enabled, zeroing the contents is done as part of
- // setting the tag.
- if (UNLIKELY(useMemoryTagging<Config>(Options))) {
- uptr PrevUserPtr;
- Chunk::UnpackedHeader Header;
- const uptr BlockSize = PrimaryT::getSizeByClassId(ClassId);
- const uptr BlockEnd = BlockUptr + BlockSize;
- // If possible, try to reuse the UAF tag that was set by deallocate().
- // For simplicity, only reuse tags if we have the same start address as
- // the previous allocation. This handles the majority of cases since
- // most allocations will not be more aligned than the minimum alignment.
- //
- // We need to handle situations involving reclaimed chunks, and retag
- // the reclaimed portions if necessary. In the case where the chunk is
- // fully reclaimed, the chunk's header will be zero, which will trigger
- // the code path for new mappings and invalid chunks that prepares the
- // chunk from scratch. There are three possibilities for partial
- // reclaiming:
- //
- // (1) Header was reclaimed, data was partially reclaimed.
- // (2) Header was not reclaimed, all data was reclaimed (e.g. because
- // data started on a page boundary).
- // (3) Header was not reclaimed, data was partially reclaimed.
- //
- // Case (1) will be handled in the same way as for full reclaiming,
- // since the header will be zero.
- //
- // We can detect case (2) by loading the tag from the start
- // of the chunk. If it is zero, it means that either all data was
- // reclaimed (since we never use zero as the chunk tag), or that the
- // previous allocation was of size zero. Either way, we need to prepare
- // a new chunk from scratch.
- //
- // We can detect case (3) by moving to the next page (if covered by the
- // chunk) and loading the tag of its first granule. If it is zero, it
- // means that all following pages may need to be retagged. On the other
- // hand, if it is nonzero, we can assume that all following pages are
- // still tagged, according to the logic that if any of the pages
- // following the next page were reclaimed, the next page would have been
- // reclaimed as well.
- uptr TaggedUserPtr;
- if (getChunkFromBlock(BlockUptr, &PrevUserPtr, &Header) &&
- PrevUserPtr == UserPtr &&
- (TaggedUserPtr = loadTag(UserPtr)) != UserPtr) {
- uptr PrevEnd = TaggedUserPtr + Header.SizeOrUnusedBytes;
- const uptr NextPage = roundUp(TaggedUserPtr, getPageSizeCached());
- if (NextPage < PrevEnd && loadTag(NextPage) != NextPage)
- PrevEnd = NextPage;
- TaggedPtr = reinterpret_cast<void *>(TaggedUserPtr);
- resizeTaggedChunk(PrevEnd, TaggedUserPtr + Size, Size, BlockEnd);
- if (UNLIKELY(FillContents != NoFill && !Header.OriginOrWasZeroed)) {
- // If an allocation needs to be zeroed (i.e. calloc) we can normally
- // avoid zeroing the memory now since we can rely on memory having
- // been zeroed on free, as this is normally done while setting the
- // UAF tag. But if tagging was disabled per-thread when the memory
- // was freed, it would not have been retagged and thus zeroed, and
- // therefore it needs to be zeroed now.
- memset(TaggedPtr, 0,
- Min(Size, roundUp(PrevEnd - TaggedUserPtr,
- archMemoryTagGranuleSize())));
- } else if (Size) {
- // Clear any stack metadata that may have previously been stored in
- // the chunk data.
- memset(TaggedPtr, 0, archMemoryTagGranuleSize());
- }
- } else {
- const uptr OddEvenMask =
- computeOddEvenMaskForPointerMaybe(Options, BlockUptr, ClassId);
- TaggedPtr = prepareTaggedChunk(Ptr, Size, OddEvenMask, BlockEnd);
- }
- storePrimaryAllocationStackMaybe(Options, Ptr);
- } else {
- Block = addHeaderTag(Block);
- Ptr = addHeaderTag(Ptr);
- if (UNLIKELY(FillContents != NoFill)) {
- // This condition is not necessarily unlikely, but since memset is
- // costly, we might as well mark it as such.
- memset(Block, FillContents == ZeroFill ? 0 : PatternFillByte,
- PrimaryT::getSizeByClassId(ClassId));
- }
- }
+ const uptr UserPtr = roundUp(
+ reinterpret_cast<uptr>(Block) + Chunk::getHeaderSize(), Alignment);
+ const uptr SizeOrUnusedBytes =
+ ClassId ? Size : SecondaryBlockEnd - (UserPtr + Size);
+
+ if (LIKELY(!useMemoryTagging<Config>(Options))) {
+ return initChunk(ClassId, Origin, Block, UserPtr, SizeOrUnusedBytes,
+ FillContents);
} else {
- Block = addHeaderTag(Block);
- Ptr = addHeaderTag(Ptr);
- if (UNLIKELY(useMemoryTagging<Config>(Options))) {
- storeTags(reinterpret_cast<uptr>(Block), reinterpret_cast<uptr>(Ptr));
- storeSecondaryAllocationStackMaybe(Options, Ptr, Size);
- }
- }
-
- Chunk::UnpackedHeader Header = {};
- if (UNLIKELY(UnalignedUserPtr != UserPtr)) {
- const uptr Offset = UserPtr - UnalignedUserPtr;
- DCHECK_GE(Offset, 2 * sizeof(u32));
- // The BlockMarker has no security purpose, but is specifically meant for
- // the chunk iteration function that can be used in debugging situations.
- // It is the only situation where we have to locate the start of a chunk
- // based on its block address.
- reinterpret_cast<u32 *>(Block)[0] = BlockMarker;
- reinterpret_cast<u32 *>(Block)[1] = static_cast<u32>(Offset);
- Header.Offset = (Offset >> MinAlignmentLog) & Chunk::OffsetMask;
+ return initChunkWithMemoryTagging(ClassId, Origin, Block, UserPtr, Size,
+ SizeOrUnusedBytes, FillContents);
}
- Header.ClassId = ClassId & Chunk::ClassIdMask;
- Header.State = Chunk::State::Allocated;
- Header.OriginOrWasZeroed = Origin & Chunk::OriginMask;
- Header.SizeOrUnusedBytes =
- (ClassId ? Size : SecondaryBlockEnd - (UserPtr + Size)) &
- Chunk::SizeOrUnusedBytesMask;
- Chunk::storeHeader(Cookie, Ptr, &Header);
-
- return TaggedPtr;
}
NOINLINE void deallocate(void *Ptr, Chunk::Origin Origin, uptr DeleteSize = 0,
@@ -1143,6 +1028,171 @@ class Allocator {
reinterpret_cast<uptr>(Ptr) - SizeOrUnusedBytes;
}
+ ALWAYS_INLINE void *initChunk(const uptr ClassId, const Chunk::Origin Origin,
+ void *Block, const uptr UserPtr,
+ const uptr SizeOrUnusedBytes,
+ const FillContentsMode FillContents) {
+ Block = addHeaderTag(Block);
+ // Only do content fill when it's from primary allocator because secondary
+ // allocator has filled the content.
+ if (ClassId != 0 && UNLIKELY(FillContents != NoFill)) {
+ // This condition is not necessarily unlikely, but since memset is
+ // costly, we might as well mark it as such.
+ memset(Block, FillContents == ZeroFill ? 0 : PatternFillByte,
+ PrimaryT::getSizeByClassId(ClassId));
+ }
+
+ Chunk::UnpackedHeader Header = {};
+
+ const uptr DefaultAlignedPtr =
+ reinterpret_cast<uptr>(Block) + Chunk::getHeaderSize();
+ if (UNLIKELY(DefaultAlignedPtr != UserPtr)) {
+ const uptr Offset = UserPtr - DefaultAlignedPtr;
+ DCHECK_GE(Offset, 2 * sizeof(u32));
+ // The BlockMarker has no security purpose, but is specifically meant for
+ // the chunk iteration function that can be used in debugging situations.
+ // It is the only situation where we have to locate the start of a chunk
+ // based on its block address.
+ reinterpret_cast<u32 *>(Block)[0] = BlockMarker;
+ reinterpret_cast<u32 *>(Block)[1] = static_cast<u32>(Offset);
+ Header.Offset = (Offset >> MinAlignmentLog) & Chunk::OffsetMask;
+ }
+
+ Header.ClassId = ClassId & Chunk::ClassIdMask;
+ Header.State = Chunk::State::Allocated;
+ Header.OriginOrWasZeroed = Origin & Chunk::OriginMask;
+ Header.SizeOrUnusedBytes = SizeOrUnusedBytes & Chunk::SizeOrUnusedBytesMask;
+ Chunk::storeHeader(Cookie, reinterpret_cast<void *>(addHeaderTag(UserPtr)),
+ &Header);
+
+ return reinterpret_cast<void *>(UserPtr);
+ }
+
+ NOINLINE void *
+ initChunkWithMemoryTagging(const uptr ClassId, const Chunk::Origin Origin,
+ void *Block, const uptr UserPtr, const uptr Size,
+ const uptr SizeOrUnusedBytes,
+ const FillContentsMode FillContents) {
+ const Options Options = Primary.Options.load();
+ DCHECK(useMemoryTagging<Config>(Options));
+
+ void *Ptr = reinterpret_cast<void *>(UserPtr);
+ void *TaggedPtr = Ptr;
+
+ if (LIKELY(ClassId)) {
+ // We only need to zero or tag the contents for Primary backed
+ // allocations. We only set tags for primary allocations in order to avoid
+ // faulting potentially large numbers of pages for large secondary
+ // allocations. We assume that guard pages are enough to protect these
+ // allocations.
+ //
+ // FIXME: When the kernel provides a way to set the background tag of a
+ // mapping, we should be able to tag secondary allocations as well.
+ //
+ // When memory tagging is enabled, zeroing the contents is done as part of
+ // setting the tag.
+
+ uptr PrevUserPtr;
+ Chunk::UnpackedHeader Header;
+ const uptr BlockSize = PrimaryT::getSizeByClassId(ClassId);
+ const uptr BlockUptr = reinterpret_cast<uptr>(Block);
+ const uptr BlockEnd = BlockUptr + BlockSize;
+ // If possible, try to reuse the UAF tag that was set by deallocate().
+ // For simplicity, only reuse tags if we have the same start address as
+ // the previous allocation. This handles the majority of cases since
+ // most allocations will not be more aligned than the minimum alignment.
+ //
+ // We need to handle situations involving reclaimed chunks, and retag
+ // the reclaimed portions if necessary. In the case where the chunk is
+ // fully reclaimed, the chunk's header will be zero, which will trigger
+ // the code path for new mappings and invalid chunks that prepares the
+ // chunk from scratch. There are three possibilities for partial
+ // reclaiming:
+ //
+ // (1) Header was reclaimed, data was partially reclaimed.
+ // (2) Header was not reclaimed, all data was reclaimed (e.g. because
+ // data started on a page boundary).
+ // (3) Header was not reclaimed, data was partially reclaimed.
+ //
+ // Case (1) will be handled in the same way as for full reclaiming,
+ // since the header will be zero.
+ //
+ // We can detect case (2) by loading the tag from the start
+ // of the chunk. If it is zero, it means that either all data was
+ // reclaimed (since we never use zero as the chunk tag), or that the
+ // previous allocation was of size zero. Either way, we need to prepare
+ // a new chunk from scratch.
+ //
+ // We can detect case (3) by moving to the next page (if covered by the
+ // chunk) and loading the tag of its first granule. If it is zero, it
+ // means that all following pages may need to be retagged. On the other
+ // hand, if it is nonzero, we can assume that all following pages are
+ // still tagged, according to the logic that if any of the pages
+ // following the next page were reclaimed, the next page would have been
+ // reclaimed as well.
+ uptr TaggedUserPtr;
+ if (getChunkFromBlock(BlockUptr, &PrevUserPtr, &Header) &&
+ PrevUserPtr == UserPtr &&
+ (TaggedUserPtr = loadTag(UserPtr)) != UserPtr) {
+ uptr PrevEnd = TaggedUserPtr + Header.SizeOrUnusedBytes;
+ const uptr NextPage = roundUp(TaggedUserPtr, getPageSizeCached());
+ if (NextPage < PrevEnd && loadTag(NextPage) != NextPage)
+ PrevEnd = NextPage;
+ TaggedPtr = reinterpret_cast<void *>(TaggedUserPtr);
+ resizeTaggedChunk(PrevEnd, TaggedUserPtr + Size, Size, BlockEnd);
+ if (UNLIKELY(FillContents != NoFill && !Header.OriginOrWasZeroed)) {
+ // If an allocation needs to be zeroed (i.e. calloc) we can normally
+ // avoid zeroing the memory now since we can rely on memory having
+ // been zeroed on free, as this is normally done while setting the
+ // UAF tag. But if tagging was disabled per-thread when the memory
+ // was freed, it would not have been retagged and thus zeroed, and
+ // therefore it needs to be zeroed now.
+ memset(TaggedPtr, 0,
+ Min(Size, roundUp(PrevEnd - TaggedUserPtr,
+ archMemoryTagGranuleSize())));
+ } else if (Size) {
+ // Clear any stack metadata that may have previously been stored in
+ // the chunk data.
+ memset(TaggedPtr, 0, archMemoryTagGranuleSize());
+ }
+ } else {
+ const uptr OddEvenMask =
+ computeOddEvenMaskForPointerMaybe(Options, BlockUptr, ClassId);
+ TaggedPtr = prepareTaggedChunk(Ptr, Size, OddEvenMask, BlockEnd);
+ }
+ storePrimaryAllocationStackMaybe(Options, Ptr);
+ } else {
+ Block = addHeaderTag(Block);
+ Ptr = addHeaderTag(Ptr);
+ storeTags(reinterpret_cast<uptr>(Block), reinterpret_cast<uptr>(Ptr));
+ storeSecondaryAllocationStackMaybe(Options, Ptr, Size);
+ }
+
+ Chunk::UnpackedHeader Header = {};
+
+ const uptr DefaultAlignedPtr =
+ reinterpret_cast<uptr>(Block) + Chunk::getHeaderSize();
+ if (UNLIKELY(DefaultAlignedPtr != UserPtr)) {
+ const uptr Offset = UserPtr - DefaultAlignedPtr;
+ DCHECK_GE(Offset, 2 * sizeof(u32));
+ // The BlockMarker has no security purpose, but is specifically meant for
+ // the chunk iteration function that can be used in debugging situations.
+ // It is the only situation where we have to locate the start of a chunk
+ // based on its block address.
+ reinterpret_cast<u32 *>(Block)[0] = BlockMarker;
+ reinterpret_cast<u32 *>(Block)[1] = static_cast<u32>(Offset);
+ Header.Offset = (Offset >> MinAlignmentLog) & Chunk::OffsetMask;
+ }
+
+ Header.ClassId = ClassId & Chunk::ClassIdMask;
+ Header.State = Chunk::State::Allocated;
+ Header.OriginOrWasZeroed = Origin & Chunk::OriginMask;
+ Header.SizeOrUnusedBytes = SizeOrUnusedBytes & Chunk::SizeOrUnusedBytesMask;
+ Chunk::storeHeader(Cookie, Ptr, &Header);
+
+ return TaggedPtr;
+ }
+
void quarantineOrDeallocateChunk(const Options &Options, void *TaggedPtr,
Chunk::UnpackedHeader *Header,
uptr Size) NO_THREAD_SAFETY_ANALYSIS {
This continues the work in https://reviews.llvm.org/D159392
e3178e9 to 8f44836 (Compare)
I measured the cache miss rate using simpleperf while running malloc-rss-benchmark 20 times each (removing some noise which may impact the results). I don't see a significant difference (0.5934682 vs 0.5927589, less than 0.5%). As mentioned in the comment, this change helps with readability at both the source and asm levels, so I think it's still worth doing the cleanup. BTW, there were two changes and I put them in the same pull request to simplify the process.
8f44836 to d59370d (Compare)
// When memory tagging is enabled, zeroing the contents is done as part of
// setting the tag.

uptr PrevUserPtr;
Super small nit, but it would probably be best if this was down with the TaggedUserPtr declaration.
Nice catch!
storeSecondaryAllocationStackMaybe(Options, Ptr, Size);
}

Chunk::UnpackedHeader Header = {}; |
All of the header storage code looks very similar in both paths. Should this be made into a separate function? It might be that it's not as similar as I think though.
I agree that they look similar, but I may have to review them carefully (the differences between w/ and w/o memory tagging are usually nuanced). Thus I may prefer doing this in a separate CL.
Alternatively, move the code for writing the header back into allocate.
The problem is:
On the non-MTE path, UserPtr is used by the header update and is also the returned pointer.
On the MTE path, the header update requires UserPtr, but TaggedPtr is what gets returned.
Currently, I'm not sure if I can use one to infer the value of the other (and I prefer not to return two variables or something like that). As a result, I would suggest that I do another refactor in a follow-up and see if we can simplify the logic there, and only move the code structure in this CL (keeping logical changes to a minimum).
I agree, do this in a follow-up.
void *BlockBegin;
if (UNLIKELY(useMemoryTagging<Config>(Options))) { |
Would it make sense to reverse the if and else clauses and check LIKELY(!useMemoryTagging...)?
It's a bit easier to read if the likely choice is first.
Done. This also matches the ordering used in other places.
const u8 PrevTag = extractTag(reinterpret_cast<uptr>(TaggedPtr));
storeDeallocationStackMaybe(Options, Ptr, PrevTag, Size);
if (Header->ClassId) {
if (!TSDRegistry.getDisableMemInit()) { |
I know that the original code does this, but it would be better to have a single if instead of the two ifs since neither if has an else.
Done
storeTags(reinterpret_cast<uptr>(Block), reinterpret_cast<uptr>(Ptr));
storeSecondaryAllocationStackMaybe(Options, Ptr, Size);
}
return initChunkWithMemoryTagging(ClassId, Origin, Block, UserPtr, Size, |
Dangling else statement after return. Remove the else { block?
Done
void *TaggedPtr = Ptr;

if (LIKELY(ClassId)) {
// We only need to zero or tag the contents for Primary backed |
While we're refactoring, is it possible to pull this out into a helper function, so that this reads something like:
if (primary) {
initPrimaryChunkWithTagging();
} else {
initSecondaryChunkWithTagging(); // It's 5 lines, so maybe leave it just as those 5 lines of code
}
// code to store header
}
I agree it's a good idea to move the primary chunk initialization into a function. However, given the number of referenced variables (it's 10), instead of passing all of them (even if we can mark the function as always-inline), I'm thinking that maybe we also want to review and simplify the logic as well.
I added the comments to highlight what each block is doing. I'm planning to do further refactoring in a follow-up CL; what do you think?
Feel free to refactor later, yeah. It's just a huge function right now that I've always had a hard time reading, as it spans about three monitor pages.
FWIW, I've never understood this idea that code is more readable if split into multiple functions that are only called once. In my experience it generally makes the code less readable and less maintainable.
- If I am reading the function, I'm reading it because I need to know what it does. That means I need to read the function and its callees. Having to go-to-definition in an IDE, read the callee, go back, etc just adds mental overhead and requires readers to use IDEs. That's much more annoying than the function being three pages long. At least with the three page function, the three pages are in one place. With separate functions it's 3.1 pages long due to the function signatures and scattered in various places in the file.
- If I am modifying the function, the multiple functions just get in the way. For example if I now need some information in the callee, now I need to add arguments to the caller and callee for basically no reason other than the function having been arbitrarily split into multiple functions.
While reviewing this cl, trying to work through a three page long function has been painful. Breaking it up into smaller, easier to understand pieces really helps understand what's going on, because I only need to understand what each function does. For example, with a large function, variable names tend to be reused, so it can get confusing trying to figure out where the variable was set and what it means at page one versus page two or page three. You can fix this by never re-using the variable names, but at that point you are emulating breaking the function into smaller functions and might as well go all the way and refactor.
The advice to break functions is almost universal, but I have seen it go too far the other way. Adding a function that has two lines in it and is only used once, that doesn't make sense and should be folded into the caller. I don't think this change is doing that.
Now, if you don't think that this refactor is breaking up the function into easy to understand pieces, that would be a reason to keep modifying this. For example, if you don't think the comments make sense, or that the function partition is right, please mention it.
Otherwise, having had to read this very long function a number of times, I can tell you this new version makes it much easier to understand. Even trying to figure out the MTE enabled code path made it extra difficult to figure out what was happening. Having to go back and forth looking at the modifications done in the MTE enabled path can be confusing. So, this should really help understand the MTE and non-MTE paths alone, which makes this worth doing, in my estimation.
it can get confusing trying to figure out where the variable was set and what it means at page one versus page two or page three
Yeah, this is one of the things that I would need an IDE/clangd for (find variable definition/references), but the thing is that I would need an IDE for this regardless of whether the function was broken up. And for me, the split into multiple functions would make the situation worse because I need to manually track the variable across function calls (find the matching argument in the caller/callee) as well, since an IDE typically will only find definition/references within the local function. I think the fact that there's a lot of code to handle allocations is just an intrinsic property of the code needing to do a lot of stuff and it's not clear to me that moving it around like this really helps.
You can fix this by never re-using the variable names
I would personally be happier with a change like this. The code would still be friendly to IDEs and people who want to read the whole function and it would make it more obvious where the value for each variable comes from.
The advice to break functions is almost universal
Fair enough, I'm not necessarily claiming that my experience is universal, it could just be the way my brain works. If breaking functions really does make the code easiest to read for everyone else, I wouldn't object to it and I'll just put up with it being harder to read for me.
Now, if you don't think that this refactor is breaking up the function into easy to understand pieces, that would be a reason to keep modifying this. For example, if you don't think the comments make sense, or that the function partition is right, please mention it.
Yeah, my other comments are what I have to offer on that front. But since the question of how to structure code to make it easy to read is subjective and my personal preference would be to either not make this change at all or to make a different change like not re-using variables it's difficult to offer more feedback than that.
I still think this is a bad idea. People are going to update one copy of the code and forget to update the other one, which will cause all kinds of problems.
I'm not sure if the title and the description are misleading, making you think we just created two copies with a bunch of copy-pasted code. I can update them if that's the problem. So far, this CL only splits the code paths that have independent logic (w/ and w/o memory tagging); other parts (like allocating blocks from Primary or Secondary) still share the same code. It's not creating two copies. In addition, if someone changes the layout of the header and doesn't run the MTE build, they may still forget to update the MTE path, right? Then we should increase the test coverage. I would think that unit tests should help us catch that regardless of whether the MTE path is textually inline or not.
Yeah, it looked like you were copy-pasting more code than you actually were. But it looks like this change is mixed in with a number of other changes that appear to have been intended to improve performance. Since your experiments showed that this change is performance neutral, I don't think we should make those changes. Please also update the description to not talk about performance.
void quarantineOrDeallocateChunk(const Options &Options, void *TaggedPtr,
void *HeaderTaggedPtr, |
No need to pass this in.
Done.
&TaggedEnd);
} | ||
} | ||
void *BlockBegin; |
void *BlockBegin = getBlockBegin(untagPointer(Ptr), Header);
and then you can get rid of some complexity below.
I assume you are saying we can remove this code:

if (BypassQuarantine && allocatorSupportsMemoryTagging<Config>())
  Ptr = untagPointer(Ptr);
BlockBegin = getBlockBegin(Ptr, Header);

Now if BypassQuarantine == false then we don't call untagPointer() on Ptr, so I think it will change the behavior, no?
Besides, BlockBegin is overridden on the MTE path, so I think it's fine to leave it as it is. What do you think?
BlockBegin is unused on the branch of the code where BypassQuarantine is false, so it doesn't make a difference. And BlockBegin is evaluated in the same way in reTagBlock.
My question is, it seems that we have different logic here. If BypassQuarantine == true && allocatorSupportsMemoryTagging<Config>() == false, it doesn't untag the pointer (the case of BypassQuarantine == false was trying to make the same point).
Maybe it's something we can simplify, but I would like to keep it as close to the original logic as possible.
If allocatorSupportsMemoryTagging<Config>() == false then it shouldn't matter whether we untag or not. In that case, all pointers will be untagged so untagPointer will be a no-op. The reason why the old code was checking allocatorSupportsMemoryTagging<Config>() is as a minor optimization -- we don't need to emit the and instruction if we know it'll always be a no-op. But the additional logic being added here outweighs the value of this optimization.
In other words, the logic was like that before because it was the simplest way to write the code before. If you're proposing to reorganize the code into a new structure, then it should be kept simple, but in a way that fits the new structure. If the goal of this change is to make the code more readable, then it shouldn't be making the code less readable by needlessly copying logic from the old version of the code.
inline NORETURN uptr untagPointer(uptr Ptr) {
(void)Ptr;
UNREACHABLE("memory tagging not supported");
}
When `allocatorSupportsMemoryTagging<Config>() == false`, `untagPointer` leads to an abort.
Oh, right. How about we add
uptr maybeUntagPointer(uptr Ptr) {
  return allocatorSupportsMemoryTagging<Config>() ? untagPointer(Ptr) : Ptr;
}
and change the other callers of `untagPointer` to call it?
Sorry I was blocked by other stuff recently.
This is a little bit more than simply moving the code around. If we want to do that, I may suggest doing it in a follow-up so that we can catch any logic mismatch in this CL and do further cleanup later.
Okay, land this as is then.
@@ -1213,6 +1259,36 @@ class Allocator {
    }
  }

  NOINLINE void *unTagBlock(const Options &Options, void *TaggedPtr,
This doesn't "untag" the block, it retags it as part of deallocation.
renamed to reTagBlock
      storeSecondaryAllocationStackMaybe(Options, Ptr, Size);
  }

  Chunk::UnpackedHeader Header = {};
Alternatively, move the code for writing the header back into `allocate`.
The code paths for MTE enabled and disabled were interleaved, which increases the difficulty of reading each path at both the source and assembly level. In this change, we move the parts where the two paths have different logic into separate functions, along with minor refactoring of the code structure.
Force-pushed fdbaf0a to f50a4bd
Squashed and rebased
LGTM.
But you might want to wait until pcc answers the one outstanding question.
I agree do this in a follow-up.
  void *TaggedPtr = Ptr;

  if (LIKELY(ClassId)) {
    // We only need to zero or tag the contents for Primary backed
FWIW, I've never understood this idea that code is more readable if split into multiple functions that are only called once. In my experience it generally makes the code less readable and less maintainable.
- If I am reading the function, I'm reading it because I need to know what it does. That means I need to read the function and its callees. Having to go-to-definition in an IDE, read the callee, go back, etc just adds mental overhead and requires readers to use IDEs. That's much more annoying than the function being three pages long. At least with the three page function, the three pages are in one place. With separate functions it's 3.1 pages long due to the function signatures and scattered in various places in the file.
- If I am modifying the function, the multiple functions just get in the way. For example if I now need some information in the callee, now I need to add arguments to the caller and callee for basically no reason other than the function having been arbitrarily split into multiple functions.
                       Chunk::UnpackedHeader *Header, const uptr Size,
                       bool BypassQuarantine) {
  DCHECK(useMemoryTagging<AllocatorConfig>(Options));
  void *Ptr = HeaderTaggedPtr;
Get rid of this line and just call the argument `Ptr`.
Done
Also rename `reTagBlock` to `retagBlock`. It makes the style consistent with `retagPointer`.
Verified on an MTE device again and fixed a leftover pointer untagging.
llvm#83493 slightly changed the order of computation of block addresses and pointers, causing the value of DefaultAlignedPtr to include the MTE tag. Move this computation earlier so it matches the old behavior. This fixes a UBSan failure in Trusty:

secure os: UBSan: (overflow:-) external/scudo/standalone/combined.h:1070:35
secure os: Details: unsigned integer overflow: 8988807738704 - 144124176883594576 cannot be represented in type 'uptr'