
[scudo] Added LRU eviction policy to secondary cache. #99409

Merged
merged 1 commit into llvm:main from JoshuaMBa:LRU_algorithm_impl on Jul 23, 2024

Conversation

JoshuaMBa
Contributor

The logic for emptying the cache now follows an LRU eviction policy. When the cache is full on any given free operation, the oldest entry in the cache is evicted, and the memory associated with that entry is unmapped.

Finding an empty cache entry is now a constant-time operation, thanks to a stack of available cache entries.

Through the LRU structure, the cache retrieval algorithm now iterates only over valid entries of the cache. Furthermore, the retrieval algorithm first searches cache entries that have not been decommitted (i.e., madvise() has not been called on their corresponding memory chunks), to reduce the likelihood of returning a memory chunk that would induce a page fault when handed back to the user.
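
For readers skimming the patch: the core data structure is a fixed array of entries threaded into a doubly-linked LRU list by u16 indices, plus a singly-linked stack of free slots, so insertion, removal, and eviction are all O(1). Below is a minimal, standalone sketch of that idea under simplified assumptions; the names (`LruCache`, `kCapacity`, the `int` payload) are illustrative stand-ins, not code from this PR:

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative stand-in for a cached block; the real CachedBlock also
// carries CommitBase/CommitSize/BlockBegin/MemMap/Time fields.
struct Entry {
  int Value = 0;      // payload placeholder
  bool Valid = false;
  uint16_t Next = 0;  // doubly links the LRU list; singly links the free stack
  uint16_t Prev = 0;
};

constexpr uint16_t kCapacity = 8;          // hypothetical cache size
constexpr uint16_t kInvalid = UINT16_MAX;  // mirrors CachedBlock::InvalidEntry

struct LruCache {
  Entry Entries[kCapacity];
  uint16_t LruHead = kInvalid;  // most recently used
  uint16_t LruTail = kInvalid;  // least recently used; evicted first
  uint16_t AvailHead = 0;       // top of the stack of free slots

  LruCache() {
    // Chain every slot into the availability stack, as init() does.
    for (uint16_t I = 0; I + 1 < kCapacity; I++)
      Entries[I].Next = static_cast<uint16_t>(I + 1);
    Entries[kCapacity - 1].Next = kInvalid;
  }

  // Insert at the head of the LRU list, evicting the tail if no slot is free.
  void insert(int Value) {
    if (AvailHead == kInvalid)
      remove(LruTail);               // cache full: evict least recently used
    const uint16_t Slot = AvailHead;
    AvailHead = Entries[Slot].Next;  // pop a free slot in O(1)
    Entries[Slot] = {Value, true, LruHead, kInvalid};
    if (LruHead != kInvalid)
      Entries[LruHead].Prev = Slot;
    else
      LruTail = Slot;                // first entry is both head and tail
    LruHead = Slot;
  }

  // Unlink slot I from the LRU list and push it onto the free stack.
  void remove(uint16_t I) {
    Entries[I].Valid = false;
    if (I == LruHead)
      LruHead = Entries[I].Next;
    else
      Entries[Entries[I].Prev].Next = Entries[I].Next;
    if (I == LruTail)
      LruTail = Entries[I].Prev;
    else
      Entries[Entries[I].Next].Prev = Entries[I].Prev;
    Entries[I].Next = AvailHead;  // push freed slot in O(1)
    AvailHead = I;
  }
};

int main() {
  LruCache C;
  for (int V = 0; V < 10; V++)  // two more inserts than capacity
    C.insert(V);
  // Walk most-recent to least-recent: prints "9 8 7 6 5 4 3 2"
  // (values 0 and 1 were evicted).
  for (uint16_t I = C.LruHead; I != kInvalid; I = C.Entries[I].Next)
    std::printf("%d ", C.Entries[I].Value);
  std::printf("\n");
  return 0;
}
```

The patch applies the same idea inside MapAllocatorCache's insert() and remove() in the diff below, with the extra step of collecting the MemMaps of evicted entries so the actual unmap calls happen outside the cache mutex.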

@llvmbot
Collaborator

llvmbot commented Jul 17, 2024

@llvm/pr-subscribers-libc

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Joshua Baehring (JoshuaMBa)

Changes

The logic for emptying the cache now follows an LRU eviction policy. When the cache is full on any given free operation, the oldest entry in the cache is evicted, and the memory associated with that entry is unmapped.

Finding an empty cache entry is now a constant-time operation, thanks to a stack of available cache entries.

Through the LRU structure, the cache retrieval algorithm now iterates only over valid entries of the cache. Furthermore, the retrieval algorithm first searches cache entries that have not been decommitted (i.e., madvise() has not been called on their corresponding memory chunks), to reduce the likelihood of returning a memory chunk that would induce a page fault when handed back to the user.


Full diff: https://github.com/llvm/llvm-project/pull/99409.diff

1 file affected:

  • (modified) compiler-rt/lib/scudo/standalone/secondary.h (+119-40)
diff --git a/compiler-rt/lib/scudo/standalone/secondary.h b/compiler-rt/lib/scudo/standalone/secondary.h
index 9a8e53be388b7..b34e9379c5b3a 100644
--- a/compiler-rt/lib/scudo/standalone/secondary.h
+++ b/compiler-rt/lib/scudo/standalone/secondary.h
@@ -19,6 +19,7 @@
 #include "stats.h"
 #include "string_utils.h"
 #include "thread_annotations.h"
+#include "vector.h"
 
 namespace scudo {
 
@@ -73,12 +74,18 @@ static inline void unmap(LargeBlock::Header *H) {
 }
 
 namespace {
+
 struct CachedBlock {
+  static constexpr u16 CacheIndexMax = UINT16_MAX;
+  static constexpr u16 InvalidEntry = CacheIndexMax;
+
   uptr CommitBase = 0;
   uptr CommitSize = 0;
   uptr BlockBegin = 0;
   MemMapT MemMap = {};
   u64 Time = 0;
+  u16 Next = 0;
+  u16 Prev = 0;
 
   bool isValid() { return CommitBase != 0; }
 
@@ -188,10 +195,11 @@ template <typename Config> class MapAllocatorCache {
     Str->append("Stats: CacheRetrievalStats: SuccessRate: %u/%u "
                 "(%zu.%02zu%%)\n",
                 SuccessfulRetrieves, CallsToRetrieve, Integral, Fractional);
-    for (CachedBlock Entry : Entries) {
-      if (!Entry.isValid())
-        continue;
-      Str->append("StartBlockAddress: 0x%zx, EndBlockAddress: 0x%zx, "
+    Str->append("Cache Entry Dump (Most Recent -> Least Recent):\n");
+
+    for (u32 I = LRUHead; I != CachedBlock::InvalidEntry; I = Entries[I].Next) {
+      CachedBlock &Entry = Entries[I];
+      Str->append("  StartBlockAddress: 0x%zx, EndBlockAddress: 0x%zx, "
                   "BlockSize: %zu %s\n",
                   Entry.CommitBase, Entry.CommitBase + Entry.CommitSize,
                   Entry.CommitSize, Entry.Time == 0 ? "[R]" : "");
@@ -202,6 +210,10 @@ template <typename Config> class MapAllocatorCache {
   static_assert(Config::getDefaultMaxEntriesCount() <=
                     Config::getEntriesArraySize(),
                 "");
+  // Ensure the cache entry array size fits in the LRU list Next and Prev
+  // index fields
+  static_assert(Config::getEntriesArraySize() <= CachedBlock::CacheIndexMax,
+                "");
 
   void init(s32 ReleaseToOsInterval) NO_THREAD_SAFETY_ANALYSIS {
     DCHECK_EQ(EntriesCount, 0U);
@@ -213,18 +225,30 @@ template <typename Config> class MapAllocatorCache {
     if (Config::getDefaultReleaseToOsIntervalMs() != INT32_MIN)
       ReleaseToOsInterval = Config::getDefaultReleaseToOsIntervalMs();
     setOption(Option::ReleaseInterval, static_cast<sptr>(ReleaseToOsInterval));
+
+    // The cache is initially empty
+    LRUHead = CachedBlock::InvalidEntry;
+    LRUTail = CachedBlock::InvalidEntry;
+
+    // Available entries will be retrieved starting from the beginning of the
+    // Entries array
+    AvailableHead = 0;
+    for (u32 I = 0; I < Config::getEntriesArraySize() - 1; I++)
+      Entries[I].Next = static_cast<u16>(I + 1);
+
+    Entries[Config::getEntriesArraySize() - 1].Next = CachedBlock::InvalidEntry;
   }
 
   void store(const Options &Options, LargeBlock::Header *H) EXCLUDES(Mutex) {
     if (!canCache(H->CommitSize))
       return unmap(H);
 
-    bool EntryCached = false;
-    bool EmptyCache = false;
     const s32 Interval = atomic_load_relaxed(&ReleaseToOsIntervalMs);
     const u64 Time = getMonotonicTimeFast();
     const u32 MaxCount = atomic_load_relaxed(&MaxEntriesCount);
     CachedBlock Entry;
+    Vector<MemMapT, 1U> EvictionMemMaps;
+
     Entry.CommitBase = H->CommitBase;
     Entry.CommitSize = H->CommitSize;
     Entry.BlockBegin = reinterpret_cast<uptr>(H + 1);
@@ -254,6 +278,7 @@ template <typename Config> class MapAllocatorCache {
         // read Options and when we locked Mutex. We can't insert our entry into
         // the quarantine or the cache because the permissions would be wrong so
         // just unmap it.
+        Entry.MemMap.unmap(Entry.MemMap.getBase(), Entry.MemMap.getCapacity());
         break;
       }
       if (Config::getQuarantineSize() && useMemoryTagging<Config>(Options)) {
@@ -269,30 +294,27 @@ template <typename Config> class MapAllocatorCache {
           OldestTime = Entry.Time;
         Entry = PrevEntry;
       }
-      if (EntriesCount >= MaxCount) {
-        if (IsFullEvents++ == 4U)
-          EmptyCache = true;
-      } else {
-        for (u32 I = 0; I < MaxCount; I++) {
-          if (Entries[I].isValid())
-            continue;
-          if (I != 0)
-            Entries[I] = Entries[0];
-          Entries[0] = Entry;
-          EntriesCount++;
-          if (OldestTime == 0)
-            OldestTime = Entry.Time;
-          EntryCached = true;
-          break;
-        }
+
+      // All excess entries are evicted from the cache
+      while (EntriesCount >= MaxCount) {
+        // Save MemMaps of evicted entries to perform unmap outside of lock
+        EvictionMemMaps.push_back(Entries[LRUTail].MemMap);
+        remove(LRUTail);
       }
+
+      insert(Entry);
+
+      if (OldestTime == 0)
+        OldestTime = Entry.Time;
     } while (0);
-    if (EmptyCache)
-      empty();
-    else if (Interval >= 0)
+
+    for (MemMapT &EvictMemMap : EvictionMemMaps)
+      EvictMemMap.unmap(EvictMemMap.getBase(), EvictMemMap.getCapacity());
+
+    if (Interval >= 0) {
+      // TODO: Add ReleaseToOS logic to LRU algorithm
       releaseOlderThan(Time - static_cast<u64>(Interval) * 1000000);
-    if (!EntryCached)
-      Entry.MemMap.unmap(Entry.MemMap.getBase(), Entry.MemMap.getCapacity());
+    }
   }
 
   bool retrieve(Options Options, uptr Size, uptr Alignment, uptr HeadersSize,
@@ -312,9 +334,8 @@ template <typename Config> class MapAllocatorCache {
         return false;
       u32 OptimalFitIndex = 0;
       uptr MinDiff = UINTPTR_MAX;
-      for (u32 I = 0; I < MaxCount; I++) {
-        if (!Entries[I].isValid())
-          continue;
+      for (u32 I = LRUHead; I != CachedBlock::InvalidEntry;
+           I = Entries[I].Next) {
         const uptr CommitBase = Entries[I].CommitBase;
         const uptr CommitSize = Entries[I].CommitSize;
         const uptr AllocPos =
@@ -347,8 +368,7 @@ template <typename Config> class MapAllocatorCache {
       }
       if (Found) {
         Entry = Entries[OptimalFitIndex];
-        Entries[OptimalFitIndex].invalidate();
-        EntriesCount--;
+        remove(OptimalFitIndex);
         SuccessfulRetrieves++;
       }
     }
@@ -410,7 +430,7 @@ template <typename Config> class MapAllocatorCache {
 
   void disableMemoryTagging() EXCLUDES(Mutex) {
     ScopedLock L(Mutex);
-    for (u32 I = 0; I != Config::getQuarantineSize(); ++I) {
+    for (u32 I = 0; I != Config::getQuarantineSize(); I++) {
       if (Quarantine[I].isValid()) {
         MemMapT &MemMap = Quarantine[I].MemMap;
         MemMap.unmap(MemMap.getBase(), MemMap.getCapacity());
@@ -418,11 +438,9 @@ template <typename Config> class MapAllocatorCache {
       }
     }
     const u32 MaxCount = atomic_load_relaxed(&MaxEntriesCount);
-    for (u32 I = 0; I < MaxCount; I++) {
-      if (Entries[I].isValid()) {
-        Entries[I].MemMap.setMemoryPermission(Entries[I].CommitBase,
-                                              Entries[I].CommitSize, 0);
-      }
+    for (u32 I = LRUHead; I != CachedBlock::InvalidEntry; I = Entries[I].Next) {
+      Entries[I].MemMap.setMemoryPermission(Entries[I].CommitBase,
+                                            Entries[I].CommitSize, 0);
     }
     QuarantinePos = -1U;
   }
@@ -434,6 +452,62 @@ template <typename Config> class MapAllocatorCache {
   void unmapTestOnly() { empty(); }
 
 private:
+  void insert(const CachedBlock &Entry) REQUIRES(Mutex) {
+    DCHECK_LT(EntriesCount, atomic_load_relaxed(&MaxEntriesCount));
+
+    // Cache should be populated with valid entries when not empty
+    DCHECK_NE(AvailableHead, CachedBlock::InvalidEntry);
+
+    u32 FreeIndex = AvailableHead;
+    AvailableHead = Entries[AvailableHead].Next;
+
+    if (EntriesCount == 0) {
+      LRUTail = static_cast<u16>(FreeIndex);
+    } else {
+      // Check list order
+      if (EntriesCount > 1)
+        DCHECK_GE(Entries[LRUHead].Time, Entries[Entries[LRUHead].Next].Time);
+      Entries[LRUHead].Prev = static_cast<u16>(FreeIndex);
+    }
+
+    Entries[FreeIndex] = Entry;
+    Entries[FreeIndex].Next = LRUHead;
+    Entries[FreeIndex].Prev = CachedBlock::InvalidEntry;
+    LRUHead = static_cast<u16>(FreeIndex);
+    EntriesCount++;
+
+    // Availability stack should not have available entries when all entries
+    // are in use
+    if (EntriesCount == Config::getEntriesArraySize())
+      DCHECK(AvailableHead == CachedBlock::InvalidEntry);
+  }
+
+  void remove(uptr I) REQUIRES(Mutex) {
+    DCHECK(Entries[I].isValid());
+
+    Entries[I].invalidate();
+
+    if (I == LRUHead)
+      LRUHead = Entries[I].Next;
+    else
+      Entries[Entries[I].Prev].Next = Entries[I].Next;
+
+    if (I == LRUTail)
+      LRUTail = Entries[I].Prev;
+    else
+      Entries[Entries[I].Next].Prev = Entries[I].Prev;
+
+    Entries[I].Next = AvailableHead;
+    AvailableHead = static_cast<u16>(I);
+    EntriesCount--;
+
+    // Cache should not have valid entries when not empty
+    if (EntriesCount == 0) {
+      DCHECK(LRUHead == CachedBlock::InvalidEntry);
+      DCHECK(LRUTail == CachedBlock::InvalidEntry);
+    }
+  }
+
   void empty() {
     MemMapT MapInfo[Config::getEntriesArraySize()];
     uptr N = 0;
@@ -447,7 +521,6 @@ template <typename Config> class MapAllocatorCache {
         N++;
       }
       EntriesCount = 0;
-      IsFullEvents = 0;
     }
     for (uptr I = 0; I < N; I++) {
       MemMapT &MemMap = MapInfo[I];
@@ -484,7 +557,6 @@ template <typename Config> class MapAllocatorCache {
   atomic_u32 MaxEntriesCount = {};
   atomic_uptr MaxEntrySize = {};
   u64 OldestTime GUARDED_BY(Mutex) = 0;
-  u32 IsFullEvents GUARDED_BY(Mutex) = 0;
   atomic_s32 ReleaseToOsIntervalMs = {};
   u32 CallsToRetrieve GUARDED_BY(Mutex) = 0;
   u32 SuccessfulRetrieves GUARDED_BY(Mutex) = 0;
@@ -492,6 +564,13 @@ template <typename Config> class MapAllocatorCache {
   CachedBlock Entries[Config::getEntriesArraySize()] GUARDED_BY(Mutex) = {};
   NonZeroLengthArray<CachedBlock, Config::getQuarantineSize()>
       Quarantine GUARDED_BY(Mutex) = {};
+
+  // The LRUHead of the cache is the most recently used cache entry
+  // The LRUTail of the cache is the least recently used cache entry
+  // The AvailableHead is the top of the stack of available entries
+  u16 LRUHead GUARDED_BY(Mutex) = 0;
+  u16 LRUTail GUARDED_BY(Mutex) = 0;
+  u16 AvailableHead GUARDED_BY(Mutex) = 0;
 };
 
 template <typename Config> class MapAllocator {

@JoshuaMBa force-pushed the LRU_algorithm_impl branch 3 times, most recently from 56cd664 to 3ba1756 on July 22, 2024.
The logic for emptying the cache now follows an LRU eviction policy.
When the cache is full on any given free operation, the oldest entry
in the cache is evicted, and the memory associated with that cache
entry is unmapped.

Finding empty cache entries is now a constant operation with the use
of a stack of available cache entries.

Through the LRU structure, the cache retrieval algorithm now only iterates
through valid entries of the cache. Furthermore, the retrieval algorithm will
first search cache entries that have not been decommitted (i.e. madvise() has
not been called on their corresponding memory chunks) to reduce the likelihood of
returning a memory chunk to the user that would induce a page fault.
@cferris1000 (Contributor) left a comment:

LGTM.

@ChiaHungDuan merged commit 95ea37c into llvm:main on Jul 23, 2024
6 checks passed
thurstond added a commit to thurstond/llvm-project that referenced this pull request Jul 23, 2024
Fixes "error: unused variable 'MaxCount' [-Werror,-Wunused-variable]"
which is no longer used after llvm#99409
thurstond added a commit that referenced this pull request Jul 23, 2024
Fixes "error: unused variable 'MaxCount' [-Werror,-Wunused-variable]",
which is no longer used after
#99409
jrguzman-ms pushed a commit to msft-mirror-aosp/platform.external.scudo that referenced this pull request Jul 24, 2024
Fixes "error: unused variable 'MaxCount' [-Werror,-Wunused-variable]",
which is no longer used after
llvm/llvm-project#99409

GitOrigin-RevId: ce811fb6d94e1d4af1fd1f52fbf109bc34834970
Change-Id: I8b7cafb6d7be6f79a94e233fcefa132623dbb421
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
The logic for emptying the cache now follows an LRU eviction policy.
When the cache is full on any given free operation, the oldest entry in
the cache is evicted, and the memory associated with that cache entry is
unmapped.

Finding empty cache entries is now a constant operation with the use of
a stack of available cache entries.

Through the LRU structure, the cache retrieval algorithm now only
iterates through valid entries of the cache. Furthermore, the retrieval
algorithm will first search cache entries that have not been decommitted
(i.e. madvise() has not been called on their corresponding memory
chunks) to reduce the likelihood of returning a memory chunk to the user
that would induce a page fault.

Differential Revision: https://phabricator.intern.facebook.com/D60251270
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Fixes "error: unused variable 'MaxCount' [-Werror,-Wunused-variable]",
which is no longer used after
#99409

Differential Revision: https://phabricator.intern.facebook.com/D60251627
@JoshuaMBa JoshuaMBa deleted the LRU_algorithm_impl branch August 6, 2024 21:40