This repository has been archived by the owner on Apr 15, 2024. It is now read-only.
Original Issue: apache#2806
BUG REPORT
Describe the bug
There are about 1M ledgers per entry log. After running for a while, an OOM occurs even though there is still plenty of free memory.
There are two OOM positions, as follows:
Position 1:
2021-09-21 02:22:08,323 [SyncThread-7-1] ERROR org.apache.bookkeeper.bookie.SyncThread - Exception in SyncThread
java.lang.OutOfMemoryError: Java heap space
at org.apache.bookkeeper.bookie.storage.ldb.WriteCache.forEach(WriteCache.java:222) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
Position 2:
2021-09-21 02:24:14,987 [SyncThread-7-1] ERROR org.apache.bookkeeper.bookie.SyncThread - Exception in SyncThread
java.lang.OutOfMemoryError: Java heap space
at org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap$Section.rehash(ConcurrentLongLongHashMap.java:673) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
at org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap$Section.addAndGet(ConcurrentLongLongHashMap.java:456) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
at org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap.addAndGet(ConcurrentLongLongHashMap.java:186) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
at org.apache.bookkeeper.bookie.EntryLogMetadata.addLedgerSize(EntryLogMetadata.java:47) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
main GC log:
2021-09-21T02:22:08.437+0800: 105453.194: [GC pause (G1 Humongous Allocation)
2021-09-21T02:22:08.449+0800: 105453.206: [Full GC (Allocation Failure) 10G->10G(20G), 1.9652874 secs]
[Eden: 0.0B(992.0M)->0.0B(1024.0M) Survivors: 32.0M->0.0B Heap: 10.3G(20.0G)->10.3G(20.0G)], [Metaspace: 35551K->35539K(1081344K)]
[Times: user=4.94 sys=0.00, real=1.96 secs]
2021-09-21T02:22:10.415+0800: 105455.172: [Full GC (Allocation Failure) 10G->10G(20G), 1.6151095 secs]
[Eden: 0.0B(1024.0M)->0.0B(1024.0M) Survivors: 0.0B->0.0B Heap: 10.3G(20.0G)->10.3G(20.0G)], [Metaspace: 35539K->35539K(1081344K)]
[Times: user=4.32 sys=0.00, real=1.62 secs]
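The "G1 Humongous Allocation" pause in the log above fits G1's documented behavior: any single allocation larger than half of G1HeapRegionSize is classed as humongous and must be placed in contiguous free regions. A small self-contained sketch of that arithmetic (general G1 behavior, not taken from the issue itself):

```java
// Sketch: would a given allocation be "humongous" under G1 with 32m regions?
// G1 treats an object larger than half of G1HeapRegionSize as humongous and
// must satisfy it from contiguous free regions. (General G1 behavior; the
// constants here mirror the -XX:G1HeapRegionSize=32m setup described above.)
public class HumongousCheck {
    static final long REGION_SIZE = 32L * 1024 * 1024; // -XX:G1HeapRegionSize=32m (the max)

    static boolean isHumongous(long allocationBytes) {
        return allocationBytes > REGION_SIZE / 2;
    }

    static long contiguousRegionsNeeded(long allocationBytes) {
        return (allocationBytes + REGION_SIZE - 1) / REGION_SIZE;
    }

    public static void main(String[] args) {
        long sortedEntries = 64L * 1024 * 1024; // the ~64MB long[] from WriteCache.forEach
        System.out.println(isHumongous(sortedEntries));             // true
        System.out.println(contiguousRegionsNeeded(sortedEntries)); // 2
    }
}
```

So a 64MB array needs two adjacent free 32m regions, which a fragmented 20G heap may not have even when total free memory is ample.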
The common feature is that both positions are allocating a humongous contiguous block of memory.
Position 1:
In WriteCache.forEach, at about 1M entries per minute, the sortedEntries size should be 1M*4*2*8 = 64MB.
Position 2:
With 1M ledgers per entry log, the table size of ConcurrentLongLongHashMap should be 1M*2*2*8 = 32MB. Sometimes an entry log holds more than 1M ledgers, so the memory can be even larger than 32MB.
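The two size estimates can be reproduced with simple arithmetic. The per-entry layouts assumed here (4 longs per entry with a 2x factor for sortedEntries; 2 longs per slot with a ~2x table factor for the hash map) are my reading of the 4.14 code, so treat them as assumptions:

```java
// Back-of-the-envelope footprint estimates for the two OOM positions.
// Layout factors are assumptions about the 4.14 code, not confirmed constants.
public class FootprintEstimate {
    // Position 1: WriteCache.forEach builds a flat long[] of sorted entries.
    // Assumed: 4 longs per entry, 2x allocation factor, 8 bytes per long.
    static long sortedEntriesBytes(long entries) {
        return entries * 4 * 2 * 8;
    }

    // Position 2: ConcurrentLongLongHashMap stores key+value as 2 longs per
    // slot; assumed table sized at ~2x the item count, 8 bytes per long.
    static long hashMapTableBytes(long entries) {
        return entries * 2 * 2 * 8;
    }

    public static void main(String[] args) {
        long entries = 1L << 20; // ~1M ledgers/entries
        System.out.println(sortedEntriesBytes(entries) >> 20); // 64 (MB)
        System.out.println(hashMapTableBytes(entries) >> 20);  // 32 (MB)
    }
}
```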
Since we use G1 and G1HeapRegionSize is 32m (the maximum value), there may be no contiguous regions left to satisfy such a humongous allocation. After pre-allocating a large buffer for sortedEntries and increasing the concurrencyLevel of the ConcurrentLongLongHashMap in EntryLogMetadata, the issue no longer appears. How about adding two configuration options for these?
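A hypothetical sketch of the mitigation idea (not the actual patch): keep each backing array below G1's humongous threshold by splitting it into fixed-size segments. This is also, roughly, what raising concurrencyLevel achieves for the hash map, since each section then owns a smaller table:

```java
// Hypothetical sketch: a segmented long array whose individual chunks stay
// below G1's humongous threshold (half of a 32m region = 16MB), so no single
// allocation requires contiguous regions. Not the actual BookKeeper change.
public class SegmentedLongArray {
    private static final int SEGMENT_LONGS = 1 << 20; // 1M longs = 8MB per segment
    private final long[][] segments;

    public SegmentedLongArray(long capacity) {
        int n = (int) ((capacity + SEGMENT_LONGS - 1) / SEGMENT_LONGS);
        segments = new long[n][];
        for (int i = 0; i < n; i++) {
            segments[i] = new long[SEGMENT_LONGS]; // each 8MB: never humongous
        }
    }

    public void set(long idx, long value) {
        segments[(int) (idx / SEGMENT_LONGS)][(int) (idx % SEGMENT_LONGS)] = value;
    }

    public long get(long idx) {
        return segments[(int) (idx / SEGMENT_LONGS)][(int) (idx % SEGMENT_LONGS)];
    }

    public static void main(String[] args) {
        SegmentedLongArray a = new SegmentedLongArray(8L << 20); // 8M longs, ~64MB total
        a.set(7_500_000L, 42L);
        System.out.println(a.get(7_500_000L)); // 42
    }
}
```

The total footprint is unchanged, but the largest single allocation drops from 64MB to 8MB, which G1 can place in any free region.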
Since the metadata of every entry log is kept in memory, the total footprint is very large: at 32MB of EntryLogMetadata per entry log, the memory would reach several GB with hundreds of entry logs. I delete ledgers by time: a ledger is removed once it expires, and an entry log is not removed until it expires. So its metadata does not need to be loaded into memory beforehand. How about adding a feature like this?