[GR-18214] Compacting garbage collection (non-default). #8870
@peter-hofer, I found the master's thesis [0] for this commit, but the measurements are quite thin. Did you run internal measurements?
[0] https://epub.jku.at/obvulihs/download/pdf/9649056?originalFilename=true
@SergejIsbrecht, yes, and the gist of it is that it is somewhat comparable to the current copying GC, sometimes better and sometimes worse, both in terms of CPU usage and memory usage. We avoided adding a forwarding pointer to every object and instead store the new object locations in the gaps between sequences of live objects, which requires extra passes over the heap and more complex lookups for determining an object's new location and updating references to it. This in turn can make full collections more expensive, which can cause the GC policy to use more memory to do fewer collections. On the other hand, you get an (almost) hard heap size limit by collecting in-place. This initial implementation also isn't heavily tuned, so I believe there is still some low-hanging fruit to harvest.
@peter-hofer Thanks for this work and the presentation last week during the native image meeting. I'm giving this a go and wondering whether the following native image build line should indicate if mark-copy or mark-compact is in use:
As it stands, there's no indication of which old gen algorithm is in use.
@galderz, thanks for giving it a try. In this branch, the |
@peter-hofer Thanks. I would also add some information to the log so that when Related to this, if you are tweaking the GC policy, the one used at runtime is only So, when tackling the old gen configuration info reporting at runtime, you could also tackle the gc policy configuration reporting. |
To be more precise about GC policy, I'm talking about these messages:
These only appear when |
@peter-hofer I ran some quick tests locally with a basic Quarkus app and didn't see any major failures with it. Throughput seems to be about the same as with the copying one, but latency at high percentiles is ~10% worse. This is understandable since, as you said, the implementation has not yet been optimized. I will be exploring additional angles in an upcoming blog post.
This PR adds a mark&compact GC for the old generation of the Serial GC. The primary intention is to reduce worst-case memory usage compared to the copying GC, which can use 2x the current heap size when all objects survive.
Typical mark&compact collectors do four passes over the heap to:
New locations are often stored in each individual object. This would significantly enlarge our current 4/8-byte object headers and add memory overhead even outside of GC. Using side tables would require this memory only during GC (and could also collect with fewer passes), but it requires allocating these tables precisely when memory can be scarce.
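For comparison, here is a minimal sketch of the classic approach with a forwarding field in every object. It is not code from this PR: the toy heap, the `Obj` class and the `collect` method are hypothetical, and the four passes (mark, compute new locations, update references, move) follow the common textbook scheme. The `forwarding` field is the per-object cost that the paragraph above refers to.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SlidingCompactionSketch {
    /** Toy heap object (hypothetical); references are stored as addresses. */
    static final class Obj {
        int address;
        final int size;
        final int[] refs;
        boolean marked;
        int forwarding = -1; // extra per-object field a real header would have to hold

        Obj(int address, int size, int... refs) {
            this.address = address;
            this.size = size;
            this.refs = refs;
        }
    }

    /** Compacts 'heap' (sorted by address) and returns the survivors; 'roots' is updated in place. */
    static List<Obj> collect(List<Obj> heap, int[] roots) {
        Map<Integer, Obj> byAddress = new HashMap<>();
        for (Obj o : heap) {
            byAddress.put(o.address, o);
        }

        // Pass 1: mark everything reachable from the roots.
        ArrayDeque<Integer> work = new ArrayDeque<>();
        for (int r : roots) {
            work.push(r);
        }
        while (!work.isEmpty()) {
            Obj o = byAddress.get(work.pop());
            if (o.marked) {
                continue;
            }
            o.marked = true;
            for (int r : o.refs) {
                work.push(r);
            }
        }

        // Pass 2: compute new locations by sliding live objects towards address 0.
        int free = 0;
        for (Obj o : heap) {
            if (o.marked) {
                o.forwarding = free;
                free += o.size;
            }
        }

        // Pass 3: update references (and roots) using the forwarding fields.
        for (Obj o : heap) {
            if (!o.marked) {
                continue;
            }
            for (int i = 0; i < o.refs.length; i++) {
                o.refs[i] = byAddress.get(o.refs[i]).forwarding;
            }
        }
        for (int i = 0; i < roots.length; i++) {
            roots[i] = byAddress.get(roots[i]).forwarding;
        }

        // Pass 4: "move" the live objects and reset the GC state.
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : heap) {
            if (o.marked) {
                o.address = o.forwarding;
                o.forwarding = -1;
                o.marked = false;
                survivors.add(o);
            }
        }
        return survivors;
    }
}
```

A side table would replace the `forwarding` field with a structure like `byAddress` that exists only during the collection, which is exactly the allocation that is problematic when memory is scarce.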
Our implementation uses the object header as it is and stores new locations of contiguous sequences of surviving objects in records in the gaps between them (made up of dead objects). In order to find an object's new location when updating a reference to it, we first identify the referenced object's aligned chunk. In the chunk's card table, we temporarily keep an index, which we use to find a record near the object. The records also form a singly-linked list, which we can follow further to find the exact record that applies to the object. With this record, we can then compute the object's new location and update references accordingly.
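The lookup described above can be illustrated with a simplified sketch. This is not the code from this PR: `Record`, `firstRecordInChunk`, `CHUNK_SIZE` and `newLocation` are hypothetical names, records are modeled as Java objects rather than being written into the gaps of dead objects, the per-chunk index is modeled as a plain array instead of reusing the card table, and runs of live objects are assumed not to cross chunk boundaries.

```java
class GapRecordLookupSketch {
    static final int CHUNK_SIZE = 64; // hypothetical aligned chunk size in bytes

    /** Describes one run of live objects: its current start and its start after compaction. */
    static final class Record {
        final int runStart;
        final int newStart;
        Record next; // records form a singly-linked list in address order

        Record(int runStart, int newStart) {
            this.runStart = runStart;
            this.newStart = newStart;
        }
    }

    /** Stand-in for the index kept temporarily in each chunk's card table. */
    final Record[] firstRecordInChunk;

    /** 'liveRuns' holds {start, length} pairs sorted by address, each within a single chunk. */
    GapRecordLookupSketch(int heapSize, int[][] liveRuns) {
        firstRecordInChunk = new Record[heapSize / CHUNK_SIZE];
        int free = 0;
        Record prev = null;
        for (int[] run : liveRuns) {
            // Runs are moved as a whole: if one does not fit the rest of the
            // destination chunk, continue at the start of the next chunk.
            int room = CHUNK_SIZE - (free % CHUNK_SIZE);
            if (run[1] > room) {
                free += room;
            }
            Record rec = new Record(run[0], free);
            free += run[1];
            if (prev != null) {
                prev.next = rec;
            }
            prev = rec;
            int chunk = run[0] / CHUNK_SIZE;
            if (firstRecordInChunk[chunk] == null) {
                firstRecordInChunk[chunk] = rec;
            }
        }
    }

    /** Returns the post-compaction address of a live object at 'address'. */
    int newLocation(int address) {
        // Step 1: the aligned chunk gives us a record near the object.
        Record rec = firstRecordInChunk[address / CHUNK_SIZE];
        // Step 2: follow the list to the last record whose run starts at or before the object.
        while (rec.next != null && rec.next.runStart <= address) {
            rec = rec.next;
        }
        // Step 3: the object keeps its offset within its run.
        return rec.newStart + (address - rec.runStart);
    }
}
```

For example, with live runs `{0, 16}`, `{32, 16}`, `{80, 32}` and `{120, 8}` in a 128-byte heap, `newLocation(40)` starts at the first record of chunk 0, advances once to the run starting at 32, and yields 24.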
When encountering a chunk with pinned objects, we sweep the gaps in it instead. Unfortunately, we cannot fill them with other surviving objects because our design requires that the order of objects stays the same (or records would be overwritten prematurely). Currently we also copy only entire object sequences and do not split them to fit smaller areas of unused memory at the end of chunks (but this does not seem to be an issue in practice). We also still copy objects which have an identity hash code that is based on their current address, because the hash code then needs to be stored in an additional field, which enlarges the object.
The performance of the mark&compact GC is currently somewhat comparable to the current copying GC, sometimes better and sometimes worse, both in terms of CPU usage and memory usage. Depending on the application's heap usage, complete collections can become more expensive, which can cause the GC policy to use more memory to need fewer collections.
The initial implementation also isn't heavily tuned, so there should still be room for improvement. For example, the object order is currently determined by the copying collector in the young generation which often places related objects apart from each other. The GC policy can likely also be tweaked for the different characteristics.
Points of interest in the implementation:
CompactingOldGen in SerialGCOptions
CompactingOldGeneration of the now abstract OldGeneration class
com.oracle.svm.core.genscavenge.compacting
ObjectHeaderImpl
SerialGCOptions.useCompactingOldGen() in several places

This work is based on a prototype by JKU student Christian Aistleitner for his master's thesis.