-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Symbol and page pool leak memory when used in a multi-threaded context #264
Comments
Something we should consider strongly here is eliminating the page pool entirely--and I do mean delete the code and make sure it cannot be used. The rationale is this--we keep putting mitigations into this code that was never designed to be thread-safe, the use of TLS here was mostly to work around the fact that we couldn't get at all of the places where the page pool allocates even with limit of zero. In my opinion, instead of retrofitting this code which has been proven to be very prone to bugs, we should consider a re-write in which the pool is explicit in the API (optional and default to |
The page pool allocator leaks memory when used in a threaded context (amazon-ion#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
The page pool allocator leaks memory when used in a threaded context (amazon-ion#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
The page pool allocator leaks memory when used in a threaded context (amazon-ion#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
The page pool allocator leaks memory when used in a threaded context (amazon-ion#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
The page pool allocator leaks memory when used in a threaded context (amazon-ion#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
The page pool allocator leaks memory when used in a threaded context (amazon-ion#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
The page pool allocator leaks memory when used in a threaded context (#264). While we could invest in making this thread-safe by using proper concurrency primitives, it's not clear that the page pool allocator is actually providing any benefit right now. There are no benchmarks to indicate that it is, and any sane allocator should already be doing something like this behind the scenes (in a thread-safe way to boot). This nukes the page pool code. We can perform some perf tests and benchmarks to determine if it's really worth it to reintroduce something like it in the future.
ion-c
used to allocate a global symbol table that's used to pool/cache global symbols. This was later put into thread-local storage (TLS), presumably because of thread safety concerns, and this was a cheap/minimally invasive mitigation until further work could be done to make things properly thread safe. Since the symbol pool now lives in TLS, each thread gets its own (global) symbol pool. When the symbol pool is initialized, allocations for the (growable) array backing the symbol pool are handed out from the heap (i.e. not from TLS itself - which wouldn't really make sense anyways since TLS is limited in size). Since the symbol pool is treated as a global resource, there isn't any code to free this backing heap storage - it's assumed to have the same lifetime as the program, which is a fine assumption if it truly were a global resource, but isn't correct given that it is now thread local.When a thread terminates, it's TLS gets free'd, but nothing is cleaning up the heap-allocated parts of the global symbol pool (because there isn't anything that frees the heap-allocated parts of the symbol pool - it was originally implemented as a global resource). So whenever a thread dies or is reinitialized, we effectively leak its copy of the global symbol pool.
Something similar seems to be happening for the page pool, as I have observed leaks that go away when the page pool is disabled (i.e. the default page limit is set to 0) and it looks like a pointer to the page pool is also stored in TLS. Overall, about ~1.1 MB are being leaked per thread. With lots of threads or a thread pool that recycles threads and re-initializes TLS though, this adds up pretty quickly.
The text was updated successfully, but these errors were encountered: