Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbol and page pool leak memory when used in a multi-threaded context #264

Open
sadderchris opened this issue Dec 7, 2021 · 1 comment · Fixed by #274
Open

Symbol and page pool leak memory when used in a multi-threaded context #264

sadderchris opened this issue Dec 7, 2021 · 1 comment · Fixed by #274

Comments

@sadderchris
Copy link
Contributor

ion-c used to allocate a global symbol table that's used to pool/cache global symbols. This was later put into thread-local storage (TLS), presumably because of thread safety concerns, and this was a cheap/minimally invasive mitigation until further work could be done to make things properly thread safe. Since the symbol pool now lives in TLS, each thread gets its own (global) symbol pool. When the symbol pool is initialized, allocations for the (growable) array backing the symbol pool are handed out from the heap (i.e. not from TLS itself - which wouldn't really make sense anyways since TLS is limited in size). Since the symbol pool is treated as a global resource, there isn't any code to free this backing heap storage - it's assumed to have the same lifetime as the program, which is a fine assumption if it truly were a global resource, but isn't correct given that it is now thread local.

When a thread terminates, it's TLS gets free'd, but nothing is cleaning up the heap-allocated parts of the global symbol pool (because there isn't anything that frees the heap-allocated parts of the symbol pool - it was originally implemented as a global resource). So whenever a thread dies or is reinitialized, we effectively leak its copy of the global symbol pool.

Something similar seems to be happening for the page pool, as I have observed leaks that go away when the page pool is disabled (i.e. the default page limit is set to 0) and it looks like a pointer to the page pool is also stored in TLS. Overall, about ~1.1 MB are being leaked per thread. With lots of threads or a thread pool that recycles threads and re-initializes TLS though, this adds up pretty quickly.

@almann
Copy link
Contributor

almann commented Dec 15, 2021

Something we should consider strongly here is eliminating the page pool entirely--and I do mean delete the code and make sure it cannot be used. The rationale is this--we keep putting mitigations into this code that was never designed to be thread-safe, the use of TLS here was mostly to work around the fact that we couldn't get at all of the places where the page pool allocates even with limit of zero.

In my opinion, instead of retrofitting this code which has been proven to be very prone to bugs, we should consider a re-write in which the pool is explicit in the API (optional and default to NULL meaning no pooling). By explicit, I mean that the pool is explicitly managed by the application and we are explicit about the thread-safety or non-safety of that pool. If the pool is explicit, this is good for the application because it can choose to or not use a concurrent safe version if we so desire. It also means that we don't run afoul of any interesting threading concerns that the host application has.

sadderchris added a commit to sadderchris/ion-c that referenced this issue Dec 22, 2021
The page pool allocator leaks memory when used in a threaded context
(amazon-ion#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
@almann almann reopened this Jan 19, 2022
sadderchris added a commit to sadderchris/ion-c that referenced this issue Jan 19, 2022
The page pool allocator leaks memory when used in a threaded context
(amazon-ion#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
sadderchris added a commit to sadderchris/ion-c that referenced this issue Feb 1, 2022
The page pool allocator leaks memory when used in a threaded context
(amazon-ion#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
sadderchris added a commit to sadderchris/ion-c that referenced this issue Feb 2, 2022
The page pool allocator leaks memory when used in a threaded context
(amazon-ion#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
sadderchris added a commit to sadderchris/ion-c that referenced this issue Feb 3, 2022
The page pool allocator leaks memory when used in a threaded context
(amazon-ion#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
sadderchris added a commit to sadderchris/ion-c that referenced this issue Feb 10, 2022
The page pool allocator leaks memory when used in a threaded context
(amazon-ion#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
tgregg pushed a commit that referenced this issue Feb 10, 2022
The page pool allocator leaks memory when used in a threaded context
(#264). While we could invest in making this thread-safe by using proper
concurrency primitives, it's not clear that the page pool allocator
is actually providing any benefit right now.  There are no benchmarks
to indicate that it is, and any sane allocator should already be doing
something like this behind the scenes (in a thread-safe way to boot).

This nukes the page pool code.  We can perform some perf tests and
benchmarks to determine if it's really worth it to reintroduce something
like it in the future.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants