Add support to use jemalloc as memory allocator for supported platforms #9496
Comments
It sounds like you're proposing to include jemalloc (native code) in this repo (OpenSearch core) and build and distribute it as part of the -min distribution, because it will significantly improve the performance of any plugin that uses native code. The obvious advantage of doing that is that users don't need to build anything to enable jemalloc: an easy performance win. Because jemalloc is not known to have (any?) downsides, this could even be the default. The alternative is to either consume a platform-specific jemalloc package, or ship a package with builds for multiple platforms in the distribution with plugins, similar to how we build/package native code as part of k-NN (e.g. FAISS). We've had some debates about native code in core in #9422, so I would take a look. I'm curious what @nknize and @andrross think about this one?
The debate about native code in #9422 was really about interfacing with native code from Java in OpenSearch core. There are portability concerns, as well as operational concerns about integrating with a library that uses memory outside the control of the JVM. Neither of these applies in the same way to loading jemalloc via LD_PRELOAD.
Another thought: if we are going to recommend using jemalloc, we should be running performance tests with it. @VijayanB I would start there; I opened opensearch-project/opensearch-benchmark#370.
One of the interesting features jemalloc comes with is native memory profiling (in addition to the JVM's native memory tracking). For reference: https://poonamparhar.github.io/troubleshooting_native_memory_leaks/
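For reference, here is a sketch of how that profiling can be enabled. It assumes a jemalloc build with profiling compiled in (--enable-prof) and uses illustrative paths; the actual library location varies by platform.

```sh
# Hypothetical library path; adjust to wherever libjemalloc.so is installed.
export LD_PRELOAD=/usr/local/lib/libjemalloc.so
# prof:true enables heap profiling; lg_prof_interval:30 dumps a profile
# roughly every 2^30 bytes of allocation; prof_prefix sets the dump location.
export MALLOC_CONF="prof:true,lg_prof_interval:30,prof_prefix:/tmp/jeprof"

./bin/opensearch

# Analyze a dump with the jeprof tool that ships with jemalloc.
jeprof --show_bytes "$(command -v java)" /tmp/jeprof.*.heap
```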
From my experience, jemalloc has worked very well. From what I've seen, it greatly reduces fragmentation compared to glibc malloc on Linux (I ran some experiments a long time ago here: opensearch-project/k-NN#72 (comment)). Additionally, I have seen very few negative reviews of it. @andrross's concerns make sense. I have seen some comments that it may have platform compatibility issues, which would be difficult to maintain. I believe Rust removed jemalloc as its default mainly for this reason: rust-lang/rust#36963. I am wondering if we could recommend it as an optimization as opposed to bundling it.
I think some benchmarks + a blog post would make for good reading. |
Hi all, are there any documentation requirements coming for 2.12? Thanks! |
Is your feature request related to a problem? Please describe.
Since the OpenSearch k-NN plugin's native engine libraries are written in C++, they can suffer from both internal and external fragmentation under the default memory allocator (glibc malloc). By adding support for using jemalloc on supported platforms (preferably by bundling jemalloc with OpenSearch, similar to the JDK), k-NN query performance can be improved.
Multiple experiments have shown that the memory allocator can affect the query performance of database engines. jemalloc is a general-purpose malloc implementation that emphasizes fragmentation avoidance and scalable concurrency support. It is also used as the default memory allocator by applications such as Redis, Cassandra, and Firefox, and at Facebook.
Describe the solution you'd like
On supported platforms, jemalloc can be bundled with OpenSearch and selected as the memory allocator at runtime by setting LD_PRELOAD. This doesn't require any code changes in the OpenSearch k-NN plugin.
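As a rough sketch of what the bundling approach could look like (the lib/ path and platform layout below are hypothetical, not an existing distribution layout), the startup script would only need to export one variable before launching the JVM:

```sh
# Hypothetical bundle location inside the OpenSearch distribution.
JEMALLOC_LIB="$OPENSEARCH_HOME/lib/platform/linux-x64/libjemalloc.so"

# Preload jemalloc only on platforms where the bundled library exists;
# elsewhere the process falls back to the system allocator.
if [ -f "$JEMALLOC_LIB" ]; then
  export LD_PRELOAD="$JEMALLOC_LIB"
fi

exec ./bin/opensearch "$@"
```

Because LD_PRELOAD interposes malloc/free at the dynamic-linker level, both the JVM and the k-NN plugin's C++ libraries pick up jemalloc transparently.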
Describe alternatives you've considered
N/A
Additional context
The OpenSearch k-NN team conducted experiments and verified that using jemalloc as the memory allocator reduces memory fragmentation and improves query performance. Please see more details here.
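For anyone reproducing such results, one quick sanity check (a suggested setup, not part of the original experiments) is to confirm that jemalloc is actually loaded and to let it print allocator statistics on exit:

```sh
# Start OpenSearch with jemalloc preloaded; stats_print:true makes jemalloc
# dump statistics when the process exits (allocated vs. resident bytes is a
# useful fragmentation signal). The library path is illustrative.
LD_PRELOAD=/usr/local/lib/libjemalloc.so \
MALLOC_CONF="stats_print:true" \
./bin/opensearch &

# Confirm the preload took effect by checking the process's memory maps.
grep jemalloc "/proc/$!/maps"
```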