-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TreeInterpreter
creates reference cycle, causing GC pressure
#291
Comments
@mrry we at JMESPath Community are currently working on improving the standard and extending the feature set from JMESPath. We have a set of useful improvements as our next milestone pretty much nailed down. Our next step is to implement those specifications in the Python library. As a matter of fact, I'm currently contributing | to | this repository. Once done, we will either have those PRs accepted here or, most likely, we will implement them in our own fork. Then we will proceed and submit PRs for inclusion in the AWS CLI and Azure CLI repositories. That's why I would be very interested in a fix as you mention here. |
Confirmed. In addition to the GC pressure, it's also the source of a large
Each option has its own tradeoffs to consider:
Other thing to consider was that At any rate, should be able to address this, just need to work through the |
We recently noticed that a heavy JMESpath workload was triggering a large number of garbage collection runs. We are using
jmespath.compile()
, and we tracked this down to thejmespath.visitor.TreeInterpreter
that is created on every call to `ParsedResult.search():jmespath.py/jmespath/parser.py
Line 508 in bbe7300
It appears that
TreeInterpreter
creates a reference cycle, which leads to the GC being triggered frequently to clean up the cycles. As far as I can tell, the problem comes from theVisitor._method_cache
:jmespath.py/jmespath/visitor.py
Lines 91 to 93 in bbe7300
...which store references to methods that are bound to
self
in a member ofself
.Possible solution
We worked around the problem by monkey patching
ParsedResult
so that it (1) caches adefault_interpreter
for use whenoptions=None
, and (2) uses it insearch()
. If I understand correctly, we could go further and use a globalTreeInterpreter
for allParsedResult
instances. TheTreeInterpreter
seems to be stateless apart fromself._method_cache
and that implementation seems to be thread-safe (with only the risk of multiple lookups for the same method in a multithreaded case).I'd be happy to contribute a PR for either version if this would be welcome.
How to reproduce
The following reproducer shows the problem:
...where the output contains one million repetitions of something like:
The text was updated successfully, but these errors were encountered: