Removes the binary reader's SymbolToken cache, which provided a minor optimization for certain cases but a major regression for others. #756
Description of changes:

The `symbolTokensById` cache was added to avoid repeatedly allocating `SymbolToken` wrappers when symbol lookup APIs that return `SymbolToken` were called. This provided a minor (<10%) speedup due to reduced allocation rate / GC churn in cases where such `SymbolToken` APIs were called frequently. However, it should be very uncommon for this to be the case, as users tend to use the equivalent APIs that return `String` instead; most do not need to deal with symbols with unknown text.

There is some usage of these APIs within the ion-java DOM, but when using the DOM, other sources of slowdown tend to dominate. There is, however, a case where this cache can significantly hurt performance within the DOM: when a small binary Ion value is loaded into the DOM from a stream that imports a large shared symbol table. In that case, the cache is allocated to fit all symbols, even the imported ones. This can cause allocation of this large list, and the resulting GC churn, to dominate.
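The tradeoff can be illustrated with a simplified, hypothetical sketch (the class and method names below are stand-ins, not ion-java's actual internals): an eagerly sized cache pays an up-front allocation proportional to the full symbol table, including imported symbols that may never be looked up, while on-demand allocation pays only for the tokens actually requested.

```java
import java.util.ArrayList;
import java.util.List;

public class SymbolTokenCacheSketch {
    // Hypothetical stand-in for a SymbolToken: a (text, symbol ID) pair.
    record SymbolToken(String text, int sid) {}

    // Eager strategy (what the removed cache did, conceptually): allocate a
    // wrapper for every symbol ID up front, even imported ones never used.
    static List<SymbolToken> buildEagerCache(String[] symbolTable) {
        List<SymbolToken> cache = new ArrayList<>(symbolTable.length);
        for (int sid = 0; sid < symbolTable.length; sid++) {
            cache.add(new SymbolToken(symbolTable[sid], sid));
        }
        return cache;
    }

    // On-demand strategy (the behavior after removal): allocate a wrapper
    // only when a SymbolToken-returning API is actually called.
    static SymbolToken lookup(String[] symbolTable, int sid) {
        return new SymbolToken(symbolTable[sid], sid);
    }

    public static void main(String[] args) {
        // Imagine a large imported shared symbol table...
        String[] table = new String[100_000];
        for (int i = 0; i < table.length; i++) table[i] = "sym" + i;

        // ...but only a handful of lookups for a small value: the eager
        // cache allocates 100,000 wrappers; on-demand allocates one.
        List<SymbolToken> eager = buildEagerCache(table);
        SymbolToken onDemand = lookup(table, 42);

        System.out.println(eager.size());    // 100000
        System.out.println(onDemand.text()); // sym42
    }
}
```

For a small value read against a large imported table, the eager list itself dominates the allocation profile, which is the regression described above.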
Because the cache only provides a minor benefit in an uncommon case, I propose simply removing it. If in the future we find the need to optimize this uncommon case, we can consider more robust options at that time.
Note: this cache was introduced in the incremental reader implementation made opt-in in v1.9.0, and was ultimately adopted by the unified implementation introduced in v1.11.0.
Performance comparisons:

To demonstrate the regression, I ran the following data through `ion-java-benchmark-cli`. In a separate PR, we will add this data and the following benchmark CLI invocation to our regression detection workflow.
- v1.10.5
- v1.11.3
- v1.11.4-SNAPSHOT (with proposed fix)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.