Revert #25754 (improvements to localAccess() calls) #25936
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
While #25754 did a nice job of improving the error messages for local accesses, its implementation was sufficiently heavyweight that it resulted in new timeouts in testing for tests that made a lot of use of localAccesses, primarily noticed in configurations that were already slow (valgrindexe, memleaks, baseline).
On one hand, I didn't fully appreciate how much overhead the new implementation added; but more importantly, I incorrectly thought it would only come into play rarely—for codes that explicitly used localAccess() (with checks on), which I think of as being an uncommon occurrence. What I forgot is that our auto-local-access (ALA) optimization would also cause it to fire for many codes that don't contain explicit localAccess calls. As a particularly bad example, Engin found that test/studies/hpcc/PTRANS/old/PTRANS.chpl went from 35 to 62 seconds on his laptop despite not containing any localAccess calls.
We could just turn off the new checks by default for this release, but it felt simpler/saner to me to just revert the PR for now given that it's code-freeze day and that this wasn't a release-critical change.
I think the way to improve this overhead going forward is probably to change the logic from the current approach of:
to:
Even in that case, we could improve on the current logic by doing more specialization for the (common) "locale only owns a single sublocale" case than I did in the current approach. However, I think it's arguably fine to spend more time gathering the subdomains when we're about to print an error and exit the program anyway.