
Cursor prop cache v2 #225

Draft: BrunoMSantos wants to merge 3 commits into master from cursor-prop-cache-v2
Conversation

BrunoMSantos (Collaborator)
Here is the new cache with the functools decorator. As expected, that pitfall was actually the feature I wanted all along: two equal `DocCursor` instances are able to reuse the same cached values, which is a win when iterating over children / parents while traversing the AST, possibly multiple times.
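For illustration, here is a minimal sketch of the mechanism, assuming a simplified stand-in for the cursor wrapper (the constructor and `get_children` are hypothetical, not the actual API):

```python
import functools


class DocCursor:
    """Simplified stand-in for the real cursor wrapper."""

    def __init__(self, cursor_hash):
        self._cursor_hash = cursor_hash

    def __hash__(self):
        return self._cursor_hash

    def __eq__(self, other):
        # Equality defined on top of the hash: distinct but equal
        # instances land on the same cache entry.
        return isinstance(other, DocCursor) and hash(self) == hash(other)

    @functools.lru_cache(maxsize=128)
    def get_children(self):
        print('expensive libclang traversal')  # runs once per *equal* cursor
        return ()


a = DocCursor(42)
b = DocCursor(42)   # a distinct instance that compares equal to a
a.get_children()    # cache miss: computes and stores
b.get_children()    # cache hit: reuses the value primed through a
```

The classic caveat that `lru_cache` on a method keeps its instances alive in the cache is exactly the pitfall-turned-feature here: equal cursors constructed at different points of the traversal resolve to the same entry.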

From limited testing and looking at the code, I have the impression this currently benefits C++ code more than C (there is a greater need to look into the parents and children of any given cursor, we constantly hit the same types, and so on). With instrumentation of the code, I saw some methods save 50+% of all calls when running the test suite, but I need to look at it more closely. This is meant only for show and tell at this point.

Sadly, this did not improve performance (again over the test suite) in any measurable way. If anything, it makes it ever so slightly worse (somewhere under 0.1 s per run on average on my machine).

Some of the cached methods were never triggered twice for the same cursor within our code, so I tried enabling the cache only for those that showed promising numbers. But there was no measurable difference there either, so this branch enables the cache for all methods that are called from more than one place in our code and / or are directly exposed through the cursor's API (with one exception, as noted in the relevant commit).

Performance-wise this doesn't worry me; I think it's worth it for future-proofing all sorts of use cases within events and so on. But I'm not sure all methods deserve an equally large maximum cache size, or what the defaults should be. I also don't know how best to expose tuning and profiling values to the user.

stephanlachnit (Contributor)
Very cool! One thing I'm wondering: if we parse the same header file twice in Sphinx (e.g. by directly referencing two different symbols from the same file), does this caching also apply? I think probably not? That would be something that could considerably speed up the build in certain scenarios.

BrunoMSantos (Collaborator, Author)
I believe it will reuse it as long as it's in the context of a single import and as long as Clang produces the cursor hashes deterministically. I believe it's a yes on the first, and I strongly suspect it's a yes on the second one too, but I didn't test it out.

Though the bigger issue with caching in that case is the caching of the clang parsing stage, which is not a fully global cache yet. You'll find some references to that if you look around in the issues / old PRs.

stephanlachnit (Contributor)
> I believe it will reuse it as long as it's in the context of a single import and as long as Clang produces the cursor hashes deterministically. I believe it's a yes on the first, and I strongly suspect it's a yes on the second one too, but I didn't test it out.

Cool. I will test it out on my code and post some results here :)

> Though the bigger issue with caching in that case is the caching of the clang parsing stage, which is not a fully global cache yet. You'll find some references to that if you look around in the issues / old PRs.

Ah, you probably mean #75?

stephanlachnit (Contributor)
Hm, doesn't seem to be faster:

Without cache:

real    0m16,047s
user    0m15,493s
sys     0m0,483s

With cache:

real    0m16,607s
user    0m15,640s
sys     0m0,878s

I'm reading 5 different files, each roughly 2 times on average.

BrunoMSantos (Collaborator, Author)
> Ah, you probably mean #75?

Yup, but there's also #157 and #171 at least ;)

> Hm, doesn't seem to be faster:

I wouldn't expect otherwise. It takes a lot of compounding to make it noticeable, and your use case seems too small. Can you artificially create a test case in which you go over the same file, say, 100 times? Otherwise you're getting the overhead and none of the benefits.

Ideally, the overhead won't matter in the small cases, but the benefits will kick in in force for the big ones, which would make this viable. And there are of course other scenarios in which it might matter: we plan on exposing the cursor to the user through extensions, for instance, allowing the user to grow the number of calls to these methods exponentially.

Some of the methods within the cursors are expensive, so caching them
can have a compounding effect on performance as they are called more
often.

Currently, this patch actually worsens performance by a negligible amount over our unit tests, but in more complicated code bases there's a higher potential for cursors to be revisited often. Even more so when we expose them to user land, over which we naturally have no control.

In order for this to work, we need to make sure that the `__eq__` method is defined on top of the existing `__hash__`. Together, these guarantee that we can construct two different `DocCursor` instances from the same arguments, prime the cache from one of them, and reuse it from the other.
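As a rough sketch of that guarantee, reusing the simplified `DocCursor` stand-in from the example above (hypothetical, not the actual patch):

```python
# Two DocCursors constructed from the same arguments...
a = DocCursor(0xC0FFEE)
b = DocCursor(0xC0FFEE)

# ...hash alike and compare equal, which is the whole contract:
assert hash(a) == hash(b)
assert a == b

a.get_children()  # primes the cache
b.get_children()  # reuses the entry primed through a
```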

The (maximum) cache size is also made configurable through an environment variable, as the most elegant way of injecting the constant into the module before it gets evaluated. This may later be exposed to the user as a configuration variable. Two functions are also created, to dump the caches' status and to wipe them respectively; these too can later be used to inform the user on how to tune the cache size.
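A minimal sketch of how such wiring could look; the environment variable name and the helper names below are illustrative assumptions, not necessarily what the patch uses:

```python
import functools
import os

# Read once at import time, before any decorated method is evaluated.
CURSOR_CACHE_SIZE = int(os.environ.get('HAWKMOTH_CURSOR_CACHE_SIZE', '128'))

_cached_methods = []


def cursor_cache(func):
    """lru_cache with the configured size; keeps track of every wrapper."""
    wrapper = functools.lru_cache(maxsize=CURSOR_CACHE_SIZE)(func)
    _cached_methods.append(wrapper)
    return wrapper


def dump_caches():
    """Print hit/miss statistics, useful for tuning the cache size."""
    for method in _cached_methods:
        print(f'{method.__qualname__}: {method.cache_info()}')


def wipe_caches():
    """Drop all cached entries, e.g. between runs."""
    for method in _cached_methods:
        method.cache_clear()
```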

Here we set up caching for all methods that are either callable through multiple paths or directly exposed through the API, with the exception of `_specifiers_fixup`. This has to do with the requirement that all the method arguments be hashable, the `DocCursor` instance itself included, since it is the first argument of every method. As it turns out, Clang's types are not always hashable, so taking a bare `Type` object as a parameter prevents caching. There are ways around it, sure, but in this case there's no significant performance impact, and it's not a path that might get exposed to the user, as it's only called through internal methods that can themselves be cached.
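A contrived sketch of that hashability limitation (the `Type` stand-in and function name are assumptions, not Clang's or the patch's actual code):

```python
import functools


class Type:
    """Stand-in for a Clang type object that does not support hashing."""
    __hash__ = None  # explicitly unhashable


@functools.lru_cache(maxsize=None)
def fixup(cursor, ty):
    return ty


try:
    fixup('some-cursor', Type())
except TypeError as err:
    # lru_cache has to hash every argument to build its key, so a single
    # unhashable parameter rules caching out for the whole method.
    print(err)  # unhashable type: 'Type'
```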
Dumping the caches' status in this way may be helpful in tuning the cache size.
I don't really know how to do this better at this point, but it allows some crude testing for now.
BrunoMSantos force-pushed the cursor-prop-cache-v2 branch 2 times, most recently from 86dcba4 to 24c4c78 on May 3, 2024.