
Cursor prop cache v2 #225

Draft: BrunoMSantos wants to merge 3 commits into master from cursor-prop-cache-v2
Conversation

BrunoMSantos (Collaborator)
Here is the new cache with the functools decorator. As expected, that pitfall was actually the feature I wanted all along: two equal `DocCursor` instances are able to reuse the same cached values, which is a win when iterating over children / parents while traversing the AST, possibly multiple times.
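For illustration, here is a minimal sketch of the mechanism, assuming a simplified stand-in for the cursor wrapper (the constructor and `get_children` are hypothetical, not the actual API):

```python
import functools


class DocCursor:
    """Simplified stand-in for the real cursor wrapper."""

    def __init__(self, cursor_hash):
        self._cursor_hash = cursor_hash

    def __hash__(self):
        return self._cursor_hash

    def __eq__(self, other):
        # Equality defined on top of the hash: distinct but equal
        # instances land on the same cache entry.
        return isinstance(other, DocCursor) and hash(self) == hash(other)

    @functools.lru_cache(maxsize=128)
    def get_children(self):
        print('expensive libclang traversal')  # runs once per *equal* cursor
        return ()


a = DocCursor(42)
b = DocCursor(42)   # a distinct instance that compares equal to a
a.get_children()    # cache miss: computes and stores
b.get_children()    # cache hit: reuses the value primed through a
```

The classic caveat that `lru_cache` on a method keeps its instances alive in the cache is exactly the pitfall-turned-feature here: equal cursors constructed at different points of the traversal resolve to the same entry.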

From limited testing and looking at the code, I have the impression this currently benefits C++ code more than C (there is a greater need to look into the parents and children of any given cursor, we constantly hit the same types, and so on). With instrumentation of the code, I saw some methods save 50+% of all calls when running the test suite, but I need to look at it more closely. This is meant only for show and tell at this point.

Sadly, this did not improve performance (again over the test suite) in any measurable way. If anything, it makes it ever so slightly worse (somewhere under 0.1 s per run on average on my machine).

Some of the cached methods were never triggered twice for the same cursor within our code, so I tried enabling the cache only for those that showed promising numbers. But there was no measurable difference there either, so this branch enables the cache for all methods that are called from more than one place in our code and / or are directly exposed through the cursor's API (with one exception, as noted in the relevant commit).

Performance-wise this doesn't worry me; I think it's worth it for future-proofing all sorts of use cases within events and so on. But I'm not sure all methods deserve an equally large maximum cache size, or what the defaults should be. I also don't know how best to expose tuning and profiling values to the user.

stephanlachnit (Contributor)
Very cool! One thing I'm wondering: if we parse the same header file twice in Sphinx (e.g. by directly referencing two different symbols from the same file), does this caching also apply? I think probably not? That would be something that could considerably speed up the build in certain scenarios.

BrunoMSantos (Collaborator, Author)
I believe it will reuse it as long as it's in the context of a single import and as long as Clang produces the cursor hashes deterministically. I believe it's a yes on the first, and I strongly suspect it's a yes on the second one too, but I didn't test it out.

Though the bigger issue with caching in that case is the caching of the clang parsing stage, which is not a fully global cache yet. You'll find some references to that if you look around in the issues / old PRs.

stephanlachnit (Contributor)
> I believe it will reuse it as long as it's in the context of a single import and as long as Clang produces the cursor hashes deterministically. I believe it's a yes on the first, and I strongly suspect it's a yes on the second one too, but I didn't test it out.

Cool. I will test it out on my code and post some results here :)

> Though the bigger issue with caching in that case is the caching of the clang parsing stage, which is not a fully global cache yet. You'll find some references to that if you look around in the issues / old PRs.

Ah, you probably mean #75?

stephanlachnit (Contributor)
Hm, doesn't seem to be faster:

Without cache:

real    0m16,047s
user    0m15,493s
sys     0m0,483s

With cache:

real    0m16,607s
user    0m15,640s
sys     0m0,878s

I'm reading 5 different files, each roughly 2 times on average.

BrunoMSantos (Collaborator, Author)
> Ah, you probably mean #75?

Yup, but there's also #157 and #171 at least ;)

> Hm, doesn't seem to be faster:

I wouldn't expect otherwise. It takes a lot of compounding to make it noticeable, and your use case seems too small. Can you artificially create a test case in which you go over the same file, say, 100 times? Otherwise you're getting the overhead and none of the benefits.

Ideally, the overhead won't matter in the small cases, but the benefits will kick in in force for the big ones, which would make this viable. And there are of course other scenarios in which it might matter: we plan on exposing the cursor to the user through extensions, for instance, allowing the user to grow the number of calls to these methods exponentially.

Some of the methods within the cursors are expensive, so caching them
can have a compounding effect on performance as they are called more
often.

Currently, this patch actually worsens performance by a negligible amount over our unit tests, but in more complicated code bases there's a higher potential for cursors to be revisited often. Even more so when we expose them to user land, over which we naturally have no control.

In order for this to work, we need to make sure that the `__eq__` method is defined on top of the existing `__hash__`. Together, these guarantee that we can construct two different `DocCursor` instances from the same arguments, prime the cache from one of them, and reuse it from the other.
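As a rough sketch of that guarantee, reusing the simplified `DocCursor` stand-in from the example above (hypothetical, not the actual patch):

```python
# Two DocCursors constructed from the same arguments...
a = DocCursor(0xC0FFEE)
b = DocCursor(0xC0FFEE)

# ...hash alike and compare equal, which is the whole contract:
assert hash(a) == hash(b)
assert a == b

a.get_children()  # primes the cache
b.get_children()  # reuses the entry primed through a
```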

The (maximum) cache size is also made configurable through an environment variable, as the most elegant way of injecting the constant into the module before it gets evaluated. This may later be exposed to the user as a configuration variable. Two functions are also created, to dump the caches' status and to wipe them respectively; these too can later be used to inform the user on how to tune the cache size.
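A minimal sketch of how such wiring could look; the environment variable name and the helper names below are illustrative assumptions, not necessarily what the patch uses:

```python
import functools
import os

# Read once at import time, before any decorated method is evaluated.
CURSOR_CACHE_SIZE = int(os.environ.get('HAWKMOTH_CURSOR_CACHE_SIZE', '128'))

_cached_methods = []


def cursor_cache(func):
    """lru_cache with the configured size; keeps track of every wrapper."""
    wrapper = functools.lru_cache(maxsize=CURSOR_CACHE_SIZE)(func)
    _cached_methods.append(wrapper)
    return wrapper


def dump_caches():
    """Print hit/miss statistics, useful for tuning the cache size."""
    for method in _cached_methods:
        print(f'{method.__qualname__}: {method.cache_info()}')


def wipe_caches():
    """Drop all cached entries, e.g. between runs."""
    for method in _cached_methods:
        method.cache_clear()
```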

Here we set up caching for all methods that are either callable through multiple paths or directly exposed through the API, with the exception of `_specifiers_fixup`. This has to do with the requirement that all the method arguments be hashable, the `DocCursor` instance itself included, since it is the first argument of every method. As it turns out, Clang's types are not always hashable, so taking a bare `Type` object as a parameter prevents caching. There are ways around it, sure, but in this case there's no significant performance impact, and it's not a path that might get exposed to the user, as it's only called through internal methods that can themselves be cached.
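A contrived sketch of that hashability limitation (the `Type` stand-in and function name are assumptions, not Clang's or the patch's actual code):

```python
import functools


class Type:
    """Stand-in for a Clang type object that does not support hashing."""
    __hash__ = None  # explicitly unhashable


@functools.lru_cache(maxsize=None)
def fixup(cursor, ty):
    return ty


try:
    fixup('some-cursor', Type())
except TypeError as err:
    # lru_cache has to hash every argument to build its key, so a single
    # unhashable parameter rules caching out for the whole method.
    print(err)  # unhashable type: 'Type'
```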
Dumping the caches' status in this way may be helpful in tuning the cache size.
I don't really know how to do this better at this point, but it allows some crude testing for now.
BrunoMSantos force-pushed the cursor-prop-cache-v2 branch 2 times, most recently from 86dcba4 to 24c4c78 on May 3, 2024.