Create an arena for package names #242
Open
+228
−205
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We have long known that resolution time is proportional to
P::Clone
. Most real-world resolution problems contain at least oneString
in thereP
. If performance is anywhere on the priority list thenP::Clone
should not allocate. Thanks to our library being generic this is easy for users to achieve and control.P
can be a wrapper aroundRc<P>
and it does not allocate, or using various interning strategies use&str
for even faster clone, or by using hashconsing (or any de-duplicating interning strategy)P
can be a wrapper aroundusize
. So for our benchmarks we usedP=u32
because it was the simplest type that met our trait pounds ORP=&str
because it was similar to what was easily available to a user. Now that we have production users, we see that thereP
tends to have a more expensive clone. (Rc
for the cargo benchmarks,Arc
for the uv use case,String
for gleam and elm.)It also turns out that resolution time is proportional to
P::Hash
. AsP
ends up being the key in a large number of hash tables. The hottest of these tables tends to bePartialSolution::package_assignments
. Only the hashconsing (or its relatives) strategy allows a user to provide very fastP::Hash
whereP
has a string in it. None of our production users are using this approach. So the performance of our internal benchmarks that use&str
are far more realistic than the ones that useu32
.Luckily we can provide a hashconsing like wrapper around
P
for our users. We already have an arena such that if two IDs are equal then the data they point at must be equal, we just need to make the inverse true that if the data is equal then any IDs for that data will be equal. This arena instead of allocating by just pushing new items to aVec
(and returning the new index) would return a previous ID if the value had already been added and do the normal thing otherwise. This does not involve a lot of code by using the indexmap crate, which is already one of our dependencies.This is a pessimization four benchmarks that use
u32
, or for any (theoretical) users who are already using hashconsing. Unfortunately rust does not have specialization, and even if it did there is no trait bound for "Hash
is cheap". Based on the (hilariously out of date named)large_case_u16_NumberVersion
this is a ~9.5% regression, similarly ~10.5% for the syntheticslow_135_0_u16_NumberVersion
. A sudoku problem, which is not synthetic but also not the intended use case, sees a similar regression.Real-world benchmarks see significant improvements.
elm_str_SemanticVersion
10.2%,zuse_str_SemanticVersion
19.9%, all of crates.io 14% without lock files and 11% with.I'd love to hear the impact on
uv
benchmarks, but I expect them to be in a similar range. Perhaps infinitesimally bigger because they are already collecting this data for their implementation of prioritize which can now just beId<P>::as_raw()
.There is potential follow-up work because
Map<Id<P
(especially when densely filled) can be replaced with aVec
decreasing the size and removing the calculation ofHash
forId
. But it's a lot of code for a much smaller when so I left it out for now.Unfortunately bunch of log and panic messages now refer to
Id(1)
instead of the name of that package. Similarly they use Debug instead of display. This is hard to fix because the relevantimpls
do not have access to the arena in which the actual package name is stored.