On a machine with many cores, snapshotting large repos is very CPU-intensive #4508
Comments
Interesting. Can't look now, but it's possible we serialize all tree updates into a single channel and it's contending when there are more threads. It would be in the working copy snapshot code somewhere.
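For readers unfamiliar with the pattern being guessed at here, the following is a minimal sketch of "many workers, one channel". It is not jj's snapshot code: `TreeUpdate` and `collect_updates` are hypothetical names, and it uses only `std::thread` and `std::sync::mpsc`. It shows the shape of the concern, namely that every worker's output funnels through one receiver; it does not demonstrate that this causes the slowdown reported in this issue.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a per-file tree update; not a jj type.
struct TreeUpdate {
    path: String,
}

/// Spawn `workers` threads that each send `per_worker` updates through one
/// shared channel, then drain the channel on the calling thread.
/// Returns the number of updates received.
fn collect_updates(workers: usize, per_worker: usize) -> usize {
    let (tx, rx) = mpsc::channel::<TreeUpdate>();
    let mut handles = Vec::with_capacity(workers);

    for w in 0..workers {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            for i in 0..per_worker {
                let update = TreeUpdate {
                    path: format!("dir{}/file{}", w, i),
                };
                tx.send(update)
                    .expect("receiver outlives every sender in this sketch");
            }
        }));
    }
    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);

    let mut received = 0;
    for update in rx {
        // Single consumer: every update from every worker passes through here.
        let _path = update.path;
        received += 1;
    }

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    received
}
```

Calling `collect_updates(4, 100)` returns 400 whatever the thread count. What changes with more workers is how many senders compete for the same queue, which is the contention being suspected above.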
To fix your immediate issue, you can also try enabling the Watchman fsmonitor.
I heard (IIRC, when I was working on Mercurial) that it's sometimes faster to scan directory entries sequentially than to split the work across worker processes, which tends to lead to random access. I don't know whether that's the case here, though.
Not even a little bit urgent -- I was mostly bewildered at what I could possibly have broken on the big machine to make (watchman is currently broken in MacPorts, which is how I ended up here)
I can't reproduce this on my 32 core Zen 1 machine (Linux 6.10) with I suspect two things:
I don't have a Studio but I do have an M2 Air, which coincidentally dual-boots Fedora. So, if I get a chance I can see how it all shakes out on both systems, Linux vs macOS, but it's only 4P+4E, so I suspect it's not going to be as big a deal. If it turns out that some other core configuration gives big improvements we can probably make a change to the scheduling policy somehow before we use Rayon so Note that I couldn't reliably clone
Yes, this is a 16P+8E Mac Studio. I noticed while testing this that OS caches seem to get evicted pretty quickly; after not that many seconds, a re-run is noticeably slower. I don't understand why, but thought it was interesting. I've not figured out how to control QoS to the degree you describe, but
Description
On a machine with many cores, snapshotting large repos is very CPU-intensive
Steps to Reproduce the Problem
Run jj st (or just jj) and measure the time it takes for the command to complete.
Run export RAYON_NUM_THREADS=4 and measure again.
Expected Behavior
Similar performance in both cases.
Actual Behavior
jj's default behavior: 2.28 real 0.34 user 32.26 sys
jj limited to four threads: 1.13 real 0.16 user 0.49 sys
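One way to read these two lines is to divide CPU time (user + sys) by wall-clock time, which gives the average number of cores kept busy during the run. The sketch below does that arithmetic on the numbers reported above. The `Timing` struct and its methods are illustrative names, not part of jj or any tool.

```rust
/// One line of `time`-style output, in seconds. Illustrative type only.
struct Timing {
    real: f64,
    user: f64,
    sys: f64,
}

impl Timing {
    /// Total CPU time consumed across all threads.
    fn cpu_seconds(&self) -> f64 {
        self.user + self.sys
    }

    /// Average number of cores kept busy: CPU time divided by wall time.
    fn avg_busy_cores(&self) -> f64 {
        self.cpu_seconds() / self.real
    }
}

/// Timings copied from the report above.
const DEFAULT_RUN: Timing = Timing { real: 2.28, user: 0.34, sys: 32.26 };
const FOUR_THREADS: Timing = Timing { real: 1.13, user: 0.16, sys: 0.49 };

/// How many times more system CPU time the default run used.
fn sys_time_ratio() -> f64 {
    DEFAULT_RUN.sys / FOUR_THREADS.sys
}
```

With these inputs the default run averages roughly 14.3 busy cores against roughly 0.58 for the four-thread run, and it spends about 66 times as much system time while finishing about twice as slowly. Nearly all of the extra cost is system time rather than user time.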
This one took some doing and profiling to figure out, as it didn't immediately make sense that the same working copy is so much faster to work with on a much smaller machine.
Specifications