Speedup CI setup to <20s #30706
Comments
Wild idea, compatibility unknown, gain (or loss) unknown: use containerd or a similar drop-in runtime instead of the stock GHA moby, with lazy loading of compatible eStargz docker image "pages".
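For concreteness, a rough sketch of what that could look like, assuming the CI image were re-pushed in eStargz format and the runner had containerd plus stargz-snapshotter available. The image names, registry, and service names below are illustrative, not the actual openpilot setup:

```bash
# Sketch only: assumes containerd and stargz-snapshotter are installed on the
# runner and that the CI image has been re-pushed with eStargz compression.

# 1. Publish an eStargz variant of the CI image (one-time, in the image build job).
#    BuildKit can emit eStargz-compressed layers (needs a docker-container buildx builder):
docker buildx build \
  --output type=image,name=ghcr.io/example/openpilot-ci:estargz,push=true,compression=estargz,oci-mediatypes=true \
  .

# 2. On the runner, make sure the stargz snapshotter is running and containerd is
#    configured to use it (normally via proxy_plugins in /etc/containerd/config.toml);
#    the unit name assumes the project's packaged systemd service.
sudo systemctl start stargz-snapshotter containerd

# 3. Lazily pull and run: layers are fetched on demand instead of downloaded up front.
sudo nerdctl --snapshotter=stargz run --rm ghcr.io/example/openpilot-ci:estargz true
```

The open questions, as noted above, are whether GHA's stock docker/moby setup can be swapped out cleanly and whether on-demand layer fetches actually beat one big parallel pull for this workload.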
@lukechilds what do you think?
what are the cons here?
That `sleep 30` seems questionably long
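If the fixed pause is the concern, a shorter exponential backoff is one alternative. This is only an illustrative shell sketch with a placeholder command, not the actual `setup-with-retry` logic:

```bash
# Illustrative only: retry the setup command with a short exponential backoff
# instead of a fixed `sleep 30` between attempts.
attempts=3
delay=5
for i in $(seq 1 "$attempts"); do
  if ./your_setup_command_here; then   # placeholder for the real setup step
    exit 0
  fi
  echo "setup attempt $i failed, retrying in ${delay}s..."
  sleep "$delay"
  delay=$((delay * 2))
done
exit 1
```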
If it's just loading a previously configured environment, then why not operate on a flash drive, leaving the state where it was when last active? Even on my own computer, I often keep copies of working systems/environments that I simply dump into active memory without "booting up". It saves SO MUCH time when you already know the final state anyway. --Loren Grayson
Throwing some thoughts down. Taking an example from a recent run, it looks like the setup time is dominated by pulling the already-built docker image. What you can see when you dig into that image is that there is one long-pole layer, which is the installation of the python dependencies; in the Dockerfile, that's done here. There is also another large layer to install the ubuntu dependencies, but this is not the bottleneck (at the moment). Using dive, we can see the sizes of the different layers as well, which confirms this python dependency layer is the big boy. So what can we do about this? There are likely more ways, but I can think of two ways to go about addressing this:
Note that with either method, you may have to repeat the exercise for the ubuntu dependencies as well, given that the size of that layer is on the same order of magnitude as the python dependency layer. Both of these methods continue to rely on docker, with option 1 in some ways doubling down on it. I personally do not think docker overhead is really the issue at hand here, and I believe there are likely benefits to continuing to use containers for portability.

To me this ultimately seems like a problem of having a large amount of dependency bits and finding the fastest way to move them onto a clean github worker. Docker makes some of this more challenging (the layer concurrency piece) but doesn't completely block a speedy build; in some ways, it might make things easier. One final way to go about this would be to trim the dependency fat and hope that slims the layers enough to download in a reasonable amount of time. There is no telling whether that would be enough, and furthermore, once you do trim, it becomes a cat-and-mouse game since dependencies will likely be added in the future.
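For anyone who wants to reproduce the layer breakdown above from the command line, something like this surfaces the per-layer sizes that dive shows interactively. The image name is a placeholder for the actual CI image tag:

```bash
# Placeholder image name; substitute the real CI image tag.
IMAGE=ghcr.io/example/openpilot-base:latest

docker pull "$IMAGE"

# Per-layer sizes and the Dockerfile step that produced each layer.
# The python and ubuntu dependency layers should stand out.
docker history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' "$IMAGE"

# dive can also run non-interactively, e.g. to fail CI if the image grows too much.
dive "$IMAGE" --ci
```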
Is this bounty still open @adeebshihadeh? I see `setup-with-retry` running in under 20 seconds in a lot of the CI runs (e.g. https://github.com/commaai/openpilot/actions/runs/9432165207/job/25981593463). Also, looking at the latest code on master, it seems like we've ultimately decided to use a self-hosted runner (which previously wasn't considered a viable solution for this bounty)?
Still open. We're now using namespace runners for internal branches, but I'd love to move back to the GitHub-hosted runners at some point.
Is the issue open?
It looks open to me @ADITYA1720
Is this issue open, please?
@ADITYA1720 @jimbrend @naaa760 for those asking: if an issue has it marked as open, the bounty is still available. It seems like the contributing guidelines don't include this, and it also looks like the
Thank you @BBBmau
@adeebshihadeh Just to clarify, I can disregard the namespace runners if I get it working fully in GitHub Actions in under 20s?
I have put up a WIP PR with my work at #33831, if it is possible to lock this bounty. If not, I will continue to work on it regardless. Edit: I closed the PR so I don't trigger your GitHub Actions while testing on my fork.
correct
Is a data transfer rate limit on Docker's side contributing to the slowness?
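One way to sanity-check that is to time a cold pull and look at the effective throughput. A rough sketch with a placeholder image name; note that `docker image inspect` reports the uncompressed size, so the MB/s figure is only approximate:

```bash
# Rough check of pull throughput. IMAGE is a placeholder; try the actual CI
# image, once from Docker Hub and once from another registry (e.g. ghcr.io)
# to see whether the registry itself is the limiter.
IMAGE=ghcr.io/example/openpilot-base:latest

docker image rm -f "$IMAGE" 2>/dev/null || true   # force a cold pull

start=$(date +%s)
docker pull "$IMAGE"
end=$(date +%s)

elapsed=$((end - start))
[ "$elapsed" -gt 0 ] || elapsed=1
size_mb=$(( $(docker image inspect --format '{{.Size}}' "$IMAGE") / 1024 / 1024 ))
echo "pulled ~${size_mb} MB (uncompressed) in ${elapsed}s (~$((size_mb / elapsed)) MB/s)"
```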
The best case time of the `setup-with-retry` stage that runs in most of our CI jobs is ~1m4s. All it does is set up the openpilot environment, and most of that time is pulling an already built docker image. This puts a hard limit on how fast our jobs can finish; a job that finishes in 1m is 10x better than one that finishes in 2-3m.

Some possible strategies:

Requirements for the bounty:
- `setup-with-retry` on the final PR commit must finish in <20s
- Sub-bounty of $100 for <40s if you can't get to <20s. $500 is for <20s. Bounties don't stack.
https://github.com/commaai/openpilot/blob/master/.github/workflows/setup-with-retry/action.yaml
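For verifying the timing requirement on a given run, per-step durations are available from the Actions API. A sketch using the `gh` CLI and jq; the run id is a placeholder and the step-name filter is a guess at how the step is labeled:

```bash
# Sketch: print durations of setup-related steps for one workflow run,
# to check them against the <20s target. RUN_ID comes from the run's URL.
RUN_ID=1234567890

gh api "repos/commaai/openpilot/actions/runs/${RUN_ID}/jobs" --paginate \
  --jq '.jobs[].steps[]
        | select((.name | test("setup"; "i")) and .completed_at != null)
        | "\(.name): \((.completed_at | fromdate) - (.started_at | fromdate))s"'
```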