In practice: artifacts

Let's start by getting one practical aspect out of the way: the new API doesn't specify how to store artifacts, or "files" in general. It just requires a publicly-available URL to access them. So they may be hosted as GitLab artifacts, although it's not completely trivial to get the URL for them, or with third-party storage, but then the upload and retrieval have to be implemented in the GitLab CI pipeline itself. So a choice has to be made here; in any case I would advocate for some helpers to facilitate this. For example, a …
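To make this more concrete, here is a minimal sketch of one possible helper along these lines, assuming the artifacts are produced by a regular GitLab CI job on a public project. It builds the raw artifact download URL from GitLab's documented `jobs/artifacts` endpoint and the predefined CI variables; the helper name and the example paths are just placeholders, not part of any existing tooling.

```python
import os
from urllib.parse import quote


def gitlab_artifact_url(artifact_path, job_name=None, ref=None):
    """Build the raw download URL for a file kept as a GitLab CI job artifact.

    Uses GitLab's documented "download a single artifact file by job name"
    endpoint together with the predefined CI variables available inside a
    pipeline job. The URL is only publicly reachable if the project itself
    is public (or the artifacts are otherwise exposed).
    """
    api = os.environ["CI_API_V4_URL"]          # e.g. https://gitlab.com/api/v4
    project = os.environ["CI_PROJECT_ID"]
    ref = ref or os.environ["CI_COMMIT_REF_NAME"]
    job_name = job_name or os.environ["CI_JOB_NAME"]
    return (
        f"{api}/projects/{project}/jobs/artifacts/"
        f"{quote(ref, safe='')}/raw/{quote(artifact_path)}?job={quote(job_name)}"
    )


# Example: the kernel image produced by a hypothetical "build" job
# print(gitlab_artifact_url("output/bzImage", job_name="build"))
```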
In practice: meta-data

The standard way of using the new API is to send all the meta-data to the central server. So for a more robust and reliable pipeline, to avoid the situation where developers have perfectly valid changes that can't get merged because of some unplugged cable or something, local files are better. It's not entirely implemented with the new tooling though (having something like …). But then of course, sharing the results with the API server is very valuable too. It means the results can be compared pre-merge and post-merge, when the Git tree then gets tested later outside GitLab CI in linux-next or anywhere else. The results could be linked, and if the revision in linux-next is failing but not the pre-merge CI pipeline, then it means it only started failing in combination with some other change - you get the idea. How much data should be kept locally or in the central API database? That's up to each pipeline implementation to decide; in a way it's another manifestation of the "continuum" principle, from a local, self-contained tool to a public, shared automated service.
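As an illustration of the local-files vs central-API trade-off, here is a rough Python sketch. The `/node` endpoint and the shape of the result payload are assumptions about the new API rather than its exact schema, and `store_result` is a hypothetical helper; the point is simply that the same record can either go to the server or into a JSON file kept as a pipeline artifact.

```python
import json

import requests


def store_result(result, api_url=None, token=None, path="results.json"):
    """Record a build/test result either in the central API or locally.

    The /node endpoint and the payload layout are assumptions based on the
    new KernelCI API design, not its exact schema. The local JSON file is
    the fallback that keeps the pipeline working when the API is down.
    """
    if api_url:
        resp = requests.post(
            f"{api_url}/node",
            headers={"Authorization": f"Bearer {token}"},
            json=result,
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()
    # Local mode: append to a JSON file kept as a pipeline artifact
    try:
        with open(path) as f:
            results = json.load(f)
    except FileNotFoundError:
        results = []
    results.append(result)
    with open(path, "w") as f:
        json.dump(results, f, indent=2)
    return result
```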
In practice: orchestration

Following the same principles, the GitLab CI pipeline stages can be orchestrated either by GitLab, as in a regular pipeline, or via the API events mechanism. When using local JSON files, I believe only the regular pipeline approach can really be used, unless you start generating custom events via the API, but that would seem a bit over the top. When sending all the intermediate results to the API, it's possible to get events and have pipeline stages waiting for such events to be received. So all the pipeline steps could be started in parallel and GitLab wouldn't need to know which ones depend on which; all of that would be delegated to the API events.

That would also seem a bit over the top, but there is some value in doing it when other external tools are involved. For example, you could do kernel builds on a GitLab CI runner and then wait for runtime results from somewhere else: the runtime jobs would get triggered by an event when the kernel build is ready, and the pipeline would then wait for the next events when the runtime test results are in. It would seem much cleaner this way than polling some particular lab infrastructure like Mesa CI does with known LAVA instances - which works very well but is not very flexible or scalable.
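Here is a rough sketch of what such a waiting stage could look like, written as a plain polling loop for simplicity (the new API's pub/sub event channel is what would actually make this clean and avoid polling). The `/nodes` query parameters, the response shape and the `wait_for_event` helper are assumptions for illustration, not the API's documented interface.

```python
import time

import requests


def wait_for_event(api_url, token, kind, checkout_id, timeout=3600, poll=30):
    """Block a pipeline stage until matching result nodes appear in the API.

    The query parameters (kind, parent, state) and the "items" field in the
    response are assumptions about how nodes are filtered and returned; the
    real API also offers an event subscription channel instead of polling.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = requests.get(
            f"{api_url}/nodes",
            headers={"Authorization": f"Bearer {token}"},
            params={"kind": kind, "parent": checkout_id, "state": "done"},
            timeout=30,
        )
        resp.raise_for_status()
        nodes = resp.json().get("items", [])
        if nodes:
            return nodes
        time.sleep(poll)
    raise TimeoutError(f"No '{kind}' results for {checkout_id} within {timeout}s")


# A GitLab CI job could run something like this to wait for runtime results
# produced outside GitLab, e.g.:
#   wait_for_event(API_URL, TOKEN, "test", checkout_node_id)
```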
Nice overview, Guillaume! Regarding "Commit message quality", there are open GitLab issues about this: "Ability to start discussions on commit message" and "Product discovery: increase commit message visibility in merge requests". No recent activity on them, though.
The topic of using modern DevOps tools such as GitHub, GitLab and Gerrit for upstream Linux kernel development has been discussed for almost as long as these tools have existed. Probably due to personal taste or historical reasons, GitLab CI tends to be the most popular example, so I picked it as the topic of this discussion. For completeness, let's not forget that Chromium OS kernel development uses Gerrit (downstream but public) and U-Boot uses Azure Pipelines on GitHub, even though the main workflow is still based on emails.
Issues with adopting GitLab CI
While the overall mindset is evolving and some subsystems such as drm already have a long history of using GitLab CI, several issues keep being brought up every time the topic is mentioned. They are well founded and pretty well understood, only the solutions aren't trivial. It seems mostly a matter of time until a DevOps system can be adopted more broadly, once the known blocking issues have been addressed. Let's take a look at what I believe to be the main ones.
Short-lived tooling
Emails, patch files and tarballs are based on established standards and are pretty much going to stay forever, or at least for as long as there is a Linux kernel. The Git SCM was created especially for the kernel development workflow, so it's assumed it will also always be there, or at least that it wouldn't be abandoned without a new tool to replace it. These things are universal, portable and not owned by a particular corporation or private group.
On the other hand, GitHub is now owned by Microsoft, so it may go offline or change dramatically if it were acquired again - basically there's nothing to guarantee it will always stay the same. It also requires users to create an account with this single provider. Gerrit can be self-hosted but is a centralised system maintained by Google, pretty much like GitLab. So these are similar to Patchwork in this sense: they may be used, but none of them can be enforced yet as the main tool for upstream kernel development. This is why there are workflows that run a CI "on the side", to still rely only on emails as a common denominator for code reviews and plain Git repositories for applying changes, while having some amount of automated testing going on in parallel. How could this gap be closed? We'll come back to that a bit later; let's look at the other issues first.
Commit message quality
When you receive a patch over email, you actually have to read it as an email, which means it needs to be properly written, and longer series often come with a cover letter. This is especially important for large and complex projects such as the Linux kernel. On systems like GitLab, there is a vast number of small projects where few people actually take a close look at the Git history, so the trend is to rely more on the web UI with review comments. Still, the changes get merged and there is value in having a self-descriptive Git history: developers can follow what the changes were about without having to dig out a closed merge request.
It's of course possible to try and enforce a higher standard for commit messages on GitLab, but there's no way to actually put review comments on the commit messages themselves. They also tend to be hidden behind the merge request description and other things in the web UI. I've hit this problem many times on GitHub too, while trying to apply the same quality standard as for kernel commits to make KernelCI more kernel developer-friendly, but had to resort to doing things like quoting bits from the commit message in a comment. I don't think fixing this in GitLab or even GitHub is very technically challenging; rather, it's a different approach to doing code reviews, with the web UI as the primary tool rather than the text of the patches as with emails. So that's another hurdle that gives these tools bad press in the kernel community. What else?
Products don't run upstream
There isn't a single commercial product out there running a plain mainline kernel, or even an unmodified upstream LTS kernel. So enforcing a CI pass before changes get merged is hard to justify. If a CI check fails and it needs to pass in order to release a software update in the field for actual users, then it's pretty clear that it's a problem, as nobody wants users to hit that issue. But when people know that their changes are looking fine and really want them merged for the next kernel release, they can easily find it frustrating to hit a red CI check, and can easily argue that it's not important, or could be fixed in -rc2, or it's a flake, or a problem with the test, or...
That's because until now, it's always been up to particular people to judge and decide what goes into the kernel and when to declare the mainline kernel ready to be tagged with a new version. And we all know that it's not guaranteed to work in any way - I like to quote Torvalds' email from when v5.17 was released.
In short: here's a newly released kernel, now go and test it! That's the polar opposite of what a CI loop is all about, i.e. testing and getting to a particular quality level before making the release, in a mechanical way.
What can we do about them?
There are probably several other common persistent "problems" reported by kernel developers and maintainers, but let's see how KernelCI with its new API can already address the ones described above. I'll put a comment for each idea to allow discussion in threads. I'm essentially basing this on my kernel testing continuum blog post, which was based on a topic for Kernel Recipes, which in turn was based on many prior discussions with the community over the years. Anyone can of course start additional ones, and I would also like to look into the practicalities of doing a proof-of-concept with GitLab CI and KernelCI / `kci`, like @khilman did a while ago with the legacy system and the `kci_build` and `kci_test` command line tools.