Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v3.6-branch] ci: Use zephyr-runner v2 #70131

Conversation

stephanosio
Copy link
Member

@stephanosio stephanosio commented Mar 12, 2024

This series migrates the CI workflows in the v3.6-branch to use the zephyr-runner v2.


Backport of #69806, #70014 and #70015

"Backport Issue Check" failure should be overridden since this is strictly a CI housekeeping change and backport rules do not apply.

stephanosio and others added 25 commits March 13, 2024 00:52
This commit updates the twister workflow to use the new zephyr-runner v2 CI
runner deployment.

It also updates the workflow to use the `ci-repo-cache` Docker image, which
includes the Zephyr repository cache, because the node level repository
cache is no longer available in the zephyr-runner v2.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 7c19bc7)
This commit updates the twister workflow jobs that run on the zephyr-runner
v2 to print the underlying cloud service information in the logs to help
trace and debug potential runner issues.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 9a9bebb)
This commit updates the twister workflow to store ccache data in the
zephyr-runner v2 node cache.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 1be3aac)
This commit updates the twister workflow to, when available, use Redis
remote storage backend for the ccache compilation cache data.

The Redis cache server is hosted in the Kubernetes cluster in which the
zephyr-runner pods run -- the Redis remote storage backend will be ignored
if the server is unavailable.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 3823f1f)
This commit adds the compiler `--specs=*` flag to the ccache ignore option
list because ccache is unable to resolve the toolchain-provided specs file
path and will consider source files to be uncacheable if it is unable to
read the specified specs file.

Note that adding `--specs=*` to the ignore option list is not a problem
because it is unlikely for the content of the toolchain libc spec file to
change without the compiler executable itself changing.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit d3f9f39)
This commit updates the twister workflow such that ccache only uses remote
Redis cache storage when available.

The purpose of this to reduce the individual runner local disk IOPS
requirement; thereby, reducing the overall load on the SAN.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 527435d)
This commit sets the twister timeout multiplier to 2, which effectively
increases the default test timeout from 60 to 120 seconds, because the new
cost-effective Zephyr runners may take longer to execute tests and the
default timeout is not sufficient for some tests to complete.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit de68ea7)
This commit increases the twister build job timeout from the default value
of 6 hours to 24 hours because scheduled (weekly) build runs take longer
than 6 hours to complete.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 7df7e83)
Increase matrix size to 20 from 15 on push events.

Signed-off-by: Anas Nashif <[email protected]>
(cherry picked from commit 9970724)
This commit updates the doc-build workflow to use the new zephyr-runner v2
CI runner deployment.

It also installs additional system packages that are not available by
default in the zephyr-runner v2.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 2819c35)
This commit updates the bsim-tests workflow to use the new zephyr-runner v2
CI runner deployment.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 9838633)
This commit updates the codecov workflow to run on all forks under the
zephyrproject-rtos organisation.

The purpose of this is mainly to simplify the process of testing of this
workflow under the zephyr-testing repository.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit c1bd5a6)
This commit updates the codecov workflow to use the new zephyr-runner v2 CI
runner deployment.

It also updates the workflow to use the `ci-repo-cache` Docker image, which
includes the Zephyr repository cache, because the node level repository
cache is no longer available in the zephyr-runner v2.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 354e290)
This commit updates the codecov workflow to store ccache data in the
zephyr-runner v2 node cache.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 36b0b10)
This commit updates the codecov workflow to, when available, use Redis
remote storage backend for the ccache compilation cache data.

The Redis cache server is hosted in the Kubernetes cluster in which the
zephyr-runner pods run -- the Redis remote storage backend will be ignored
if the server is unavailable.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit b57f1b5)
This commit adds the compiler `--specs=*` flag to the ccache ignore option
list because ccache is unable to resolve the toolchain-provided specs file
path and will consider source files to be uncacheable if it is unable to
read the specified specs file.

Note that adding `--specs=*` to the ignore option list is not a problem
because it is unlikely for the content of the toolchain libc spec file to
change without the compiler executable itself changing.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit ab9f6b4)
This commit updates the codecov workflow such that ccache only uses remote
Redis cache storage when available.

The purpose of this to reduce the individual runner local disk IOPS
requirement; thereby, reducing the overall load on the SAN.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit a636c52)
This commit sets the codecov workflow twister timeout multiplier to 2,
which effectively increases the default test timeout from 60 to 120
seconds, because the new cost-effective Zephyr runners may take longer to
execute tests and the default timeout is not sufficient for some tests to
complete.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 550bb4e)
This commit updates the clang workflow to use the new zephyr-runner v2 CI
runner deployment.

It also updates the workflow to use the `ci-repo-cache` Docker image, which
includes the Zephyr repository cache, because the node level repository
cache is no longer available in the zephyr-runner v2.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 64ca699)
This commit updates the clang workflow to store ccache data in the
zephyr-runner v2 node cache.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit cd83f07)
This commit updates the clang workflow to, when available, use Redis remote
storage backend for the ccache compilation cache data.

The Redis cache server is hosted in the Kubernetes cluster in which the
zephyr-runner pods run -- the Redis remote storage backend will be ignored
if the server is unavailable.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 4a2884c)
This commit updates the clang workflow such that ccache only uses remote
Redis cache storage when available.

The purpose of this to reduce the individual runner local disk IOPS
requirement; thereby, reducing the overall load on the SAN.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 95e7eb3)
This commit updates the bsim-tests workflow to use the new zephyr-runner v2
CI runner deployment.

It also updates the workflow to use the `ci-repo-cache` Docker image, which
includes the Zephyr repository cache, because the node level repository
cache is no longer available in the zephyr-runner v2.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 9f9a6c5)
This commit increases the default value of `ZTEST_TEST_DELAY_MS` from 3000
to 5000 milliseconds because the current value of 3000ms may not be
sufficient for the hosts with lower CPU clock frequency (e.g. new Zephyr CI
runners with cost-effective processors).

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 1bf7510)
This commit relaxes the idle event statistics test precision requirement
for emulated QEMU targets because the cycle counts may be inaccurate when
the host CPU is overloaded (e.g. when running tests with twister) and a
high failure rate is observed for this test in the CI.

Signed-off-by: Stephanos Ioannidis <[email protected]>
(cherry picked from commit 2b2dd01)
@stephanosio
Copy link
Member Author

"Backport Issue Check" failure should be overridden since this is strictly a CI housekeeping change and backport rules do not apply.

@stephanosio stephanosio force-pushed the v3.6-branch-ci-zephyr-runner-v2 branch 3 times, most recently from 372fe4a to bf36753 Compare March 13, 2024 01:47
@henrikbrixandersen henrikbrixandersen added this to the v3.6.1 milestone Apr 3, 2024
@henrikbrixandersen henrikbrixandersen added the Backport Backport PR and backport failure issues label Apr 3, 2024
@stephanosio stephanosio merged commit 597ab83 into zephyrproject-rtos:v3.6-branch Apr 24, 2024
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

8 participants