From 95911263bbbf493d729b9322e66be984aa858324 Mon Sep 17 00:00:00 2001
From: Yevgeny Shnaidman <60741237+yevgeny-shnaidman@users.noreply.github.com>
Date: Mon, 8 Apr 2024 16:22:45 +0300
Subject: [PATCH] Support cluster upgrade for day1 scenario (#1076)

the kernel module image's tag should be equal to kernel version of the node.
Pull image will get the current kernel version of the node and use this value
as a tag for image to pull
---
 .../documentation/day1_limited_option.md      | 46 ++++++++++++++++---
 pkg/mcproducer/scripts/pull-image.sh          | 21 +++++----
 .../testdata/machineconfig-test.yaml          |  2 +-
 3 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/docs/mkdocs/documentation/day1_limited_option.md b/docs/mkdocs/documentation/day1_limited_option.md
index 52b127be9..48878c6cf 100644
--- a/docs/mkdocs/documentation/day1_limited_option.md
+++ b/docs/mkdocs/documentation/day1_limited_option.md
@@ -33,15 +33,23 @@ The loading of OOT kernel module leverages MCO. The flow sequence is as follows:

 The Day 1 functionality uses the same DTK based image that Day 2 KMM builds can leverage.
 OOT kernel module should be located under `/opt/lib/modules/${kernelVersion}`.
+The tag of the kernel module image should be equal to kernel version on the node: for example,
+if the kernel version on the node is `5.14.0-284.59.1.el9_2.x86_64`, then the image and tag should be:
+`repo/image:5.14.0-284.59.1.el9_2.x86_64`

 ## In-tree module replacement

-The Day 1 functionality always tries to replace the in-tree kernel module with the OOT one.
-If the in-tree kernel module is not loaded, the flow is not affected;, the service will proceed and load the OOT kernel module.
+The Day 1 functionality will try to replace in-tree kernel module only if requested (see parameter to the MC creation).
+If the in-tree kernel module is not loaded, but was requested to be unloaded, the flow is not affected;
+the service will proceed and load the OOT kernel module.
 ## MCO yaml creation

-KMM provides an API to create an MCO YAML manifest for the Day 1 functionality:
+KMM provides 2 ways to create an MCO YAML manifest for the Day 1 functionality:
+1. API to be called by from GO code
+2. Linux executable that can be called manually with appropriate parameters
+
+### API

 ```go
 ProduceMachineConfig(machineConfigName, machineConfigPoolRef, kernelModuleImage, kernelModuleName string) (string, error)
@@ -54,15 +62,36 @@ The parameters are:

 - `machineConfigName`: the name of the MCO YAML manifest. It will be set as the `name` parameter of the metadata of MCO YAML manifest.
 - `machineConfigPoolRef`: the `MachineConfigPool` name that will be used in order to identify the targeted nodes
-- `kernelModuleImage`: the name of the container image that includes the OOT kernel module.
+- `kernelModuleImage`: the name of the container image that includes the OOT kernel module without the tag
 - `kernelModuleName`: the name of the OOT kernel module. This parameter will be used both to unload the in-tree kernel module
   (if loaded into the kernel) and to load the OOT kernel module.
-
-The API is located under `pkg/mcproducer` package of hte KMM source code.
+- `inTreeModuleToRemove`: optional parameter. The name of the in-tree kernel module to unload prior to loading OOT kernel module.
+  In case this parameter is not passed, day1 functionality will not try to unload any in-tree
+  module
+- `workerImage`: optional parameter. The worker image to use. In case this parameter is not passed, the default worker image
+  will be used: quay.io/edge-infrastructure/kernel-module-management-worker:latest.
+
+
+The API is located under `pkg/mcproducer` package of the KMM source code.
 There is no need to KMM operator to be running to use the Day 1 functionality.
 Users only need to import the `pkg/mcproducer` package into their operator/utility code, call the API and to apply the produced
 MCO YAML to the cluster.
+### Utility
+`day1-utility` can be called from a shell. day1-utility executable is not a part of KMM github repo.
+In order to build it the following commands needs to be run:
+`make day1-utility`
+
+Utility uses the following flags:
+`-image `: container image that contains kernel module .ko file
+`-kernel-module `: name of the OOT module to load
+`-machine-config `: name of the machine config to create
+`-machine-config-pool `: name of the machine config pool to use
+`-in-tree-module-to-remove `: in-tree kernel module that should be removed prior to loading the oot module.
+`-worker-image `: kernel-management worker image to use. If not passed, a default value will be used
+
+The first 4 flags are mandatory, but the last 2 are optional. They correspond to the parameters of the API
+
 ### MachineConfigPool

 MachineConfigPool is used to identify a collection of nodes that will be affected by the applied MCO.
@@ -106,5 +135,8 @@ will target the worker MachineConfigPool
 A detailed description of MachineConfig and MachineConfigPool can be found in [MachineConfigPool explanation](https://www.redhat.com/en/blog/openshift-container-platform-4-how-does-machine-config-pool-work) for more information.
-
+## Cluster Upgrade support
+Using kernel version as a tag for kernel module image, allows supporting cluster upgrade. Pull service will determine the kernel version of the
+node and then use this value as a tag for kernel module image. This way, all the customer needs to do prior to upgrading the cluster, it to create a kernel module image
+with the appropriate tag, without any need to update day1 MC.
Once the node is rebooted, pull service will pull the correct image
diff --git a/pkg/mcproducer/scripts/pull-image.sh b/pkg/mcproducer/scripts/pull-image.sh
index d0c7ac901..50b7e2784 100644
--- a/pkg/mcproducer/scripts/pull-image.sh
+++ b/pkg/mcproducer/scripts/pull-image.sh
@@ -4,10 +4,13 @@
 kernel_module_image_filepath="$KERNEL_MODULE_IMAGE_FILEPATH"
 worker_image="$WORKER_IMAGE"
 kernel_module_image="$KERNEL_MODULE_IMAGE"
+kernel_module_image_tag=$(uname -r)
+full_kernel_module_image="$kernel_module_image:$kernel_module_image_tag"

 if [ -e $kernel_module_image_filepath ]; then
-    echo "File $kernel_module_image_filepath found.Nothing to do, the file was handled, removing it"
+    echo "File $kernel_module_image_filepath found. Nothing to do, the file was handled, removing $kernel_module_image_filepath and $kmm_config_file_filepath"
     rm -f $kernel_module_image_filepath
+    rm -f $kmm_config_file_filepath
 else
     podman pull --authfile /var/lib/kubelet/config.json $worker_image
     if [ $? -eq 0 ]; then
@@ -17,20 +20,20 @@ else
         exit 1
     fi

-    echo "File $kernel_module_image_filepath is not on the filesystem, pulling image "
-    podman pull --authfile /var/lib/kubelet/config.json $kernel_module_image
+    echo "File $kernel_module_image_filepath is not on the filesystem, pulling image $full_kernel_module_image"
+    podman pull --authfile /var/lib/kubelet/config.json $full_kernel_module_image
     if [ $? -eq 0 ]; then
-        echo "Image $kernel_module_image has been successfully pulled"
+        echo "Image $full_kernel_module_image has been successfully pulled"
     else
-        echo "Failed to pull image $kernel_module_image"
+        echo "Failed to pull image $full_kernel_module_image"
         exit 1
     fi
-    echo "Saving image $kernel_module_image into a file $kernel_module_image_filepath"
-    podman save -o $kernel_module_image_filepath $kernel_module_image
+    echo "Saving image $full_kernel_module_image into a file $kernel_module_image_filepath"
+    podman save -o $kernel_module_image_filepath $full_kernel_module_image
     if [ $?
-eq 0 ]; then - echo "Image $kernel_module_image has been successfully save on file $kernel_module_image_filepath, rebooting..." + echo "Image $full_kernel_module_image has been successfully save on file $kernel_module_image_filepath, rebooting..." reboot else - echo "Failed to save image $kernel_module_image to file $kernel_module_image_filepath" + echo "Failed to save image $full_kernel_module_image to file $kernel_module_image_filepath" fi fi diff --git a/pkg/mcproducer/testdata/machineconfig-test.yaml b/pkg/mcproducer/testdata/machineconfig-test.yaml index 11cff9ecc..7ec266579 100644 --- a/pkg/mcproducer/testdata/machineconfig-test.yaml +++ b/pkg/mcproducer/testdata/machineconfig-test.yaml @@ -73,7 +73,7 @@ spec: user: name: "root" contents: - source: "data:text/plain;base64,IyEvYmluL2Jhc2gKCgprZXJuZWxfbW9kdWxlX2ltYWdlX2ZpbGVwYXRoPSIkS0VSTkVMX01PRFVMRV9JTUFHRV9GSUxFUEFUSCIKd29ya2VyX2ltYWdlPSIkV09SS0VSX0lNQUdFIgprZXJuZWxfbW9kdWxlX2ltYWdlPSIkS0VSTkVMX01PRFVMRV9JTUFHRSIKCmlmIFsgLWUgJGtlcm5lbF9tb2R1bGVfaW1hZ2VfZmlsZXBhdGggXTsgdGhlbgogICAgZWNobyAiRmlsZSAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCBmb3VuZC5Ob3RoaW5nIHRvIGRvLCB0aGUgZmlsZSB3YXMgaGFuZGxlZCwgcmVtb3ZpbmcgaXQiCiAgICBybSAtZiAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aAplbHNlCiAgICBwb2RtYW4gcHVsbCAtLWF1dGhmaWxlIC92YXIvbGliL2t1YmVsZXQvY29uZmlnLmpzb24gJHdvcmtlcl9pbWFnZQogICAgaWYgWyAkPyAtZXEgMCBdOyB0aGVuCiAgICAgICAgZWNobyAiSW1hZ2UgJHdvcmtlcl9pbWFnZSBoYXMgYmVlbiBzdWNjZXNzZnVsbHkgcHVsbGVkIgogICAgZWxzZQogICAgICAgIGVjaG8gIkZhaWxlZCB0byBwdWxsIGltYWdlICR3b3JrZXJfaW1hZ2UiCiAgICAgICAgZXhpdCAxCiAgICBmaQoKICAgIGVjaG8gIkZpbGUgJGtlcm5lbF9tb2R1bGVfaW1hZ2VfZmlsZXBhdGggaXMgbm90IG9uIHRoZSBmaWxlc3lzdGVtLCBwdWxsaW5nIGltYWdlICIKICAgIHBvZG1hbiBwdWxsIC0tYXV0aGZpbGUgL3Zhci9saWIva3ViZWxldC9jb25maWcuanNvbiAka2VybmVsX21vZHVsZV9pbWFnZQogICAgaWYgWyAkPyAtZXEgMCBdOyB0aGVuCiAgICAgICAgZWNobyAiSW1hZ2UgJGtlcm5lbF9tb2R1bGVfaW1hZ2UgaGFzIGJlZW4gc3VjY2Vzc2Z1bGx5IHB1bGxlZCIKICAgIGVsc2UKICAgICAgICBlY2hvICJGYWlsZWQgdG8gcHVsbCBpbWFnZSAka2VybmVsX21vZHVsZV9pbWFnZS
IKICAgICAgICBleGl0IDEKICAgIGZpCiAgICBlY2hvICJTYXZpbmcgaW1hZ2UgJGtlcm5lbF9tb2R1bGVfaW1hZ2UgaW50byBhIGZpbGUgJGtlcm5lbF9tb2R1bGVfaW1hZ2VfZmlsZXBhdGgiCiAgICBwb2RtYW4gc2F2ZSAtbyAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCAka2VybmVsX21vZHVsZV9pbWFnZQogICAgaWYgWyAkPyAtZXEgMCBdOyB0aGVuCiAgICAgICAgZWNobyAiSW1hZ2UgJGtlcm5lbF9tb2R1bGVfaW1hZ2UgaGFzIGJlZW4gc3VjY2Vzc2Z1bGx5IHNhdmUgb24gZmlsZSAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCwgcmVib290aW5nLi4uIgogICAgICAgIHJlYm9vdAogICAgZWxzZQogICAgICAgIGVjaG8gIkZhaWxlZCB0byBzYXZlIGltYWdlICRrZXJuZWxfbW9kdWxlX2ltYWdlIHRvIGZpbGUgJGtlcm5lbF9tb2R1bGVfaW1hZ2VfZmlsZXBhdGgiCiAgICBmaQpmaQo=" + source: "data:text/plain;base64,IyEvYmluL2Jhc2gKCgprZXJuZWxfbW9kdWxlX2ltYWdlX2ZpbGVwYXRoPSIkS0VSTkVMX01PRFVMRV9JTUFHRV9GSUxFUEFUSCIKd29ya2VyX2ltYWdlPSIkV09SS0VSX0lNQUdFIgprZXJuZWxfbW9kdWxlX2ltYWdlPSIkS0VSTkVMX01PRFVMRV9JTUFHRSIKa2VybmVsX21vZHVsZV9pbWFnZV90YWc9JCh1bmFtZSAtcikKZnVsbF9rZXJuZWxfbW9kdWxlX2ltYWdlPSIka2VybmVsX21vZHVsZV9pbWFnZToka2VybmVsX21vZHVsZV9pbWFnZV90YWciCgppZiBbIC1lICRrZXJuZWxfbW9kdWxlX2ltYWdlX2ZpbGVwYXRoIF07IHRoZW4KICAgIGVjaG8gIkZpbGUgJGtlcm5lbF9tb2R1bGVfaW1hZ2VfZmlsZXBhdGggZm91bmQuIE5vdGhpbmcgdG8gZG8sIHRoZSBmaWxlIHdhcyBoYW5kbGVkLCByZW1vdmluZyAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCBhbmQgJGttbV9jb25maWdfZmlsZV9maWxlcGF0aCIKICAgIHJtIC1mICRrZXJuZWxfbW9kdWxlX2ltYWdlX2ZpbGVwYXRoCiAgICBybSAtZiAka21tX2NvbmZpZ19maWxlX2ZpbGVwYXRoCmVsc2UKICAgIHBvZG1hbiBwdWxsIC0tYXV0aGZpbGUgL3Zhci9saWIva3ViZWxldC9jb25maWcuanNvbiAkd29ya2VyX2ltYWdlCiAgICBpZiBbICQ/IC1lcSAwIF07IHRoZW4KICAgICAgICBlY2hvICJJbWFnZSAkd29ya2VyX2ltYWdlIGhhcyBiZWVuIHN1Y2Nlc3NmdWxseSBwdWxsZWQiCiAgICBlbHNlCiAgICAgICAgZWNobyAiRmFpbGVkIHRvIHB1bGwgaW1hZ2UgJHdvcmtlcl9pbWFnZSIKICAgICAgICBleGl0IDEKICAgIGZpCgogICAgZWNobyAiRmlsZSAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCBpcyBub3Qgb24gdGhlIGZpbGVzeXN0ZW0sIHB1bGxpbmcgaW1hZ2UgJGZ1bGxfa2VybmVsX21vZHVsZV9pbWFnZSIKICAgIHBvZG1hbiBwdWxsIC0tYXV0aGZpbGUgL3Zhci9saWIva3ViZWxldC9jb25maWcuanNvbiAkZnVsbF9rZXJuZWxfbW9kdWxlX2ltYWdlCiAgICBpZiBbICQ/IC1lcSAwIF07IHRoZW4KICAgICAgICBlY2
hvICJJbWFnZSAkZnVsbF9rZXJuZWxfbW9kdWxlX2ltYWdlIGhhcyBiZWVuIHN1Y2Nlc3NmdWxseSBwdWxsZWQiCiAgICBlbHNlCiAgICAgICAgZWNobyAiRmFpbGVkIHRvIHB1bGwgaW1hZ2UgJGZ1bGxfa2VybmVsX21vZHVsZV9pbWFnZSIKICAgICAgICBleGl0IDEKICAgIGZpCiAgICBlY2hvICJTYXZpbmcgaW1hZ2UgJGZ1bGxfa2VybmVsX21vZHVsZV9pbWFnZSBpbnRvIGEgZmlsZSAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCIKICAgIHBvZG1hbiBzYXZlIC1vICRrZXJuZWxfbW9kdWxlX2ltYWdlX2ZpbGVwYXRoICRmdWxsX2tlcm5lbF9tb2R1bGVfaW1hZ2UKICAgIGlmIFsgJD8gLWVxIDAgXTsgdGhlbgogICAgICAgIGVjaG8gIkltYWdlICRmdWxsX2tlcm5lbF9tb2R1bGVfaW1hZ2UgaGFzIGJlZW4gc3VjY2Vzc2Z1bGx5IHNhdmUgb24gZmlsZSAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCwgcmVib290aW5nLi4uIgogICAgICAgIHJlYm9vdAogICAgZWxzZQogICAgICAgIGVjaG8gIkZhaWxlZCB0byBzYXZlIGltYWdlICRmdWxsX2tlcm5lbF9tb2R1bGVfaW1hZ2UgdG8gZmlsZSAka2VybmVsX21vZHVsZV9pbWFnZV9maWxlcGF0aCIKICAgIGZpCmZpCg==" - path: "/usr/local/bin/wait-for-dispatcher.sh" mode: 493 overwrite: true
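
The tag derivation that the patched `pull-image.sh` performs (untagged image name plus the output of `uname -r`) can be tried in isolation. The following is a minimal sketch, not part of the patch: `build_image_ref` is a hypothetical helper introduced only for illustration, and its optional second argument is an assumption added so the result can be checked without depending on the kernel of the machine running it.

```shell
#!/bin/bash
# Sketch only: mirrors how pull-image.sh composes the image reference.
# build_image_ref is a hypothetical helper, not part of KMM.
build_image_ref() {
    local image="$1"
    # Default to the running kernel version, as the script does with
    # $(uname -r); an explicit second argument overrides it for testing.
    local tag="${2:-$(uname -r)}"
    echo "${image}:${tag}"
}

# Reference a node running this kernel version would pull.
build_image_ref "repo/image" "5.14.0-284.59.1.el9_2.x86_64"
```

Called with one argument, the helper falls back to the kernel version of the current machine, which is the behaviour the cluster upgrade flow relies on: after the node reboots into a new kernel, the same untagged image name resolves to a different tag.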