Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace cloud-init validation with external script #2079

Conversation

Chris-Peterson444
Copy link
Contributor

Slightly more changes for one PR than I'd like but everything sort of fell together:

  • Extract cloud-init schema validation to a new system_script subiquity-legacy-cloud-init-validate. This is called "legacy" because, in the future, we likely want to rely on the cloud-init CLI directly to do schema validation, but there isn't a pressing need to rely on it now. For now, all releases will rely on the "legacy" script.

  • Rename the old validate_cloud_init_schema function to validate_cloud_init_top_level_keys (along with some other function and type renaming). The function wasn't really validating the whole cloud-init schema. It's really just parsing the output for any top-level keys that failed to validate.

  • Skip the top-level key validation during autoinstall extraction in cloud-config on releases in which the "schema" subcommand wasn't available.

  • Add a default value of the SNAP environment variable for dry-run and tests. This lets code paths that rely on system_scripts_env to not fail in dry-run or testing, which is mandatory now that subiquity-legacy-cloud-init-validate is used for cloud-init schema validation.

Copy link
Collaborator

@dbungert dbungert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What sort of VM testing has happened here?

subiquity/cloudinit.py Outdated Show resolved Hide resolved
subiquity/server/server.py Outdated Show resolved Hide resolved
subiquitycore/tests/__init__.py Outdated Show resolved Hide resolved
@Chris-Peterson444
Copy link
Contributor Author

What sort of VM testing has happened here?

I ran test installs on jammy, focal, noble, and oracular injected with a snap with the proposed changes.

I tested with some userdata that would produce deprecation warnings on newer version of cloud-init e.g.,

chpasswd:
  list: ""

to make sure it would result in correct warnings in the debug log:

2024-09-16 21:38:13,999 WARNING subiquity.cloudinit:244 The cloud-init configuration for autoinstall.user-data contains deprec
ated values:
Deprecated in version 22.2. Use ``users`` instead.

I also tested that autoinstall with invalid userdata e.g.,

    ch:
      list: ""

would cause failures:

2024-09-16 21:40:39,027 DEBUG subiquitycore.utils:118 run_command ['subiquity-legacy-cloud-init-validate', '--config', '/tmp/tmpgjmcj9sn/test-cloud-config.yaml', '--source', 'autoinstall.user-data'] exited with code 0
2024-09-16 21:40:39,029 ERROR root:38 finish: subiquity/Userdata/load_autoinstall_data: FAIL: Cloud config schema errors: autoinstall.user-data.ch: Additional properties are not allowed ('ch' was unexpected)

The strictness of the validation is unfortunately not consistent across all versions and relies on the behavior of the shipped cloud-init version. For example, the above faulty userdata section validates fine on the focal ISOs I tested. Likely due to not enforcing strict top level keys yet.

I also confirmed that top-level key validation we perform when extracting autoinstall is skipped on older releases that don't support the schema subcommand of cloud-init, e.g. on focal:

2024-09-16 21:43:23,071 DEBUG subiquity.cloudinit:78 cloud-init version: 20.1
2024-09-16 21:43:23,071 DEBUG subiquity.cloudinit:175 Host cloud-config doesn't support 'schema' subcommand. Skipping top-level key cloud-config validation.

but still works on later releases:

2024-09-16 22:39:21,914 DEBUG subiquity.cloudinit:78 cloud-init version: 24.3
2024-09-16 22:39:21,914 DEBUG subiquitycore.utils:141 arun_command called: ['cloud-init', 'schema', '--system']
2024-09-16 22:39:25,012 DEBUG subiquitycore.utils:155 arun_command ['cloud-init', 'schema', '--system'] exited with code 1
2024-09-16 22:39:25,013 WARNING subiquity.server.server:129 cloud-init schema validation failure for: 'not_a_valid_key'
2024-09-16 22:39:25,014 DEBUG curtin.reporting.warning.subiquity/load_cloud_config/extract_autoinstall:45 warning: subiquity/l
oad_cloud_config/extract_autoinstall: cloud-init schema validation failure for: 'not_a_valid_key'
2024-09-16 22:39:25,015 DEBUG subiquity.server.server:836 No autoinstall keys found among bad cloud config. Continuing.
2024-09-16 22:39:25,015 DEBUG curtin.reporting.finish.subiquity/load_cloud_config/extract_autoinstall:45 finish: subiquity/loa
d_cloud_config/extract_autoinstall: SUCCESS:

Note: interactive installs may fail on Oracular due to an unrelated UI bug when passing autoinstall and having an interactive section. I just discovered it when re-running tests to get output. I'll open a bug shortly.

@Chris-Peterson444 Chris-Peterson444 force-pushed the refactor-cloud-init-validation branch 3 times, most recently from 935897b to 02430e4 Compare September 17, 2024 01:09
@Chris-Peterson444
Copy link
Contributor Author

Note: interactive installs may fail on Oracular due to an unrelated UI bug when passing autoinstall and having an interactive section. I just discovered it when re-running tests to get output. I'll open a bug shortly.

PR and bug for the issue: #2085

Copy link
Collaborator

@mwhudson mwhudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. a few niggles but nothing serious.

system_scripts/subiquity-legacy-cloud-init-validate Outdated Show resolved Hide resolved
subiquity/cloudinit.py Outdated Show resolved Hide resolved
subiquity/models/tests/test_subiquity.py Outdated Show resolved Hide resolved
We need to define the SNAP environment variable in dry-run mode so
the `system_scripts_env` function can setup a path to the system_scripts
directory. This is necessary for the subiquity-legacy-cloud-init
-validate script.
Rename validate_cloud_init_schema to validate_cloud_init_top_level_keys
and rename get_schema_failure_keys to get_unknown_keys.

validate_cloud_init_schema wasn't really validating the whole
cloud-init schema, it's really just parsing the output for any
top-level keys that failed to validate. Let's rename it to what
it actually is doing: checking for bad top level keys.

Also renames the original CloudInitSchemaValidationError to
CloudInitSchemaTopLevelKeyError to reflect this change.
Extract cloud-init schema validation to a new system_script
`subiquity-legacy-cloud-init-validate`. This is called "legacy" because,
in the future, we likely want to rely on the cloud-init CLI directly to
do schema validation, but there isn't a pressing need to rely on it now.
For now, all releases will rely on the "legacy" script.

Also let `get_unknown_keys` use `system_scripts_env` for the command
environment to rely on the system cloud-init version.
The "schema" subcommand was introduced in cloud-init 22.2 (see
cloud-init commit [0] ), which was released April 27, 2022. This means
older releases of Focal and Jammy, which are currently supported LTSes,
will fail this code path after we unstage cloud-init from the Subiquity
snap. Add a check to skip this validation if a newer version of
cloud-init is detected.

[0] 3bcffacb216d683241cf955e4f7f3e89431c1491
@Chris-Peterson444 Chris-Peterson444 merged commit 0e9ccb5 into canonical:main Sep 18, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants