Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve validation and robustness around volumes management declared in gadget.yaml #217

Merged
merged 11 commits into from
Jun 24, 2024

Conversation

upils
Copy link
Collaborator

@upils upils commented Apr 19, 2024

The gadget.yaml supports having no volumes or several ones. This PR fixes several issues/shortcomings around this feature:

  • It was not properly checked at least one volume was present (in the u-i context at least one is mandatory)
  • The Validate() function provided by the snap/gadget lib was not used. So basic consistency checks were missing (ensuring no duplicate roles, etc.)
  • the call to fixMissingSystemData in postProcessGadgetYaml was done considering the last volume seen previously was the "main one". But it could led to a panic if no volume was defined in the gadget.yaml. This is now explicit that the last volume is used, until we add a way to identify the "main volume".

As a consequence of these changes the validation on the gadget.yaml is stricter. So maybe some users might encounter errors but these would be legitimate bugs in their gadget.yaml and we should help them fixing these.

This PR also contains a small fix on the badge in the Readme to get accurate report on the landing page of the project on GitHub.

@upils upils changed the title Improve validation and robustness around volumes declared in gadget.yaml Improve validation and robustness around volumes management declared in gadget.yaml Apr 19, 2024
@upils upils force-pushed the fix-no-volume-in-gadget-yaml branch 2 times, most recently from 11187b8 to 9aef451 Compare April 19, 2024 14:07
Copy link

codecov bot commented Apr 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.25%. Comparing base (8c11aaf) to head (c225466).
Report is 20 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #217      +/-   ##
==========================================
- Coverage   94.27%   94.25%   -0.03%     
==========================================
  Files          16       16              
  Lines        3284     3341      +57     
==========================================
+ Hits         3096     3149      +53     
- Misses        120      123       +3     
- Partials       68       69       +1     
Flag Coverage Δ
unittests 94.25% <100.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@upils upils self-assigned this Apr 22, 2024
@upils upils force-pushed the fix-no-volume-in-gadget-yaml branch 2 times, most recently from 72a391e to d74f0c7 Compare April 25, 2024 07:41
@upils
Copy link
Collaborator Author

upils commented Apr 25, 2024

In the future I would like to add a test of a real use case with 2 volumes in the gadget.yaml. AFAIK there is no current usage of ubuntu-image 3.X with several volumes. So I should keep an eye out for it. But I do not want to further delay merging this.

@upils upils requested a review from sil2100 April 25, 2024 07:44
@upils
Copy link
Collaborator Author

upils commented Apr 25, 2024

@alfonsosanchezbeato would you mind reviewing this too? Specifically the use of the Validate() function from the gadget lib and changes in fixMissingSystemData().

@upils upils force-pushed the fix-no-volume-in-gadget-yaml branch from d74f0c7 to c225466 Compare June 18, 2024 12:49
Copy link
Collaborator

@sil2100 sil2100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gadget bits, clean-up, validation and checks are good. I have some questions regarding the newly added MainVolumeName variable. See inline.

@@ -129,6 +129,9 @@ type StateMachine struct {
// names of images for each volume
VolumeNames map[string]string

// name of the "main volume"
MainVolumeName string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am possibly missing something here so I apologize if this is obvious, but how is the MainVolumeName field used? I see it being set, but... I don't actually see it being consumed anywhere? What did I miss?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No you are completely right, this is currently unused. This was done on purpose. When working on this and talking with Alfonso, we came to the conclusion that the "main volume" should be the one with the system-seed role. And calling gadget.Validate() (which we did not do before) ensure there is a single structure with this role.

BUT, considering this one to be the "main one" is not really standard and I am not 100% sure it is the currently the case for our users. See the comment here https://github.com/canonical/ubuntu-image/pull/217/files#diff-c31b17e50caf5e415f8b656ff1a4d6ca2bcaf9460e272407bebbe3279090d5f9R405

So my idea was to make it explicit in the code for now that we decide what is the main volume, and then later, when we are sure that the new validations this PR introduce are not breaking anything (or that users fixed their gadgets) we can move to the next step and enforce it. So for now we keep the fact that the last volume is the "main one".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thank you for the explanation!

}
stateMachine.IsSeeded = true
// The "main" volume is the one with a system-seed structure
stateMachine.MainVolumeName = structure.VolumeName
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A follow up to the previous question about usage of MainVolumeName, it looks like it's only set for the Ubuntu Core case (where we have seeded images). Is this field only somehow used in the UC20+ case?

Copy link
Collaborator Author

@upils upils Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is set when calling postProcessGadgetYaml, which is a state function for both classic and snap builds. I did not add any check for the version of Ubuntu Core and I am not aware that would only be valid for UC20+. So it should always be set, as long as there is at least one volume (which we also make sure of now with validateVolumes()).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the reason why I commented on this and mentioned UC20+ images is that basically we set no MainVolumeName for classic cases. For non core images we have no system-seed partitions (at least the UC20+ ones, since UC18 also didn't use the seed).
I like the idea to consider the volume with the seed as the main volume, but in this case we basically say that we 'do not consider any volume as main for other cases'. What I was thinking that maybe, before we get better logic, for other cases we should consider the volume that carries the system-data partition to be the main one.

Not that it really matters as, like you mentioned in the other comment, this is not currently used. But I'd like us to be consistent and use it throughout, as this is done in a step that is not core-specific.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I forgot about that! I agree that making the volume holding the system-data the main one seems to make sense. But currently the goal of identifying the main volume is to know where (in the future) to add the system-data structure if it is missing. So in the classic case, intentionally identifying the main volume by looking for the system-data role is not very useful (it is a kind of a deadlock :)). We also already do that and storing it in the rootfsSeen var and then use it in fixMissingSystemData(). Currently we also check there is only one volume in the gadget info, before adding the new system-data structure.

So I suppose the next step would be to find a better heuristic to identify the main volume and be able to add a system-data to the right one.

I agree having a consistent behavior is good, but here I think that would be a bit confusing. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, we can discuss this further when the need for MainVolumeName appears!

@upils
Copy link
Collaborator Author

upils commented Jun 20, 2024

For now this is not possible to ignore some vulns with govulncheck (see golang/go#59507). This is why the Dependencies vulnerability check is failing (because a vuln in snapd is not declared fixed even though it is in the version we use).

@upils upils requested a review from sil2100 June 21, 2024 07:17
@sil2100 sil2100 merged commit b7a18b7 into main Jun 24, 2024
11 of 15 checks passed
@sil2100 sil2100 deleted the fix-no-volume-in-gadget-yaml branch June 24, 2024 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants