-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvm: ref-count storage pool usage #9498
base: 4.19
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 4.19 #9498 +/- ##
===========================================
Coverage 15.10% 15.10%
- Complexity 11220 11225 +5
===========================================
Files 5404 5404
Lines 473460 473486 +26
Branches 57728 59047 +1319
===========================================
+ Hits 71525 71541 +16
- Misses 393941 393948 +7
- Partials 7994 7997 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clgtm, you do have a good test scenario for this, do you @rp- ? Or is it only intermitted (i.e. not automatable)
I'm not sure it is easy to reproducible automate that, as it is a timing/parallelism issue. But we have 2 customers who didn't report any issues with this yet. |
@blueorangutan package |
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10620 |
@blueorangutan test |
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
[SF] Trillian test result (tid-11065)
|
@blueorangutan package |
@rohityadavcloud a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 10927 |
Packaging result [SF]: ✖️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 10950 |
I guess the failed packaging is nothing related to this PR? |
looks like a test was too slow , so might have to do with too busy container. retrying @rp- |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10988 |
@blueorangutan test |
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
[SF] Trillian test result (tid-11364)
|
@blueorangutan package |
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11033 |
@blueorangutan test |
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
[SF] Trillian test result (tid-11408)
|
...ns/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java
Outdated
Show resolved
Hide resolved
...ns/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java
Outdated
Show resolved
Hide resolved
c599a58
to
b896bbc
Compare
@blueorangutan package |
@rp- a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11516 |
@blueorangutan test |
The blue ape won't listen to you if you tell it to run tests @rp- (to prevent our lab getting to full we keep control on regression test runs). Other commands will work (like 'package' and 'ui') It should have told you after your attempt ??? |
@DaanHoogland [SL] unsupported parameters provided. Supported mgmt server os are: |
No it didn't tell me, but I was guessing that I don't have the "rights" to do it. |
@blueorangutan test |
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
@rp- did you test your solution with adding and removing primary storages? |
My fear is that we might have many references to a primary storage and when trying to delete it ACS will not umount it, even if it should. |
[SF] Trillian test result (tid-11765)
|
@JoaoJandre Adding a I'm not really sure how to move forward with this then, maybe don't let the management server create 2 jobs with the same secondary storage pool to the same agent in parallel. |
We could keep the ref-count for the secondary storage only. The |
And on delete? only check if the uuid is part of the refcount map? |
@rp- Yeah, if it is not part of the map, we assume that it can be deleted. If it is, we check the refcount. |
b896bbc
to
427ed5a
Compare
@blueorangutan package |
@JoaoJandre This seem to work in my tests now |
427ed5a
to
7064436
Compare
If a secondary storage pool is used by e.g. 2 concurrent snapshot->template actions, if the first action finished it removed the netfs mount point for the other action. Now the storage pools are usage ref-counted and will only deleted if there are no more users.
7064436
to
20735be
Compare
@blueorangutan package |
@rp- a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11533 |
Description
If a storage pool is used by e.g. 2 concurrent snapshot->template actions, if the first action finished it removed the netfs mount point for the other action.
Now the storage pools are usage ref-counted and will only deleted if there are no more users.
Fixes: #8899
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
Run several snapshot to template actions, that are executed on the same host.
How did you try to break this feature and the system with this change?