Fix rbd pools trash purge scheduling #2357

Open · wants to merge 1 commit into main
Conversation

katarimanojk
Contributor

This patch ensures that the right conditions
and parameters are used when scheduling trash
purge on the RBD pools.

Jira: https://issues.redhat.com/browse/OSPRH-8504

Contributor

openshift-ci bot commented Sep 17, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign pablintino for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions github-actions bot marked this pull request as draft September 17, 2024 06:46

Thanks for the PR! ❤️
I'm marking it as a draft; once you're happy with it merging and the PR is passing CI, click the "Ready for review" button below.


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c8ba8105a4a94ca8954091b9cd82167d

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 46m 40s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 16m 23s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 30m 57s
podified-multinode-hci-deployment-crc RETRY_LIMIT Host unreachable in 1h 38m 13s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 07s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 10s
✔️ build-push-container-cifmw-client SUCCESS in 31m 14s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 5m 08s

- cifmw_cephadm_enable_trash_scheduler | default(false)
- cifmw_cephadm_pools is defined
- cifmw_cephadm_pools | length > 0
Contributor


do we need a | default([]) here?

Contributor


I think these checks make | default([]) unnecessary. We specifically want to skip this block if there are no cifmw_cephadm_pools, so the updated when is fine. I assume you were just proposing an alternative way to write the when and not pointing out a logic problem; if there is a logic problem, please elaborate.
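
To make that concrete, here is a rough sketch of the guarded block (the debug task is only a stand-in for the real rbd command; the point is that the when guard makes a default filter on the loop unnecessary):

    - name: Manage RBD trash purge scheduling
      when:
        - cifmw_cephadm_enable_trash_scheduler | default(false)
        - cifmw_cephadm_pools is defined
        - cifmw_cephadm_pools | length > 0
      block:
        - name: Set trash interval
          # the when above already guarantees the list exists and is non-empty,
          # so the loop does not need | default([])
          ansible.builtin.debug:
            msg: "would schedule trash purge for pool {{ item.name }}"
          loop: "{{ cifmw_cephadm_pools }}"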

Contributor


+1

changed_when: false
become: true
loop: "{{ [ cinder_pool.name | default('volumes') ] + cinder_pool.cinder_extra_pools | default([]) }}"
loop: "{{ cifmw_cephadm_pools | default([]) }}"
Contributor


Is the goal to set the rbd trash purge scheduler for all the ceph pools with an rbd application set, or was the goal to enable it only for cinder pools?

Contributor Author


You are right, I think the goal here is to set it only for the cinder pools.

In tripleo, cinder_pool was defined in ceph_pools like this:

ceph_pools:
  cinder_pool:
    name: 'volumes'
    enabled: true
    cinder_extra_pools: [altrbd, pool2, pool3]

In cifmw, we have:

cifmw_cephadm_pools:
  - name: volumes
    pg_autoscale_mode: True
    target_size_ratio: 0.3
    application: rbd

@fultonj
So we have only one pool, volumes, for cinder. Should we set the rbd trash purge scheduler only for that pool and remove the loop?

Contributor


I would only set it for volumes if it is part of cifmw_cephadm_pools (this should be the when condition). Right now this is wrong because you're setting the trash scheduler for all pools, and I don't think that was the original goal.

Contributor


+1 to what fmount said.

block:
  - name: Get the RBD ceph_cli
    ansible.builtin.include_tasks: ceph_cli.yml
    vars:
      ceph_command: rbd

  - name: Set trash interval
    when: item.application == 'rbd'
Contributor


This seems fragile to me, because if by any chance application is not set in the data structure, this task fails with application not defined.
Am I right? @katarimanojk can you double-check this?
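
For example, a quick sketch of the failure mode and a more defensive condition (the debug task is only a placeholder, and this is not the approach finally agreed below):

    # if an entry has no 'application' key, the strict comparison fails with
    # "'dict object' has no attribute 'application'"; a default avoids that:
    - name: Set trash interval
      ansible.builtin.debug:
        msg: "would schedule trash purge for pool {{ item.name }}"
      when: item.application | default('') == 'rbd'
      loop: "{{ cifmw_cephadm_pools }}"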

Contributor


I think most of the time people will use the hard-coded cifmw_cephadm_pools, and I think @katarimanojk was trying to set it for all RBD pools and exclude it for cephfs. However, I agree with @fmount that we only want to set it for cinder, which will be called volumes. Unless someone overrides it, but in that case we'll have no way of knowing which is the cinder volume unless we have some other sort of label. We could introduce a separate variable like volumes_with_trash_purge and default it to ['volumes']. That at least makes a clear interface and allows the user to override which volumes will have it set. I'm open to other methods.

https://github.com/openstack-k8s-operators/ci-framework/blob/main/playbooks/ceph.yml#L273-L297
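
A rough sketch of that alternative (volumes_with_trash_purge is only a proposed name here, not an existing role variable, and the debug task stands in for the real rbd command):

    - name: Set trash interval
      ansible.builtin.debug:
        msg: "would schedule trash purge for pool {{ item.name }}"
      when: item.name in (volumes_with_trash_purge | default(['volumes']))
      loop: "{{ cifmw_cephadm_pools }}"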

Contributor


Yeah, that solution is feasible but more complicated (I'm thinking for long term maintenance).
I was thinking more about something like "tagging" the pool where we want to enable trashing:

cifmw_cephadm_pools:
  - name: volumes
    pg_autoscale_mode: True
    target_size_ratio: 0.3
    application: rbd
    trash_enabled: True

so we can keep iterating over the data structure, and enable the trash purge scheduler only for pools where we passed that trash_enabled boolean (which can be | default(false) for the general use case).
This should be simple enough and make the code less error-prone.
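
For illustration, a minimal sketch of the task with such a flag (the interval variable name is made up for the example, and the rbd CLI prefix is assumed to be exposed by ceph_cli.yml as cifmw_cephadm_ceph_cli, as suggested by the "Get the RBD ceph_cli" include above):

    - name: Set trash interval
      ansible.builtin.command: >-
        {{ cifmw_cephadm_ceph_cli }} trash purge schedule add
        --pool {{ item.name }}
        {{ cifmw_cephadm_trash_schedule_interval | default('1d') }}
      changed_when: false
      become: true
      # only pools explicitly tagged with trash_enabled get the scheduler
      when: item.trash_enabled | default(false)
      loop: "{{ cifmw_cephadm_pools }}"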

Contributor


I like your proposal of the tag more. Let's go for that.

Contributor Author


@fmount Thanks for your suggestion, I will update the code accordingly.
