Fix rbd pools trash purge scheduling #2357

Open · wants to merge 1 commit into main
Conversation

katarimanojk
Contributor

This patch ensures that the right conditions
and parameters are used when scheduling trash
purge on the RBD pools.

Jira: https://issues.redhat.com/browse/OSPRH-8504

Contributor

openshift-ci bot commented Sep 17, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign pablintino for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions github-actions bot marked this pull request as draft September 17, 2024 06:46

Thanks for the PR! ❤️
I'm marking it as a draft; once you're happy with it merging and the PR is passing CI, click the "Ready for review" button below.


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c8ba8105a4a94ca8954091b9cd82167d

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 46m 40s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 16m 23s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 30m 57s
podified-multinode-hci-deployment-crc RETRY_LIMIT Host unreachable in 1h 38m 13s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 07s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 10s
✔️ build-push-container-cifmw-client SUCCESS in 31m 14s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 5m 08s

- cifmw_cephadm_enable_trash_scheduler | default(false)
- cifmw_cephadm_pools is defined
- cifmw_cephadm_pools | length > 0
Contributor


do we need a | default([]) here?

Contributor


I think these checks make | default([]) unnecessary. We specifically want to skip this block if there are no cifmw_cephadm_pools, so the updated when is fine. I assume you were just proposing an alternative way to write the when and not pointing out a logic problem; if there is a logic problem, please elaborate.
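
To make that concrete, here is a rough sketch of the guarded block (the debug task is only a stand-in for the real rbd command; the point is that the when guard makes a default filter on the loop unnecessary):

    - name: Manage RBD trash purge scheduling
      when:
        - cifmw_cephadm_enable_trash_scheduler | default(false)
        - cifmw_cephadm_pools is defined
        - cifmw_cephadm_pools | length > 0
      block:
        - name: Set trash interval
          # the when above already guarantees the list exists and is non-empty,
          # so the loop does not need | default([])
          ansible.builtin.debug:
            msg: "would schedule trash purge for pool {{ item.name }}"
          loop: "{{ cifmw_cephadm_pools }}"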

Contributor


+1

changed_when: false
become: true
loop: "{{ [ cinder_pool.name | default('volumes') ] + cinder_pool.cinder_extra_pools | default([]) }}"
loop: "{{ cifmw_cephadm_pools | default([]) }}"
Contributor


Is the goal to set the rbd trash purge scheduler for all the ceph pools with an rbd application set, or was the goal to enable it only for cinder pools?

Contributor Author


You are right, I think the goal here is to set it only for the cinder pools.

In tripleo, cinder_pool was defined in ceph_pools like this:

ceph_pools:
  cinder_pool:
    name: 'volumes'
    enabled: true
    cinder_extra_pools: [altrbd, pool2, pool3]

In cifmw, we have:

cifmw_cephadm_pools:
  - name: volumes
    pg_autoscale_mode: True
    target_size_ratio: 0.3
    application: rbd

@fultonj
So we have only one pool, volumes, for cinder. Should we set the rbd trash purge scheduler only for that pool and remove the loop?

Contributor


I would only set it for volumes if it is part of cifmw_cephadm_pools (this should be the when condition). Right now this is wrong because you're setting the trash scheduler for all pools, and I don't think that was the original goal.

Contributor


+1 to what fmount said.

block:
  - name: Get the RBD ceph_cli
    ansible.builtin.include_tasks: ceph_cli.yml
    vars:
      ceph_command: rbd

  - name: Set trash interval
    when: item.application == 'rbd'
Contributor


This seems fragile to me, because if by any chance application is not set in the data structure, this task fails with application not defined.
Am I right? @katarimanojk can you double-check this?
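
For example, a quick sketch of the failure mode and a more defensive condition (the debug task is only a placeholder, and this is not the approach finally agreed below):

    # if an entry has no 'application' key, the strict comparison fails with
    # "'dict object' has no attribute 'application'"; a default avoids that:
    - name: Set trash interval
      ansible.builtin.debug:
        msg: "would schedule trash purge for pool {{ item.name }}"
      when: item.application | default('') == 'rbd'
      loop: "{{ cifmw_cephadm_pools }}"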

Contributor


I think most of the time people will use the hard-coded cifmw_cephadm_pools, and I think @katarimanojk was trying to set it for all RBD pools and exclude it for cephfs. However, I agree with @fmount that we only want to set it for cinder, which will be called volumes. Unless someone overrides it, but in that case we'll have no way of knowing which is the cinder volume unless we have some other sort of label. We could introduce a separate variable like volumes_with_trash_purge and default it to ['volumes']. That at least makes a clear interface and allows the user to override which volumes will have it set. I'm open to other methods.

https://github.com/openstack-k8s-operators/ci-framework/blob/main/playbooks/ceph.yml#L273-L297
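
A rough sketch of that alternative (volumes_with_trash_purge is only a proposed name here, not an existing role variable, and the debug task stands in for the real rbd command):

    - name: Set trash interval
      ansible.builtin.debug:
        msg: "would schedule trash purge for pool {{ item.name }}"
      when: item.name in (volumes_with_trash_purge | default(['volumes']))
      loop: "{{ cifmw_cephadm_pools }}"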

Contributor


Yeah, that solution is feasible but more complicated (I'm thinking for long term maintenance).
I was thinking more about something like "tagging" the pool where we want to enable trashing:

cifmw_cephadm_pools:
  - name: volumes
    pg_autoscale_mode: True
    target_size_ratio: 0.3
    application: rbd
    trash_enabled: True

so we can keep iterating over the data structure, and enable the trash purge scheduler only for pools where we passed that trash_enabled boolean (which can be | default(false) for the general use case).
This should be simple enough and make the code less error-prone.
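
For illustration, a minimal sketch of the task with such a flag (the interval variable name is made up for the example, and the rbd CLI prefix is assumed to be exposed by ceph_cli.yml as cifmw_cephadm_ceph_cli, as suggested by the "Get the RBD ceph_cli" include above):

    - name: Set trash interval
      ansible.builtin.command: >-
        {{ cifmw_cephadm_ceph_cli }} trash purge schedule add
        --pool {{ item.name }}
        {{ cifmw_cephadm_trash_schedule_interval | default('1d') }}
      changed_when: false
      become: true
      # only pools explicitly tagged with trash_enabled get the scheduler
      when: item.trash_enabled | default(false)
      loop: "{{ cifmw_cephadm_pools }}"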

Contributor


I like your proposal of the tag more. Let's go for that.

Contributor Author


@fmount Thanks for your suggestion, I will update the code accordingly.
