
Fix rbd pools trash purge scheduling #2357

Open · wants to merge 1 commit into base: main
12 changes: 7 additions & 5 deletions roles/cifmw_cephadm/tasks/pools.yml
@@ -47,19 +47,21 @@

  - name: Configure the RBD trash purge scheduler
    when:
-     - cifmw_enabled_services | default([]) | intersect(['cinder_volume'])
+     - cifmw_cephadm_enable_trash_scheduler | default(false)
+     - cifmw_cephadm_pools is defined
+     - cifmw_cephadm_pools | length > 0
Contributor:
Do we need a | default([]) here?

Contributor:
I think these checks make | default([]) unnecessary. We specifically want to skip this block if there are no cifmw_cephadm_pools, so the updated when is fine. I assume you were proposing an alternative form for the when rather than pointing out a logic problem; if there is a logic problem, please elaborate.

Contributor:
+1

    block:
      - name: Get the RBD ceph_cli
        ansible.builtin.include_tasks: ceph_cli.yml
        vars:
          ceph_command: rbd

      - name: Set trash interval
+       when: item.application == 'rbd'
Contributor:
This seems fragile to me: if application happens not to be set in the data structure, this task fails with "application not defined". Am I right? @katarimanojk, can you double-check this?

Contributor:
I think most of the time people will use the hard-coded cifmw_cephadm_pools, and @katarimanojk was trying to set it for all RBD pools while excluding CephFS. However, I agree with @fmount that we only want to set it for cinder, whose pool will be called volumes. Someone could override that, but then we would have no way of knowing which pool is the cinder volume unless we have some other sort of label. We could introduce a separate variable like volumes_with_trash_purge and default it to ['volumes']. That at least makes a clear interface and lets the user override which volumes have it set. I'm open to other methods.

https://github.com/openstack-k8s-operators/ci-framework/blob/main/playbooks/ceph.yml#L273-L297

Contributor:
Yeah, that solution is feasible but more complicated (I'm thinking of long-term maintenance). I was thinking more of "tagging" the pool where we want to enable trashing:

cifmw_cephadm_pools:
  - name: volumes
    pg_autoscale_mode: True
    target_size_ratio: 0.3
    application: rbd
    trash_enabled: True

so we can keep iterating over the data structure and enable the trash purge scheduler only for pools that pass the trash_enabled boolean (which can be | default(false) for the general case). This should be simple enough and makes the code less error-prone.

Contributor:
I like your proposal of the tag more. Let's go for that.

Contributor (Author):
@fmount Thanks for your suggestion, I will update the code accordingly.
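For context, the tagging approach discussed above might look like the following task. This is a hypothetical sketch, not code from this PR: the trash_enabled key and its use in the when condition are the proposal, while the other variable names follow the existing role.

```yaml
# Hypothetical sketch of the trash_enabled tagging approach:
# only pools that opt in via trash_enabled get a purge schedule.
- name: Set trash interval
  when: item.trash_enabled | default(false)
  ansible.builtin.command:
    cmd: >-
      {{ cifmw_cephadm_ceph_cli }} trash purge schedule add
      {{ cifmw_cephadm_rbd_trash_interval | default(15) }} --pool {{ item.name }}
  changed_when: false
  become: true
  loop: "{{ cifmw_cephadm_pools | default([]) }}"
```

Because the filter defaults to false, pools that never mention trash_enabled are skipped, which also avoids the "application not defined" fragility raised earlier.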

        ansible.builtin.command:
-         cmd: |
-           {{ cifmw_cephadm_ceph_cli }} trash purge schedule add \
-           {{ cifmw_cephadm_rbd_trash_interval | default(15) }} --pool {{ item }}
+         cmd: >-
+           {{ cifmw_cephadm_ceph_cli }} trash purge schedule add
+           {{ cifmw_cephadm_rbd_trash_interval | default(15) }} --pool {{ item.name }}
        changed_when: false
        become: true
-       loop: "{{ [ cinder_pool.name | default('volumes') ] + cinder_pool.cinder_extra_pools | default([]) }}"
+       loop: "{{ cifmw_cephadm_pools | default([]) }}"
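A note on the cmd change: with a literal block scalar (|), YAML preserves newlines, so the old version needed a trailing backslash for shell line continuation; with a folded block scalar (>-), lines are joined with spaces and the trailing newline is stripped, so the command arrives as a single line with no continuation characters. A minimal illustration (standalone YAML, not part of the role):

```yaml
# Literal block: newlines kept, so the backslash stays in the string.
literal: |
  line one \
  line two

# Folded block with strip: lines joined by a space, no trailing newline.
# Parses as the single string "line one line two".
folded: >-
  line one
  line two
```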
Contributor:
Is the goal to set the RBD trash purge scheduler on all Ceph pools with an rbd application set, or only on the cinder pools?

Contributor (Author):
You are right, I think the goal here is to set it only for the cinder pools.

In tripleo, cinder_pool was defined in ceph_pools like this:

ceph_pools:
  cinder_pool:
    name: 'volumes'
    enabled: true
    cinder_extra_pools: [altrbd, pool2, pool3]

In cifmw, we have:

cifmw_cephadm_pools:
  - name: volumes
    pg_autoscale_mode: True
    target_size_ratio: 0.3
    application: rbd

@fultonj
So we have only one pool, volumes, for cinder. Should we set the RBD trash purge scheduler only for that pool and remove the loop?

Contributor:
I would only set it for volumes if it is part of cifmw_cephadm_pools (this should be the when condition). Right now this is wrong because you're setting the trash scheduler for all pools, and I don't think that was the original goal.

Contributor:
+1 to what fmount said.
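One way to express that suggestion (schedule trash purge only for the cinder volumes pool, and only when it actually exists in cifmw_cephadm_pools) is sketched below. This is a hypothetical alternative, not code from the PR; the hard-coded 'volumes' name is the assumption being discussed in the thread.

```yaml
# Hypothetical: run only when a pool named 'volumes' is defined,
# and target that single pool instead of looping over all pools.
- name: Set trash interval for the cinder volumes pool
  when: cifmw_cephadm_pools | default([])
        | selectattr('name', 'equalto', 'volumes') | list | length > 0
  ansible.builtin.command:
    cmd: >-
      {{ cifmw_cephadm_ceph_cli }} trash purge schedule add
      {{ cifmw_cephadm_rbd_trash_interval | default(15) }} --pool volumes
  changed_when: false
  become: true
```

Note this loses the extensibility of the trash_enabled tag agreed on earlier in the thread, which is why the tagging approach was preferred.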