Leaking kubeadmconfigtemplates, openstackmachinetemplates ... #105

garloff · 2024-06-11T17:43:13Z

/kind bug

What steps did you take and what happened:
A management cluster (kind) running in an SCS-2V-4 VM for 3 months (mostly idle) became unusable.
After some debugging, it was found that the kube-apiserver's memory usage had exploded to > 2GiB RSS.
This caused the machine to aggressively reclaim memory (kswapd0), only to hit major page faults that paged the same memory back in. The result was a system load > 50 (on a 2-vCPU server), well over 10k major page faults/s, and > 500 MB/s of reads from disk.

What did you expect to happen:
4GiB of RAM should be sufficient for a management host that is not too busy.

Anything else you would like to add:
My assumption is that the CSO/CSPO are driving up the kube-apiserver memory usage by storing too many objects.
So far I have found kubeadmconfigtemplates and clusterclasses to exist in excessive numbers.

Environment:

kind v0.20.0 go1.20.4 linux/amd64
Ubuntu 22.04 VM on an SCS-2V-4 flavor (2vCPU, 4GiB RAM, x86-64)
CSO/CSPO as of 93d ago (let me know how I can report this better)

garloff · 2024-06-11T17:47:15Z

13683 kubeadmconfigtemplates:

cluster2    capi-openstack-alpha-1-28                      93d
cluster4    capi-openstack-alpha-1-28                      93d
cluster4    cs-cluster4-capi-openstack-alpha-1-28-ljnkh    93d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-222ck   55d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-224lk   14d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-225pj   68d
[...]

garloff · 2024-06-11T18:03:48Z

15646 openstackmachinetemplates:

cluster2    capi-openstack-alpha-1-28                      94d
cluster2    capi-openstack-alpha-1-28-control-plane        94d
cluster4    capi-openstack-alpha-1-28                      93d
cluster4    capi-openstack-alpha-1-28-control-plane        93d
cluster4    cs-cluster4-capi-openstack-alpha-1-28-mmjrw    93d
cluster4    cs-cluster4-xlh9r                              93d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-226gt   87d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-226qq   76m
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-2275d   92d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-228jx   88d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-229r2   89d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-229vc   79d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-22b2t   30d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-22bm4   73d
cluster4    cs-cluster4a-capi-openstack-alpha-1-28-22cl4   84d
[...]
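To see which name prefixes account for the bulk of the objects, listings like the ones above can be grouped by namespace and base name. This is only a rough sketch: the function name `summarize` is hypothetical, and it assumes the input has the NAMESPACE / NAME / AGE column layout shown above and that generated names end in a dash plus five characters from the Kubernetes random-suffix alphabet.

```python
import re
from collections import Counter

# Assumption: generated names end in "-" plus five characters from the
# alphabet Kubernetes uses for random suffixes (no vowels, no 0/1/3),
# e.g. "cs-cluster4a-capi-openstack-alpha-1-28-222ck".
GENERATED_SUFFIX = re.compile(r"-[bcdfghjklmnpqrstvwxz2456789]{5}$")


def summarize(listing: str) -> Counter:
    """Count objects per (namespace, name with generated suffix removed)."""
    counts: Counter = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines, a header row, and truncation markers like "[...]".
        if len(fields) < 2 or fields[0] == "NAMESPACE":
            continue
        namespace, name = fields[0], fields[1]
        counts[(namespace, GENERATED_SUFFIX.sub("", name))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Reads a listing from stdin and prints the largest groups first.
    for (namespace, base), count in summarize(sys.stdin.read()).most_common(10):
        print(f"{count:6d}  {namespace}  {base}")
```

One way to feed it would be to pipe the output of `kubectl get openstackmachinetemplates -A --no-headers` into the script; a single prefix with thousands of entries points at the object that is being regenerated over and over.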

garloff · 2024-06-11T18:40:24Z

kubectl delete -n cluster4 kubeadmconfigtemplate <LIST OF 13000 names> takes more than an hour, but seems to reduce memory usage. The same goes for openstackmachinetemplate. I also compacted and defragmented etcd to recover.
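For anyone repeating the cleanup, the deletion could be split into smaller batches instead of one call with 13000 names. The following is a hedged sketch, not what was actually run here: the helper names, the batch size of 200, and the use of `--wait=false` are all assumptions, and whether batching is faster in practice has not been measured. It also does not decide which templates are safe to delete; the list of names has to come from elsewhere and should exclude anything still referenced by a cluster.

```python
import subprocess
from typing import Iterator, List, Sequence


def batches(names: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` names."""
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


def delete_commands(resource: str, namespace: str,
                    names: Sequence[str], size: int = 200) -> List[List[str]]:
    """Build one `kubectl delete` argument list per batch of names."""
    return [
        # Assumption: --wait=false is used so kubectl does not block on
        # finalizers for every object in the batch.
        ["kubectl", "delete", resource, "-n", namespace, "--wait=false", *batch]
        for batch in batches(names, size)
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names for illustration; dry run only prints the commands.
    example = ["cs-example-aaaaa", "cs-example-bbbbb", "cs-example-ccccc"]
    run(delete_commands("kubeadmconfigtemplate", "cluster4", example, size=2))
```

The dry run is the default so that the generated commands can be reviewed before anything is removed.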

garloff changed the title ~~Leaking kubeadmconfigtemplates, clusterclasses, ...~~ Leaking kubeadmconfigtemplates, ... Jun 11, 2024

garloff changed the title ~~Leaking kubeadmconfigtemplates, ...~~ Leaking kubeadmconfigtemplates, openstackmachinetemplates ... Jun 11, 2024
