Pods stuck in "ContainerCreating" status in AKS: FailedCreatePodSandBox #11478
Comments
Thanks for the detailed report 💯
I am getting the same error on one of my linkerd-cni pods:
Deleting the pod does not resolve this issue for me.
linkerd versions:
I am not on AKS.
@alpeb I have a similar issue on EKS:
Warning FailedCreatePodSandBox 4m20s (x90059 over 13d) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f4df221c93495b1b811911c8a9f371b9483102e8fe2d3c154c51c5d036d11de7": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
Currently, my temporary workaround is to recycle the
This is possibly the race condition mentioned in #10738. Is the race condition fixed in version
Please try a more recent Linkerd release; 2.14.8 is the most recent. See the support policy section at https://linkerd.io/releases/#stable-latest-version-stable-2148
We have not experienced this issue in a while.
What is the issue?
During deployment updates, new pods sometimes fail to finish creation.
A new pod is created and gets stuck in "ContainerCreating". The pod is not even enabled for Linkerd; the linkerd annotation is not set on it.
We pre-install Linkerd in CNI mode on all our clusters, but some teams don't use it. Their pods still run into this issue.
How can it be reproduced?
Unknown. Seems to happen randomly on different nodes.
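Since the failure seems to move between nodes, it may help to group the stuck pods by node before looking at the CNI logs. The following is a minimal sketch, not something from the original report: it assumes the input is the parsed JSON from `kubectl get pods -A -o json`, and the helper name `stuck_pods_by_node` is hypothetical.

```python
from collections import defaultdict


def stuck_pods_by_node(pod_list: dict) -> dict:
    """Group pods waiting in ContainerCreating by the node they are scheduled on.

    `pod_list` is assumed to be the parsed output of
    `kubectl get pods -A -o json` (a PodList object).
    """
    by_node = defaultdict(list)
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        # Pods stuck before their sandbox is created remain in the Pending phase.
        if status.get("phase") != "Pending":
            continue
        reasons = [
            cs.get("state", {}).get("waiting", {}).get("reason")
            for cs in status.get("containerStatuses", [])
        ]
        if "ContainerCreating" in reasons:
            node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
            meta = pod["metadata"]
            by_node[node].append(f'{meta["namespace"]}/{meta["name"]}')
    return dict(by_node)
```

One way to feed it would be to load the `kubectl` output with `json.load` and print the result; a node that keeps reappearing is a candidate for checking its linkerd-cni pod.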
Logs, error output, etc
Describe pod:
https://gist.github.com/oskarm93/335679f5abfc6b0f6c8da198c71f6db9
Linkerd CNI logs (node 09):
https://gist.github.com/oskarm93/3e67a6ff935c55fdb0b42e0c190281d7
Linkerd CNI describe pod (node 09):
https://gist.github.com/oskarm93/b93dbdd4c1977c08514067abdfbf9bc5
Output of linkerd check -o short:
Environment
Kubernetes version: 1.26.6
Environment: AKS
OS: AKSUbuntu-2204gen2containerd-202307.27.0
Linkerd Version:
Possible solution
Restarting the linkerd-cni pod on the node where the stuck pod was scheduled usually solves the problem.
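The workaround above could be scripted roughly as follows. This is only a sketch under stated assumptions: the namespace `linkerd-cni` and the label `k8s-app=linkerd-cni` are assumed defaults and may differ in your installation, and the function names are hypothetical. Deleting the pod lets the DaemonSet recreate it on the same node; note that one commenter in this thread reports that deleting the pod did not resolve the error for them.

```python
import subprocess


def restart_cni_pod_cmd(node: str,
                        namespace: str = "linkerd-cni",          # assumed namespace
                        selector: str = "k8s-app=linkerd-cni"):  # assumed pod label
    """Build the kubectl command that deletes the CNI pod running on `node`."""
    return [
        "kubectl", "delete", "pod",
        "--namespace", namespace,
        "--selector", selector,
        "--field-selector", f"spec.nodeName={node}",
    ]


def restart_cni_pod(node: str) -> None:
    """Run the command; requires kubectl and access to the cluster."""
    subprocess.run(restart_cni_pod_cmd(node), check=True)
```

Building the argument list separately from running it makes it easy to print and review the command before touching the cluster.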
Additional context
No response
Would you like to work on fixing this bug?
None