context deadline exceeded,error when patching "/dev/shm/2383404947 #1571

Closed
hasanhakkaev opened this issue Jul 29, 2024 · 1 comment
Labels
enhancement New feature or request

hasanhakkaev commented Jul 29, 2024

Description

We have been happily running the policy-controller across more than 600 namespaces. We are currently facing an issue with an ArgoCD app that has around 30 pods. After syncing, we get the errors shown below.

Version
policy controller: v0.8.2
helm chart: 0.6.8
kubernetes version: v1.29.6-gke.1038001
KMS used: Vault

The policy-controller is running with 2 replicas, and we don't see any resource-related issues with it.

ArgoCD error

one or more objects failed to apply, reason: error when patching "/dev/shm/2179513921": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/4244754874": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/2383404947": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/1247502162": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/1156143926": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/302217473": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/2355632095": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/634602705": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded,error when patching "/dev/shm/3366411207": Internal error occurred: failed calling webhook "policy.sigstore.dev": failed to call webhook: Post "https://webhook.cosign-system.svc:443/validations?timeout=10s": context deadline exceeded (retried 5 times).

Policy controller logs

{"level":"error","ts":"2024-07-29T08:13:02.441Z","logger":"policy-controller","caller":"webhook/validation.go:47","msg":"error validating signatures: Get \"<redacted_container_registry>\": context canceled","commit":"3c52aec-dirty","knative.dev/kind":"apps/v1, Kind=Deployment","knative.dev/namespace":"redacted-namespace","knative.dev/name":"redacted","knative.dev/operation":"UPDATE","knative.dev/resource":"apps/v1, Resource=deployments","knative.dev/subresource":"","knative.dev/userinfo":"system:serviceaccount:argocd:argocd-application-controller","stacktrace":"github.com/sigstore/policy-controller/pkg/webhook.valid\n\tgithub.com/sigstore/policy-controller/pkg/webhook/validation.go:47\ngithub.com/sigstore/policy-controller/pkg/webhook.ValidatePolicySignaturesForAuthority\n\tgithub.com/sigstore/policy-controller/pkg/webhook/validator.go:785\ngithub.com/sigstore/policy-controller/pkg/webhook.ValidatePolicy.func1\n\tgithub.com/sigstore/policy-controller/pkg/webhook/validator.go:531"}

{"level":"warn","ts":"2024-07-29T08:13:02.441Z","logger":"policy-controller","caller":"webhook/validator.go:1156","msg":"Failed to validate at least one policy for <redacted_image_name>@sha256:ad2167ad0083d5b272c30877669dcfa816a432123df3f0c411c248e1a1f746b4 wanted 1 policies, only validated 0","commit":"3c52aec-dirty","knative.dev/kind":"apps/v1, Kind=Deployment","knative.dev/namespace":"redacted-namespace","knative.dev/name":"redacted","knative.dev/operation":"UPDATE","knative.dev/resource":"apps/v1, Resource=deployments","knative.dev/subresource":"","knative.dev/userinfo":"system:serviceaccount:argocd:argocd-application-controller"}

{"level":"error","ts":"2024-07-29T08:13:02.441Z","logger":"policy-controller","caller":"validation/validation_admit.go:183","msg":"Failed the resource specific validation","commit":"3c52aec-dirty","knative.dev/kind":"apps/v1, Kind=Deployment","knative.dev/namespace":"redacted-namespace","knative.dev/name":"redacted","knative.dev/operation":"UPDATE","knative.dev/resource":"apps/v1, Resource=deployments","knative.dev/subresource":"","knative.dev/userinfo":"system:serviceaccount:argocd:argocd-application-controller","stacktrace":"knative.dev/pkg/webhook/resourcesemantics/validation.validate\n\tknative.dev/[email protected]/webhook/resourcesemantics/validation/validation_admit.go:183\nknative.dev/pkg/webhook/resourcesemantics/validation.(*reconciler).Admit\n\tknative.dev/[email protected]/webhook/resourcesemantics/validation/validation_admit.go:79\nknative.dev/pkg/webhook.admissionHandler.func1\n\tknative.dev/[email protected]/webhook/admission.go:123\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2122\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2500\nknative.dev/pkg/webhook.(*Webhook).ServeHTTP\n\tknative.dev/[email protected]/webhook/webhook.go:302\nknative.dev/pkg/network/handlers.(*Drainer).ServeHTTP\n\tknative.dev/[email protected]/network/handlers/drain.go:113\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2936\nnet/http.(*conn).serve\n\tnet/http/server.go:1995"}

{"level":"info","ts":"2024-07-29T08:13:02.441Z","logger":"policy-controller","caller":"webhook/admission.go:151","msg":"remote admission controller audit annotations=map[string]string(nil)","commit":"3c52aec-dirty","knative.dev/kind":"apps/v1, Kind=Deployment","knative.dev/namespace":"redacted-namespace","knative.dev/name":"redacted","knative.dev/operation":"UPDATE","knative.dev/resource":"apps/v1, Resource=deployments","knative.dev/subresource":"","knative.dev/userinfo":"system:serviceaccount:argocd:argocd-application-controller","admissionreview/uid":"6cd0df49-5095-42b4-8393-4abbe03ebace","admissionreview/allowed":false,"admissionreview/result":"&Status{ListMeta:ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Status:Failure,Message:validation failed: context was canceled before validation completed: ,Reason:BadRequest,Details:nil,Code:400,}"}
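The log lines above are structured JSON, so they can be condensed to the few fields that matter when triaging. The following is a minimal sketch, not part of policy-controller: the `logEntry` type and `summarize` function are hypothetical names, and the sample line in `main` is an abbreviated copy of the warning above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry picks out a few of the keys that appear in the
// policy-controller log lines quoted in this issue.
// The type itself is a hypothetical helper for illustration.
type logEntry struct {
	Level     string `json:"level"`
	Timestamp string `json:"ts"`
	Caller    string `json:"caller"`
	Msg       string `json:"msg"`
	Kind      string `json:"knative.dev/kind"`
	Operation string `json:"knative.dev/operation"`
}

// summarize decodes one JSON log line and returns a short one-line summary.
func summarize(line string) (string, error) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return "", fmt.Errorf("not a JSON log line: %w", err)
	}
	return fmt.Sprintf("%s %s [%s] %s", e.Timestamp, e.Level, e.Caller, e.Msg), nil
}

func main() {
	// Abbreviated sample based on the warning logged by validator.go.
	line := `{"level":"warn","ts":"2024-07-29T08:13:02.441Z",` +
		`"caller":"webhook/validator.go:1156",` +
		`"msg":"Failed to validate at least one policy"}`

	s, err := summarize(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

Read this way, all four entries share the timestamp 08:13:02.441Z and the same Deployment UPDATE, which suggests they describe a single admission request that was canceled mid-validation.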

@hectorj2f would you please take a look? Sorry for pinging you directly.
I saw your comment on #952, and since the OP didn't provide any, I thought it would be okay to ping.

hasanhakkaev added the enhancement label Jul 29, 2024
hasanhakkaev (Author) commented

After going through Slack threads, this appears to be related to the webhook timeout.
I've increased it from 10s to 30s. The Slack threads: https://sigstore.slack.com/archives/C03096V09F1/p1657724579691239 and https://sigstore.slack.com/archives/C03096V09F1/p1695225677749929?thread_ts=1695217924.133089&cid=C03096V09F1

hasanhakkaev closed this as not planned Aug 2, 2024