stable 2.14.10 #12111
Commits on Feb 17, 2024
Don't remove a Namespace Index while it still contains AuthorizationPolicy (#12088)

In the inbound policy index, we maintain a policy index per namespace which holds various policy resources for that namespace. When a per-namespace index becomes empty, we remove it. However, we were not considering authorization policy resources when determining if the index is empty. This could result in the index being removed even while it contained authorization policy resources, as long as all other resource types did not exist. This can lead to incorrect inbound policy responses when the per-namespace index is recreated, since it will no longer contain the authorization policy.

We update the `is_empty()` function to properly consider authorization policies as well. We also add some generally useful logging at debug and trace level.

Signed-off-by: Alex Leong <[email protected]>
Co-authored-by: Oliver Gould <[email protected]>
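The bug described above is an emptiness check that omits one resource type. As a rough illustration only (the actual fix lives in Linkerd's Rust policy controller; every type and field name below is hypothetical), here is a Go sketch of the buggy check next to the corrected one:

```go
package main

import "fmt"

// nsIndex is a hypothetical, simplified stand-in for the per-namespace
// policy index described in the commit message. Field names are illustrative.
type nsIndex struct {
	servers       map[string]struct{}
	serverAuthzs  map[string]struct{}
	authzPolicies map[string]struct{} // the resource type the buggy check ignored
}

// isEmptyBuggy mirrors the bug: authorization policies are not considered,
// so an index holding only AuthorizationPolicy resources looks empty
// and gets removed.
func (ix *nsIndex) isEmptyBuggy() bool {
	return len(ix.servers) == 0 && len(ix.serverAuthzs) == 0
}

// isEmptyFixed considers every resource type before declaring the index empty.
func (ix *nsIndex) isEmptyFixed() bool {
	return len(ix.servers) == 0 &&
		len(ix.serverAuthzs) == 0 &&
		len(ix.authzPolicies) == 0
}

func main() {
	ix := &nsIndex{authzPolicies: map[string]struct{}{"authz-1": {}}}
	fmt.Println(ix.isEmptyBuggy()) // true: the index would be wrongly removed
	fmt.Println(ix.isEmptyFixed()) // false: the index is retained
}
```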
Commit SHA: `2c763fa`
Fix race condition in Destination's endpoints watcher (#12022)
Fixes #12010

## Problem

We're observing crashes in the destination controller in some scenarios, due to a data race as described in #12010.

## Cause

The problem is that the same instance of the `AddressSet.Addresses` map is mutated in the endpoints watcher Server [informer handler](https://github.com/linkerd/linkerd2/blob/edge-24.1.3/controller/api/destination/watcher/endpoints_watcher.go#L1309) and iterated over in the endpoint translator [queue loop](https://github.com/linkerd/linkerd2/blob/edge-24.1.3/controller/api/destination/endpoint_translator.go#L197-L211), which run in different goroutines, and the map is not guarded. I believe this doesn't result in Destination returning stale data; it's more of a correctness issue.

## Solution

Make a shallow copy of `pp.addresses` in the endpoints watcher and pass only that to the listeners. It's a shallow copy because we avoid making copies of the pod reference in there, knowing it won't get mutated.

## Repro

Install linkerd core and injected emojivoto, then patch the endpoint translator to include a sleep call that will help surface the race (don't install the patch in the cluster; we'll only use it locally below):

<details>
<summary>endpoint_translator.go diff</summary>

```diff
diff --git a/controller/api/destination/endpoint_translator.go b/controller/api/destination/endpoint_translator.go
index d1018d5f9..7d5abd638 100644
--- a/controller/api/destination/endpoint_translator.go
+++ b/controller/api/destination/endpoint_translator.go
@@ -5,6 +5,7 @@ import (
 	"reflect"
 	"strconv"
 	"strings"
+	"time"

 	pb "github.com/linkerd/linkerd2-proxy-api/go/destination"
 	"github.com/linkerd/linkerd2-proxy-api/go/net"
@@ -195,7 +196,9 @@ func (et *endpointTranslator) processUpdate(update interface{}) {
 }

 func (et *endpointTranslator) add(set watcher.AddressSet) {
 	for id, address := range set.Addresses {
+		time.Sleep(1 * time.Second)
 		et.availableEndpoints.Addresses[id] = address
 	}
```

</details>

Then create these two Server manifests:

<details>
<summary>emoji-web-server.yml</summary>

```yaml
apiVersion: policy.linkerd.io/v1beta2
kind: Server
metadata:
  namespace: emojivoto
  name: web-http
  labels:
    app.kubernetes.io/part-of: emojivoto
    app.kubernetes.io/name: web
    app.kubernetes.io/version: v11
spec:
  podSelector:
    matchLabels:
      app: web-svc
  port: http
  proxyProtocol: HTTP/1
```

</details>

<details>
<summary>emoji-web-server-opaque.yml</summary>

```yaml
apiVersion: policy.linkerd.io/v1beta2
kind: Server
metadata:
  namespace: emojivoto
  name: web-http
  labels:
    app.kubernetes.io/part-of: emojivoto
    app.kubernetes.io/name: web
    app.kubernetes.io/version: v11
spec:
  podSelector:
    matchLabels:
      app: web-svc
  port: http
  proxyProtocol: opaque
```

</details>

In separate consoles, run the patched destination service and a destination client:

```bash
HOSTNAME=foobar go run -race ./controller/cmd/main.go destination -enable-h2-upgrade=true -enable-endpoint-slices=true -cluster-domain=cluster.local -identity-trust-domain=cluster.local -default-opaque-ports=25,587,3306,4444,5432,6379,9300,11211
```

```bash
go run ./controller/script/destination-client -path web-svc.emojivoto.svc.cluster.local:80
```

And run this to continuously switch the `proxyProtocol` field:

```bash
while true; do kubectl apply -f ~/src/k8s/sample_yamls/emoji-web-server.yml; kubectl apply -f ~/src/k8s/sample_yamls/emoji-web-server-opaque.yml; done
```

You'll see the following data race report in the Destination controller logs:

<details>
<summary>destination logs</summary>

```console
==================
WARNING: DATA RACE
Write at 0x00c0006d30e0 by goroutine 178:
  github.com/linkerd/linkerd2/controller/api/destination/watcher.(*portPublisher).updateServer()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/watcher/endpoints_watcher.go:1310 +0x772
  github.com/linkerd/linkerd2/controller/api/destination/watcher.(*servicePublisher).updateServer()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/watcher/endpoints_watcher.go:711 +0x150
  github.com/linkerd/linkerd2/controller/api/destination/watcher.(*EndpointsWatcher).addServer()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/watcher/endpoints_watcher.go:514 +0x173
  github.com/linkerd/linkerd2/controller/api/destination/watcher.(*EndpointsWatcher).updateServer()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/watcher/endpoints_watcher.go:528 +0x26f
  github.com/linkerd/linkerd2/controller/api/destination/watcher.(*EndpointsWatcher).updateServer-fm()
      <autogenerated>:1 +0x64
  k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:246 +0x81
  k8s.io/client-go/tools/cache.(*ResourceEventHandlerFuncs).OnUpdate()
      <autogenerated>:1 +0x1f
  k8s.io/client-go/tools/cache.(*processorListener).run.func1()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:970 +0x1f4
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x41
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xbe
  k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x10a
  k8s.io/apimachinery/pkg/util/wait.Until()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:161 +0x9b
  k8s.io/client-go/tools/cache.(*processorListener).run()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:966 +0x38
  k8s.io/client-go/tools/cache.(*processorListener).run-fm()
      <autogenerated>:1 +0x33
  k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
      /home/alpeb/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:72 +0x86

Previous read at 0x00c0006d30e0 by goroutine 360:
  github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).add()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/endpoint_translator.go:200 +0x1ab
  github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).processUpdate()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/endpoint_translator.go:190 +0x166
  github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Start.func1()
      /home/alpeb/pr/destination-race/linkerd2/controller/api/destination/endpoint_translator.go:174 +0x45
```

</details>

## Extras

This also removes the unused method `func (as *AddressSet) WithPort(port Port) AddressSet` in endpoints_watcher.go.
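The fix described under "Solution" is a plain shallow copy of the map before handing it to listeners. A minimal Go sketch of that technique, using simplified stand-ins for the watcher types (names are illustrative, not the real linkerd2 definitions):

```go
package main

import "fmt"

// Address and AddressSet are simplified stand-ins for the types in
// controller/api/destination/watcher/endpoints_watcher.go.
type Address struct {
	IP  string
	Pod *string // pointer shared on purpose: the fix copies the map, not the pods
}

type AddressSet struct {
	Addresses map[string]Address
}

// shallowCopy returns a new map, so listeners iterating over the copy never
// observe concurrent mutation of the watcher's own map. The Address values
// (and the pod pointers inside them) are shared, matching the shallow-copy
// approach described in the commit message.
func shallowCopy(set AddressSet) AddressSet {
	out := AddressSet{Addresses: make(map[string]Address, len(set.Addresses))}
	for id, addr := range set.Addresses {
		out.Addresses[id] = addr
	}
	return out
}

func main() {
	pod := "web-svc-1234"
	watcherSet := AddressSet{Addresses: map[string]Address{
		"10.0.0.1:8080": {IP: "10.0.0.1", Pod: &pod},
	}}
	snapshot := shallowCopy(watcherSet)

	// Mutating the original map (as the informer handler does) no longer
	// affects the snapshot handed to listeners.
	delete(watcherSet.Addresses, "10.0.0.1:8080")
	fmt.Println(len(watcherSet.Addresses), len(snapshot.Addresses)) // 0 1
}
```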
Commit SHA: `53b5702`
Reflect changes to server selector in opaqueness (#12031)
Fixes #11995 If a Server is marking a Pod's port as opaque and then the Server's podSelector is updated to no longer select that Pod, then the Pod's port should no longer be marked as opaque. However, this update does not result in any updates from the destination API's Get stream and the port remains marked as opaque. We fix this by updating the endpoint watcher's handling of Server updates to consider both the old and the new Server. Signed-off-by: Alex Leong <[email protected]>
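The fix hinges on the update handler considering both the old and the new version of the Server. A hedged Go sketch of that decision (label selectors are modelled as plain label maps; the real code uses Kubernetes label selectors, and all names here are illustrative):

```go
package main

import "fmt"

// selects reports whether a Server's podSelector (modelled as a plain label
// map) matches a pod's labels. An empty selector matches everything, as in
// Kubernetes.
func selects(selector, podLabels map[string]string) bool {
	for k, v := range selector {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

// onServerUpdate sketches the fix: a pod's opaqueness must be recomputed when
// the pod is selected by either the old or the new version of the Server.
// Considering only the new version misses pods the Server stopped selecting,
// which is exactly the stale-opaqueness bug described above.
func onServerUpdate(oldSel, newSel, podLabels map[string]string) (recompute bool) {
	return selects(oldSel, podLabels) || selects(newSel, podLabels)
}

func main() {
	pod := map[string]string{"app": "web-svc"}
	oldSel := map[string]string{"app": "web-svc"} // previously selected the pod
	newSel := map[string]string{"app": "emoji"}   // no longer selects it
	fmt.Println(onServerUpdate(oldSel, newSel, pod)) // true: opaqueness must be cleared
}
```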
Commit SHA: `5fbca8a`
Return INVALID_ARGUMENT status codes properly from GetProfile (#11980)
When the destination controller receives a GetProfile request for an ExternalName service, it should return gRPC status code INVALID_ARGUMENT to signal to the proxy that ExternalName services do not have endpoints and service discovery should not be used. However, the destination controller is returning gRPC status code UNKNOWN instead. This causes the proxy to retry these requests, resulting in a flurry of warning logs in the destination controller. We fix the destination controller to properly return INVALID_ARGUMENT instead of UNKNOWN for ExternalName services. Signed-off-by: Alex Leong <[email protected]>
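The real controller returns gRPC statuses via `google.golang.org/grpc/codes` and `google.golang.org/grpc/status`; the stdlib-only Go sketch below merely models the intended mapping (ExternalName → INVALID_ARGUMENT instead of UNKNOWN), with all type and function names invented for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// code models a gRPC status code; the numeric values match the gRPC spec
// (Unknown = 2, InvalidArgument = 3).
type code int

const (
	codeUnknown         code = 2
	codeInvalidArgument code = 3
)

type statusError struct {
	code code
	msg  string
}

func (e *statusError) Error() string {
	return fmt.Sprintf("rpc error: code = %d desc = %s", e.code, e.msg)
}

// getProfile sketches the fix: for ExternalName services the controller must
// return InvalidArgument, which signals the proxy that the service has no
// endpoints and should not be retried, rather than Unknown, which the proxy
// retries (producing the flurry of warning logs described above).
func getProfile(svcType string) error {
	if svcType == "ExternalName" {
		return &statusError{codeInvalidArgument, "ExternalName services do not have endpoints"}
	}
	return nil
}

func main() {
	err := getProfile("ExternalName")
	var st *statusError
	if errors.As(err, &st) {
		fmt.Println(st.code == codeInvalidArgument) // true
	}
}
```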
Commit SHA: `2779d11`
Templating prometheus url into cronjob heartbeat (#11343)
An external Prometheus could not be used with the heartbeat. This change adds a new value, `.prometheusUrl`, and uses it in the heartbeat cronjob of the linkerd-control-plane chart. Fixes #11342 Signed-off-by: David ALEXANDRE <[email protected]>
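The value name `.prometheusUrl` comes from the commit above; the snippet below is only a sketch of how such a values entry might look for the linkerd-control-plane chart (the URL is a placeholder, not taken from the chart's defaults):

```yaml
# values.yaml (sketch): point the heartbeat cronjob at an external Prometheus
# instead of the in-cluster linkerd-viz one.
prometheusUrl: "http://prometheus.monitoring.svc.cluster.local:9090"
```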
Commit SHA: `22a59ba`
Helm linkerd-control-plane: allow custom labels on Pod Monitor (#11222)
We need to be able to set labels on Pod Monitors, so this adds a `labels` section to the podMonitor Helm template. Checked with Helm lint and Helm template. Fixes #11175 Signed-off-by: Justin S <[email protected]>
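As a sketch of the feature (the exact key layout below is illustrative, not copied from the chart), the new `labels` section would let an external Prometheus Operator select the generated PodMonitors:

```yaml
# values.yaml (sketch): extra labels attached to the generated PodMonitor
# resources; a Prometheus Operator instance can then match on them.
podMonitor:
  enabled: true
  labels:
    release: prometheus
```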
Commit SHA: `6d49f3b`
Commit SHA: `e35896d`
Commits on Feb 19, 2024
Only process server updates on workloads affected by the server (#12017)

When the destination controller's workload watcher receives an update for any Server resource, it recomputes opaqueness for every workload, because the Server update may have changed opaqueness for that workload. However, this is very CPU intensive for the destination controller, especially during resyncs, when we get Server updates for every Server resource in the cluster.

Instead, we only need to recompute opaqueness for workloads that are selected by the old version of the Server or by the new version of the Server. If a workload is not selected by either the new or old version of the Server, then the Server update cannot have changed the workload's opaqueness.

Signed-off-by: Alex Leong <[email protected]>
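The optimization above amounts to filtering the workload set by the union of the old and new selectors before reprocessing. A small Go sketch of that filter (selectors modelled as plain label maps; every name is illustrative, not the watcher's real API):

```go
package main

import "fmt"

// workload is a minimal stand-in for the pods/workloads tracked by the
// destination controller's workload watcher.
type workload struct {
	name   string
	labels map[string]string
}

// selects reports whether a selector (a plain label map here) matches labels.
func selects(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// affectedWorkloads returns only the workloads selected by the old or the new
// version of the Server; everything else can skip the costly opaqueness
// recomputation, since the update cannot have changed their opaqueness.
func affectedWorkloads(oldSel, newSel map[string]string, all []workload) []workload {
	var out []workload
	for _, w := range all {
		if selects(oldSel, w.labels) || selects(newSel, w.labels) {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	all := []workload{
		{"web", map[string]string{"app": "web"}},
		{"emoji", map[string]string{"app": "emoji"}},
		{"voting", map[string]string{"app": "voting"}},
	}
	oldSel := map[string]string{"app": "web"}   // Server used to select "web"
	newSel := map[string]string{"app": "emoji"} // and now selects "emoji"
	for _, w := range affectedWorkloads(oldSel, newSel, all) {
		fmt.Println(w.name) // web, then emoji; "voting" is skipped entirely
	}
}
```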
Commit SHA: `09adb81`
Commit SHA: `7867e55`
Commit SHA: `c23b490`
Commit SHA: `e4b7416`
Commit SHA: `41628bb`