-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(diagnostics) add diff endpoints #6131
base: main
Are you sure you want to change the base?
Conversation
Moving into 3.3 as a stretch. IIRC this should be 50% done, just dropped from 3.2 between time constraints and low priority. |
02c432c
to
a5e9832
Compare
Rename the diagnostics.ConfigDump struct to ClientDiagnostic to reflect that it now handles multiple types of diagnostic data (config dumps, config diffs, and fallback status) from the Kong client.
Add a struct to hold diff history in the diagnostic server. It holds a fixed number of diffs (5, arbitrarily) and provides functions to update and view history by config hash. Add a diagnostics function to generate diagnostic diff structs from GDR event data. Add a diagnostics update loop case to handle inbound diffs. Call the diagnostic diff generator when processing an event and append the diagnostic diff to the diff list to the DB mode event handler.
Dump sensitive config during integration, mostly to support debugging the diff endpoint. While there's no diff diagnostic integration test, having it available allows for easier interactive debugging of the endpoint, or viewing diff states when running DB-backed tests.
Add timestamps and a list of available diffs to the diff endpoint. Add unit tests for the diff endpoint.
Remove scaffolding comments and unfinished wishlist items. Add missing doc comments. Clean up some garbage the linter caught.
Fix one bug, find another: This has somehow broken the envtest for DB mode error event generation. That test should normally function by creating a consumer and tossing it at a mock admin API that always returns an error, which should in turn make GDR bubble up an error event through the result channel, which KIC will then turn into a Kubernetes Event. Looking at the event handler only shows yet more of the wonderful 👻 events described in Kong/go-database-reconciler#120, though unfortunately I can't get the debugger to stop anywhere in the GDR portion, even at the low level of setting a watchpoint on the channel buffer itself. IDK what the heck's going there, since that's not something the envtest mock stuff should skip. Tried to pin down the problem to either this or the GDR change:
so whatever causes it appears to be in this PR's new code, but IDK what. The new skip phantom events case isn't on the error handling side and shouldn't affect it, and |
What this PR does / why we need it:
Adds a diff endpoint to the diagnostic server and sets up infrastructure for DB mode updates to ship their diffs to the diagnostic server. This endpoint is available when sensitive config dumps are enabled, as it includes diffs of sensitive resources. We could add filters to exclude certs/passwords/etc. while retaining other items, but for the initial MVP, easiest to just require an opt in to sensitive diagnostic output.
Which issue this PR fixes:
Fix #5289
Special notes for your reviewer:
This requires Kong/go-database-reconciler#119. My original work in GDR was only sending error events; it should have been sending everything. This currently has a tmp commit against to update the module to the branch. That GDR PR needs to merged and tagged as fix release before merging this.
Per Kong/go-database-reconciler#120 I observed an odd bug with mysterious empty events 👻 Dunno where these are from, but trying to figure that out will take more time than is available. As a workaround, the event handler on the KIC side discards any event with no action, as such events should never be of interest.
Manual testing via running
TEST_DATABASE_MODE=postgres KONG_TEST_CLUSTER=kind:kind GOFLAGS= dlv test -- -test.run SomeTestThatModifiesConfig
. Almost any test will do, since they almost all modify config. You can either set up awatch curl -sv localhost:10256/debug/config/diff-report
to see it over time or debugger pause after the diff send to see an individual diff. Output looks something like example.txt.Unit testing isn't exhaustive, but covers most of the endpoint functionality. It checks the KIC-side behavior, not content, with the expectation that if content's broken, that's GDR's problem.
PR Readiness Checklist:
Complete these before marking the PR as
ready to review
:CHANGELOG.md
release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR