Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(adminapi): support specification of http/https via appProtocol on the admin port #5251

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

backjo
Copy link
Contributor

@backjo backjo commented Nov 29, 2023

What this PR does / why we need it:
This PR allows for users to indicate that HTTP may be used to communicate with the Admin API instead of HTTPS. While it is generally a best practice to use HTTPS, some clients may be leveraging Service Meshes like Kuma or Istio to manage transparent mutual TLS connections between workloads (as seen in #4698).

This PR allows the operator to indicate via the admin port appProtocol that they wish to use HTTP.

Which issue this PR fixes:

Related to #4698
Special notes for your reviewer:

PR Readiness Checklist:

Complete these before marking the PR as ready to review:

  • the CHANGELOG.md release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR

@backjo backjo marked this pull request as ready for review November 29, 2023 08:54
@backjo backjo requested a review from a team as a code owner November 29, 2023 08:54
Copy link
Member

@mlavacca mlavacca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR, @backjo!

Can you explain what's the use case for this change, since #4942 already addressed this issue by excluding admin API traffic from the mesh?

@czeslavo
Copy link
Contributor

It seems to me like a good alternative. As I understand the benefit is that users using any mesh will be able to use all mesh features with that in place in comms between KIC and Kong, which is not possible if we exclude Admin API from meshes with the use of annotations.

BTW if we'd like to merge this, we'd have to make it possible to not add the excluding annotations in Helm Deployments (I believe they're hardcoded now).

cc @rainest as the author of #4942.

@backjo
Copy link
Contributor Author

backjo commented Nov 29, 2023

Yeah, it's along the lines of what @czeslavo is saying. As an operator, I would like to leverage my mesh for the various benefits - mutual TLS, observability etc - that it provides. #4942 allows Gateway Discovery to work in a mesh environment, but it does that by just excluding the traffic from the mesh. This PR would allow me as an operator to have gateway discovery work AND allow me to keep the traffic inside of the mesh.

@mlavacca
Copy link
Member

Yeah, it's along the lines of what @czeslavo is saying. As an operator, I would like to leverage my mesh for the various benefits - mutual TLS, observability etc - that it provides. #4942 allows Gateway Discovery to work in a mesh environment, but it does that by just excluding the traffic from the mesh. This PR would allow me as an operator to have gateway discovery work AND allow me to keep the traffic inside of the mesh.

Ok, I think it makes sense. Will defer to @rainest as he is the original author of #4942.

@rainest
Copy link
Contributor

rainest commented Nov 30, 2023

tl;dr I don't think we should handle this by allowing HTTP; we should figure out what we can disable to make HTTPS viable instead.

AFAIK the only reason for exclusion is because we do our own mTLS and the client certificate can't traverse the hops. Although meshes provide their own methods we can use for validation, they're not guaranteed consistent across all meshes.

Unfortunately our research in #4698 didn't dig too deep into the connection to confirm that was indeed the issue, and there's some oddness in the report, namely that it doesn't actually enable mTLS AFAICT (the ingress chart does not set admin.tls.client or the ingressController.adminApi.tls settings by default).

HTTP effectively removes that concern, but is still unfortunately HTTP and removes everything else that HTTPS provides, namely the ability to verify upstream. I think we'd want to instead figure out a way to get HTTPS to work, even if that means disabling parts of it without full verification--although the meshes (at least Kuma, and presumably other Envoy-based ones) can effectively guarantee that only they can populate their client cert info header, we'd unfortunately need to verify it gateway-side, and AFAIK have no way to do this.

@czeslavo
Copy link
Contributor

Wouldn't it make sense to offload mTLS to the Mesh if a user already uses that? As I understand, both Kuma and Istio allow configuring mTLS which would be a preferable way to authenticate between services in such an environment, allowing not to worry about another custom layer of authentication (which our mTLS kinda is).

Although meshes provide their own methods we can use for validation, they're not guaranteed consistent across all meshes.

Is that an issue? If an operator decides to implement mTLS on their own (by any means, Mesh in our case), why not allow them to do so by consciously committing to plain-text communication? We would not distribute this config by default in charts, but we may document that's possible, with a note for users to make sure they know what they're doing.

@backjo
Copy link
Contributor Author

backjo commented Dec 1, 2023

Generally a +1 to @czeslavo's comment - as an operator our preferred way to authenticate workloads is via the mesh that we are using.

I think the issue in #4698 was indeed around double TLS (notably, not double mTLS). We had the same problem with Istio and GD - a lot of meshes expect that traffic coming into the sidecar is in plaintext and that the mesh will be the one responsible for the TLS component (https://istio.io/latest/docs/ops/common-problems/network-issues/#double-tls).

Is that an issue? If an operator decides to implement mTLS on their own (by any means, Mesh in our case), why not allow them to do so by consciously committing to plain-text communication? We would not distribute this config by default in charts, but we may document that's possible, with a note for users to make sure they know what they're doing.

This exactly. As an operator, I understand that enabling plain text generically is a bad practice, but I would be turning plaintext on knowing that my infrastructure is already handling the necessary security (which may even be superior security-wise to the built in (m)TLS support as issuing/rotation/revocation is much more automated in most service meshes than it would be for the controller specifically)

internal/adminapi/endpoints.go Outdated Show resolved Hide resolved
internal/adminapi/endpoints.go Outdated Show resolved Hide resolved
@rainest
Copy link
Contributor

rainest commented Dec 7, 2023

#4698 as originally presented did not actually result in any issues. We wanted to handle controller-initiated mTLS in mesh environments (because we want that to eventually be the default), but that's separate from the admin API and discovery client using HTTPS. The original repro steps unfortunately appear incorrect, and I'm not sure what was missed there.

@backjo did you have a set of configurations that did indeed not work if you removed the exemptions? The example in https://istio.io/latest/docs/ops/common-problems/network-issues/#double-tls looks like it would cause issues because the endpoint in question (httpbin.org) is external, and wouldn't have its own corresponding mesh sidecar to decrypt the mesh-level TLS. If both sides of the connection have a mesh sidecar, that should be fine.

HTTP-level metrics would probably be unavailable--it's not clear if Istio has a client-side HTTPS proxy within the mesh that can terminate, encrypt between sidecars, and then encrypt again to the destination (searches re termination are mostly bringing up docs related to its external gateway functionality).

@backjo
Copy link
Contributor Author

backjo commented Dec 11, 2023

@rainest checking here, will circle back

@rainest
Copy link
Contributor

rainest commented Jan 8, 2024

@backjo hey again, checking back on this, had you heard anything from your end?

@backjo
Copy link
Contributor Author

backjo commented Jan 8, 2024

@rainest checked quite a few ways with our mesh configuration - ultimately, we have not been able to get the mesh authorization working with double TLS. I think the best path forward here would still be to allow users the flexibility of using HTTP if they opt-in, given that telemetry benefits from a service mesh would come along with that for free as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants