Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: make PodTemplateSpec more robust and less error prone #255

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

pmalek
Copy link
Member

@pmalek pmalek commented May 10, 2024

What this PR does / why we need it:

This PR makes strategic merge patch more robust and less error prone by preventing several problematic use cases:

  • users needing to restate the base manifest (generated by KGO) which is not known to them before hand.
    With this PR, this won't be necessary anymore as. Users won't need to provide elements form base manifests.

    So this:

    apiVersion: gateway-operator.konghq.com/v1beta1
    kind: DataPlane
    metadata:
      name: konnect-example
    spec:
      deployment:
        podTemplateSpec:
          spec:
            containers:
            - name: proxy
              image: kong/kong-gateway:3.6
              env:
                - name: KONG_ROLE
                  value: data_plane
                ...
              volumeMounts:
              # We need to specify the cluster-certificate volume-mount because otherwise
              # strategic merge patch would merge its entry with provided
              # konnect-client-tls volume mount.
              - name: cluster-certificate
                mountPath: /var/cluster-certificate
              - name: konnect-client-tls
                mountPath: /etc/secrets/kong-cluster-cert/
                readOnly: true
            volumes:
            - name: cluster-certificate
            - name: konnect-client-tls
              secret:
                secretName: konnect-client-tls
    

    becomes:

    apiVersion: gateway-operator.konghq.com/v1beta1
    kind: DataPlane
    metadata:
      name: konnect-example
    spec:
      deployment:
        podTemplateSpec:
          spec:
            containers:
            - name: proxy
              image: kong/kong-gateway:3.6
              env:
                - name: KONG_ROLE
                  value: data_plane
                ...
              volumeMounts:
              - name: konnect-client-tls
                mountPath: /etc/secrets/kong-cluster-cert/
                readOnly: true
            volumes:
            - name: konnect-client-tls
              secret:
                secretName: konnect-client-tls
    
  • using Go API users could experience merging mutually exclusive fields e.g. when setting env vars:

    {
    Name: "add envs with fieldRef",
    Patch: &corev1.PodTemplateSpec{
    Spec: corev1.PodSpec{
    Containers: []corev1.Container{
    {
    Name: consts.ControlPlaneControllerContainerName,
    Env: []corev1.EnvVar{
    {
    Name: "LIMIT",
    ValueFrom: &corev1.EnvVarSource{
    ResourceFieldRef: &corev1.ResourceFieldSelector{
    Resource: "limits.cpu",
    },
    },
    },
    },
    },
    },
    },
    },
    Expected: func() corev1.PodTemplateSpec {
    d, err := makeControlPlaneDeployment()
    require.NoError(t, err)
    d.Spec.Template.Spec.Containers[0].Env = []corev1.EnvVar{
    {
    Name: "LIMIT",
    // NOTE: this is an artifact of the strategic merge patch at work.
    // This values comes from the first entry in the base patch.
    // In order to overcome this we need to re-state the base values
    // in the slice.
    Value: "VALUE1",
    ValueFrom: &corev1.EnvVarSource{
    ResourceFieldRef: &corev1.ResourceFieldSelector{
    Resource: "limits.cpu",
    Divisor: *resource.NewQuantity(1, resource.DecimalSI),
    },
    },
    },
    {
    Name: "ENV1",
    Value: "VALUE1",
    },
    {
    Name: "ENV2",
    Value: "VALUE2",
    },
    {
    Name: "ENV3",
    Value: "VALUE3",
    },
    }
    return d.Spec.Template
    },
    },

    This PR fixes this as well. Users are not required anymore to restate the elements from the base manifest array as in e.g.
    {
    Name: "add envs with fieldRef re-stating the base",
    Patch: &corev1.PodTemplateSpec{
    Spec: corev1.PodSpec{
    Containers: []corev1.Container{
    {
    Name: consts.ControlPlaneControllerContainerName,
    Env: []corev1.EnvVar{
    {
    Name: "ENV1",
    Value: "VALUE1",
    },
    {
    Name: "ENV2",
    Value: "VALUE2",
    },
    {
    Name: "ENV3",
    Value: "VALUE3",
    },
    // We're moving the `LIMIT` entry from the 1st to the 4th place in the slice to make sure it will not end up with both `Value` and `ValueFrom` fields.
    {
    Name: "LIMIT",
    ValueFrom: &corev1.EnvVarSource{
    ResourceFieldRef: &corev1.ResourceFieldSelector{
    Resource: "limits.cpu",
    },
    },
    },
    },
    },
    },
    },
    },
    Expected: func() corev1.PodTemplateSpec {
    d, err := makeControlPlaneDeployment()
    require.NoError(t, err)
    d.Spec.Template.Spec.Containers[0].Env = []corev1.EnvVar{
    {
    Name: "ENV1",
    Value: "VALUE1",
    },
    {
    Name: "ENV2",
    Value: "VALUE2",
    },
    {
    Name: "ENV3",
    Value: "VALUE3",
    },
    {
    Name: "LIMIT",
    ValueFrom: &corev1.EnvVarSource{
    ResourceFieldRef: &corev1.ResourceFieldSelector{
    Resource: "limits.cpu",
    Divisor: *resource.NewQuantity(1, resource.DecimalSI),
    },
    },
    },
    }
    return d.Spec.Template
    },

  • this PR additionally catches previously uncaught errors like setting not fully spec's readinessProbe on a sidecar:

    readinessProbe:
    initialDelaySeconds: 1
    periodSeconds: 1
    .
    This now properly yields an error and is not silenced:

    2024-05-10T15:00:17.681+0200 - ERROR - Reconciler error - {"controller": "dataplane", "controllerGroup": "gateway-operator.konghq.com", "controllerKind": "DataPlane", "DataPlane": {"name":"sidecar-example","namespace":"default"}, "namespace": "default", "name": "sidecar-example", "reconcileID": "ccf8917a-c101-479e-8b2b-c449fd51e743", "error": "could not build Deployment for DataPlane default/sidecar-example: failed creating Deployment for DataPlane sidecar-example: Deployment.apps \"dataplane-sidecar-example-4mxqs\" is invalid: spec.template.spec.containers[0].readinessProbe: Required value: must specify a handler type"}
    

Which issue this PR fixes

Related #128

Special notes for your reviewer:

Prior work: https://github.com/Kong/gateway-operator-archive/pull/1489/

PR Readiness Checklist:

Complete these before marking the PR as ready to review:

  • the CHANGELOG.md release notes have been updated to reflect significant changes

@pmalek pmalek self-assigned this May 10, 2024
@pmalek pmalek force-pushed the strategic-merge-patch-revisited branch from ed2d367 to 3ec82f6 Compare May 10, 2024 13:01
@lahabana lahabana added this to the KGO v1.4.x milestone May 14, 2024
Copy link
Member

@programmer04 programmer04 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about?

// TODO https://github.com/Kong/gateway-operator/issues/128
// This is a a workaround to avoid patches clobbering the wrong EnvVar. We want to find an improved patch mechanism
// that doesn't clobber EnvVars (and other array fields) it shouldn't.
existingEnvVars := desiredDeployment.Spec.Template.Spec.Containers[0].Env
desiredDeployment.Spec.Template.Spec.Containers[0].Env = []corev1.EnvVar{}
// apply user patches and set any default environment variables that aren't already set
desiredDeployment, err = applyDeploymentUserPatchesForDataPlane(dataplane, desiredDeployment)
if err != nil {
return nil, op.Noop, err
}
// apply default envvars and restore the hacked-out ones
desiredDeployment = applyEnvForDataPlane(existingEnvVars, desiredDeployment)

@pmalek pmalek modified the milestones: KGO v1.4.x, KGO v1.5.x Sep 26, 2024
@pmalek
Copy link
Member Author

pmalek commented Sep 26, 2024

Moving this out of 1.4 as we have other priorities to complete for that milestone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants