
karpenter addon (v1.0.2): has misconfigured CRDs, and installCRDs: false doesn't work #1078

Open
neoakris opened this issue Sep 20, 2024 · 0 comments
Labels
bug Something isn't working

Comments


neoakris commented Sep 20, 2024

Describe the bug

After seeing the v1.0.2 release of Karpenter's upstream Helm chart
https://github.com/aws/karpenter-provider-aws/releases

I tried the following, which failed and exposed two major bugs:

new blueprints.addons.EksPodIdentityAgentAddOn()
//^-- my karpenter config depends on this, so I also deployed it

new blueprints.addons.KarpenterAddOn({
    version: "1.0.2", //https://github.com/aws/karpenter-provider-aws/releases
    installCRDs: false, //temporarily needed for v1.0.2
    ec2NodeClassSpec: {
        amiFamily: "Bottlerocket",
        subnetSelectorTerms: [{ tags: { "Name": `${config.id}/${config.id}-vpc/PrivateSubnet*` } }],
        securityGroupSelectorTerms: [{ tags: { "aws:eks:cluster-name": `${config.id}` } }],
        detailedMonitoring: false,
        tags: config.tags,
    },
    nodePoolSpec: {
        requirements: [
            { key: 'topology.kubernetes.io/zone', operator: 'In', 
              values: [
                  `${config.vpc.availabilityZones[0]}`,
                  `${config.vpc.availabilityZones[1]}`,
                  `${config.vpc.availabilityZones[2]}`] },
            { key: 'kubernetes.io/arch', operator: 'In', values: ['amd64','arm64']},
            { key: 'karpenter.sh/capacity-type', operator: 'In', values: ['spot']}, //spot for lower-envs
        ],
        disruption: {           //WhenUnderutilized: more aggressive cost savings / slightly worse stability
            consolidationPolicy: "WhenUnderutilized", 
            //consolidateAfter: "30s", //<--not compatible with WhenUnderutilized
            expireAfter: "20m",
            budgets: [{nodes: "10%"}] 
        }
    },
    interruptionHandling: true,
    podIdentity: true,
    values: { //https://github.com/aws/karpenter-provider-aws/tree/main/charts/karpenter#values
        replicas: 1,
    }
})

Expected Behavior

  • karpenter to work
    • (and AWS to offer better support for products it founded: Karpenter and EKS Blueprints for CDK were both founded by AWS rather than by random members of the open source community. Karpenter has been at 1.0.0 for a while now; it's surprising that this is still an issue.)

Current Behavior

What I originally tried (shown above) resulted in the following error:

Error from server: error when creating "/tmp/manifest.yaml":
conversion webhook for karpenter.sh/v1beta1, Kind=NodePool failed:
Post "https://karpenter.kube-system.svc:8443/conversion/karpenter.sh?timeout=30s": service
"karpenter" not found.

To get CDK to at least deploy my EKS Blueprints based stack so I could debug further, I simplified the config to the following. After that it deployed, and I was able to investigate how it was broken:

new blueprints.addons.KarpenterAddOn({
    version: "1.0.2", //https://github.com/aws/karpenter-provider-aws/releases
    installCRDs: false, //temporarily needed for v1.0.2
    interruptionHandling: true,
    podIdentity: true,
    values: { //https://github.com/aws/karpenter-provider-aws/tree/main/charts/karpenter#values
        replicas: 1,
    }
})

There are two problems/bugs with the above:

  • 1st bug: it creates a misconfigured CRD
    • Notice the error message mentions karpenter.kube-system.svc.
      That tells me it's looking for Karpenter installed in the kube-system namespace, while this add-on installs Karpenter in the karpenter namespace.
    • I ran kubectl get crd ec2nodeclasses.karpenter.k8s.aws -o yaml
      and saw the following relevant snippet of YAML, which confirms the CRD is generated incorrectly:
      spec:
        conversion:
          strategy: Webhook
          webhook:
            clientConfig:
              caBundle: LS0tLS1CRUdJTiBDRVJU...
              service:
                name: karpenter
                namespace: kube-system
                path: /conversion/karpenter.k8s.aws
                port: 8443
  • 2nd bug: installCRDs: false was ignored
    • I wanted to work around the problem by telling the addon not to generate the broken CRD, so I could implement a fix manually, but this setting wasn't respected.
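
For anyone hitting the same error, one possible manual workaround (a sketch only, not verified against this addon: it assumes the chart really did install Karpenter's service as karpenter in the karpenter namespace, and that the conversion webhook stanza matches the snippet above) is to repoint the conversion webhook on each Karpenter CRD:

```shell
# Hypothetical workaround: repoint each Karpenter CRD's conversion webhook
# service from kube-system to the namespace the addon actually installed into.
# Assumes the service is named "karpenter" and lives in the "karpenter" namespace.
for crd in ec2nodeclasses.karpenter.k8s.aws nodepools.karpenter.sh nodeclaims.karpenter.sh; do
  kubectl patch crd "$crd" --type=json -p='[
    {"op": "replace",
     "path": "/spec/conversion/webhook/clientConfig/service/namespace",
     "value": "karpenter"}
  ]'
done
```

Note this would be undone on the next deploy if the addon re-applies the CRDs, which is exactly why a working installCRDs: false matters.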

Reproduction Steps

  1. Install an EKS Blueprints based cluster
  2. Install any dependency addons (like the pod identity agent)
    new blueprints.addons.EksPodIdentityAgentAddOn()
  3. Install the Karpenter addon with a config like this:
new blueprints.addons.KarpenterAddOn({
    version: "1.0.2", //https://github.com/aws/karpenter-provider-aws/releases
    installCRDs: false, //temporarily needed for v1.0.2
    interruptionHandling: true,
    podIdentity: true,
    values: { //https://github.com/aws/karpenter-provider-aws/tree/main/charts/karpenter#values
        replicas: 1,
    }
})

Possible Solution

This is a complicated issue and may need to be fixed upstream. I'd recommend fixing it in phases/stages.

It'd be great if a fix for installCRDs: false could be prioritized; I think that part is an EKS Blueprints specific bug that makes sense to fix in this repo (unless it's the upstream Helm chart that's installing the CRDs?).
If that part were prioritized, manual workarounds would be easier to implement.
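
To illustrate what "respecting installCRDs: false" could look like, here is a sketch under assumptions (the interface and function names are illustrative, not the addon's real internals): the option could be mapped onto Helm's --skip-crds behavior, which aws-cdk-lib's eks.HelmChart exposes as skipCrds.

```typescript
// Sketch only: a hypothetical mapping from the addon's installCRDs option
// to Helm's --skip-crds flag (exposed as `skipCrds` on aws-cdk-lib's
// eks.HelmChart construct). Names here are illustrative.
interface KarpenterAddOnProps {
  installCRDs?: boolean; // treated as true when undefined
}

function toHelmChartOptions(props: KarpenterAddOnProps): { skipCrds: boolean } {
  // skipCrds is simply the negation of installCRDs.
  return { skipCrds: props.installCRDs === false };
}
```

With that wiring, installCRDs: false would leave CRD management entirely to the user, making workarounds like a manually patched CRD possible.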

Additional Information/Context

Here are the upstream repos, if it helps:

CDK CLI Version

2.133.0 (build dcc1e75)

EKS Blueprints Version

1.15.1

Node.js Version

v20.17.0

Environment details (OS name and version, etc.)

Mac OS Sonoma 14.6.1

Other information

No response
