Feature: Natively Support IRSA on EKS Properly #200

Closed
callum-tait-pbx opened this issue Nov 17, 2020 · 5 comments · Fixed by #226

callum-tait-pbx (Contributor) commented Nov 17, 2020

This has come up in a few issues, so I thought it would be good to create an issue specifically for tracking support. Having this explicitly called out will also help people who want to run the project on EKS in the meantime.

I'm classing this as a feature rather than a bug, since there is a workaround and the project doesn't claim to have been created specifically for EKS.

Problem
EKS IAM Roles for Service Accounts (IRSA) is how AWS has implemented role-based authentication for its managed Kubernetes service, EKS. The implementation mutates pods automatically, injecting a volume, a volume mount and some environment variables. Because of this automagic, the controller ends up with a pod that doesn't match what it expects, so it gets stuck in a terminate-create loop.

Workaround
The workaround is to explicitly set the YAML that normally gets injected, so that the pod already matches what the controller expects and it doesn't freak out and get stuck in a terminate-create loop.
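
As a rough sketch (assuming the RunnerDeployment spec accepts the usual pod-level env, volumes and volumeMounts fields; the repository, service account name and role ARN below are placeholders), this amounts to declaring the normally injected values, shown under Expected Behaviour below, in the runner definition yourself:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
spec:
  template:
    spec:
      repository: my-org/my-repo
      serviceAccountName: runner   # a service account annotated with eks.amazonaws.com/role-arn
      env:
      - name: AWS_ROLE_ARN
        value: arn:aws:iam::xxxxxx:role/xxxxx
      - name: AWS_WEB_IDENTITY_TOKEN_FILE
        value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      volumeMounts:
      - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
        name: aws-iam-token
        readOnly: true
      volumes:
      - name: aws-iam-token
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token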

Expected Behaviour
Only IRSA needs to be configured (https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html); the values should get injected into the runner definitions automatically, and the controller should not freak out as a result.

Normally injected YAML once IRSA is configured on your cluster:

volumes:
- name: aws-iam-token
  projected:
    defaultMode: 420
    sources:
    - serviceAccountToken:
        audience: sts.amazonaws.com
        expirationSeconds: 86400
        path: token
volumeMounts:
- mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
  name: aws-iam-token
  readOnly: true
env:
- name: AWS_ROLE_ARN
  value: arn:aws:iam::xxxxxx:role/xxxxx
- name: AWS_WEB_IDENTITY_TOKEN_FILE
  value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token

Existing Issues That Reference this Bug/Feature
#16
#37

mumoshu (Collaborator) commented Nov 19, 2020

@callum-tait-pbx Thank you so much for writing this up 🙏

I wasn't really sure what was happening in the mentioned issues until now. But after reading this, I think it might be due to how we mark a runner pod to be restarted:

https://github.com/summerwind/actions-runner-controller/blob/master/controllers/runner_controller.go#L201-L203

Perhaps we'd need to use some other criterion for marking runner pods to be restarted, so that it won't interfere with mutating webhooks: perhaps inject a pod spec hash into the runner pod annotations and compare that against the desired pod spec hash instead?
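
To illustrate the idea with a purely hypothetical sketch (the annotation key and hash value here are made up, not something the controller sets today): the controller would record a hash of the runner pod spec it generated as an annotation at creation time, then on each reconcile recompute the hash from the current desired spec and only recreate the pod when the two differ, so fields injected afterwards by the IRSA webhook never enter the comparison. Something like:

apiVersion: v1
kind: Pod
metadata:
  name: example-runner
  annotations:
    # hypothetical annotation holding the hash of the controller-generated spec
    actions-runner/runner-spec-hash: "9f86d081884c7d65"
spec:
  containers:
  - name: runner
    image: summerwind/actions-runner:latest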

callum-tait-pbx (Contributor, Author) commented

Apologies for the late response; busy times at the moment! That would make a lot of sense to me.

perhaps inject a pod spec hash into the runner pod annotations and compare that against the desired pod spec hash instead?

I quite like the idea.

mumoshu self-assigned this Nov 30, 2020
mumoshu (Collaborator) commented Nov 30, 2020

Thanks for confirming! I'm going to work on this in the next few days.

mumoshu pushed a commit that referenced this issue Dec 7, 2020
One of the pod recreation conditions has been modified to use a hash of the runner spec, so that the controller does not keep restarting pods mutated by admission webhooks. This naturally allows us, for example, to use IRSA for EKS, which requires its admission webhook to mutate the runner pod with additional, IRSA-related volumes, volume mounts and env vars.

Resolves #200
mumoshu (Collaborator) commented Dec 7, 2020

@callum-tait-pbx I'm working on this at #226

mumoshu pushed a commit that referenced this issue Dec 8, 2020
One of the pod recreation conditions has been modified to use a hash of the runner spec, so that the controller does not keep restarting pods mutated by admission webhooks. This naturally allows us, for example, to use IRSA for EKS, which requires its admission webhook to mutate the runner pod with additional, IRSA-related volumes, volume mounts and env vars.

Resolves #200
mumoshu added a commit that referenced this issue Dec 8, 2020
One of the pod recreation conditions has been modified to use a hash of the runner spec, so that the controller does not keep restarting pods mutated by admission webhooks. This naturally allows us, for example, to use IRSA for EKS, which requires its admission webhook to mutate the runner pod with additional, IRSA-related volumes, volume mounts and env vars.

Resolves #200
mumoshu (Collaborator) commented Dec 8, 2020

@callum-tait-pbx Could you try v0.15.0? There's also a short piece of documentation about this feature at https://github.com/summerwind/actions-runner-controller#using-eks-iam-role-for-service-accounts for your reference.
