RBAC Access denied from Pipeline Run pod #2794
Comments
Not sure if related, but am looking at the
Playing around a bit with kubectl auth can-i:
kubectl auth can-i \
create \
workflowtaskresults.argoproj.io \
-n kubeflow-user-example-com \
--as system:serviceaccount:kubeflow-user-example-com:default-editor
# no
kubectl auth can-i \
create \
workflows.argoproj.io \
-n kubeflow-user-example-com \
--as system:serviceaccount:kubeflow-user-example-com:default-editor
# yes |
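The two checks above differ only in the resource name, so they are easy to script when more resources need to be probed. A minimal Python sketch, assuming `kubectl` is on the PATH and already pointed at the cluster; the helper names `can_i_argv` and `can_i` are hypothetical and not part of any Kubeflow tooling.

```python
import subprocess
from typing import List


def can_i_argv(verb: str, resource: str, namespace: str, service_account: str) -> List[str]:
    """Build the `kubectl auth can-i` command line used in the checks above."""
    return [
        "kubectl", "auth", "can-i", verb, resource,
        "-n", namespace,
        "--as", f"system:serviceaccount:{namespace}:{service_account}",
    ]


def can_i(verb: str, resource: str, namespace: str, service_account: str) -> bool:
    """Run the check; kubectl prints "yes" or "no" on stdout."""
    result = subprocess.run(
        can_i_argv(verb, resource, namespace, service_account),
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() == "yes"
```

For example, `can_i("create", "workflowtaskresults.argoproj.io", "kubeflow-user-example-com", "default-editor")` repeats the first check from this comment.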
Setting the So the next step is to understand whether the problem is with the communication between the KFP components or with requests from pods in user namespaces to KFP pods. |
Specifically, adding the following rule to the AuthorizationPolicy makes the runs succeed:
- when:
  - key: request.headers[kubeflow-userid]
    notValues: ['*']
So there's a good chance that this was broken by #2747. |
My latest understanding is the following:
But the pipeline steps themselves don't set any JWT. So that's the reason we see the error in the steps themselves. |
My proposal to unblock is the following AuthorizationPolicy:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  labels:
    app.kubernetes.io/component: ml-pipeline
    app.kubernetes.io/name: kubeflow-pipelines
    application-crd-id: kubeflow-pipelines
  name: ml-pipeline
  namespace: kubeflow
spec:
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/ml-pipeline
        - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
        - cluster.local/ns/kubeflow/sa/ml-pipeline-persistenceagent
        - cluster.local/ns/kubeflow/sa/ml-pipeline-scheduledworkflow
        - cluster.local/ns/kubeflow/sa/ml-pipeline-viewer-crd-service-account
        - cluster.local/ns/kubeflow/sa/kubeflow-pipelines-cache
  - from:
    - source:
        requestPrincipals:
        - '*'
  - when:
    - key: request.headers[kubeflow-userid]
      notValues: ['*']
  selector:
    matchLabels:
      app: ml-pipeline
This will essentially allow requests if:
Also note that our current |
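To make the three rules of the proposed AuthorizationPolicy concrete, here is a minimal Python sketch of how an ALLOW policy of this shape is evaluated: a request passes if any one rule matches. This is a simplified model, not Istio's implementation. The `Request` class and `is_allowed` function are hypothetical names, header keys are assumed to be lowercase, and the sketch assumes that `notValues: ['*']` matches only when the `kubeflow-userid` header is absent.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Service accounts listed under `principals` in the proposed policy.
KFP_PRINCIPALS = {
    "cluster.local/ns/kubeflow/sa/ml-pipeline",
    "cluster.local/ns/kubeflow/sa/ml-pipeline-ui",
    "cluster.local/ns/kubeflow/sa/ml-pipeline-persistenceagent",
    "cluster.local/ns/kubeflow/sa/ml-pipeline-scheduledworkflow",
    "cluster.local/ns/kubeflow/sa/ml-pipeline-viewer-crd-service-account",
    "cluster.local/ns/kubeflow/sa/kubeflow-pipelines-cache",
}


@dataclass
class Request:
    """Hypothetical, simplified view of a request reaching ml-pipeline."""
    source_principal: Optional[str] = None   # mTLS identity of the calling workload
    request_principal: Optional[str] = None  # identity taken from a validated JWT
    headers: Dict[str, str] = field(default_factory=dict)


def is_allowed(req: Request) -> bool:
    # Rule 1: the caller is one of the KFP components.
    if req.source_principal in KFP_PRINCIPALS:
        return True
    # Rule 2: requestPrincipals ['*'] matches any request carrying a valid JWT.
    if req.request_principal:
        return True
    # Rule 3: the kubeflow-userid header is not set at all.
    if "kubeflow-userid" not in req.headers:
        return True
    # No rule matched, so the ALLOW policy denies the request.
    return False
```

Under this model, a pipeline step calling from a user namespace with no JWT and no `kubeflow-userid` header is allowed by the third rule, while the same caller sending a `kubeflow-userid` header without a JWT is denied.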
@juliusvonkohout @kromanow94 do you also see the above issue? |
I think RC.2 has 43eec94 according to v1.9.0-rc.1...1.9.0-rc.2, and people claim that it is working in #2611 (comment). Argo 3.4 in KFP 2.2.0 brought in workflowtaskresults.argoproj.io, I guess. Maybe we are missing this permission in the roles. CC also @rimolive. I am also fine with your authentication approach, @kimwnasptd, but I am on vacation until July 20. So I will cut the final release around June 20-21 and do the changelog for it, but I cannot assist much with this topic until July 20. |
The tests might be partially broken, which could explain why this has not been detected in https://github.com/kubeflow/manifests/actions/runs/9825018339/job/27124646100. We allow the status failed as well
But the test should fail if the pipeline status is failed at
It should be something like: if status != "SUCCEEDED", exit 1. |
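The stricter check suggested above could look roughly like this in Python. This is a sketch only: the helper name `exit_code_for_status` is hypothetical, and it assumes the run state is available as a plain string such as "SUCCEEDED" or "FAILED"; the actual test script may obtain and name it differently.

```python
def exit_code_for_status(status: str) -> int:
    """Return 0 only for a succeeded run; every other status counts as a failure."""
    return 0 if status == "SUCCEEDED" else 1
```

A test script could then end with `sys.exit(exit_code_for_status(status))`, so that a failed run also fails the CI job instead of being accepted as a valid final state.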
@kimwnasptd adding some details to my comment about 1.9.0-rc2 in IKS. In the IKS manifest repo's v1.9-branch, which is downstream of kubeflow/manifests, I added this patch to make the pipelines work:
|
Hey, yes, I can see that on my end as well. I'll make a PR with the changes described by @kimwnasptd and also fix the gh-workflow test. I think it makes good sense. This also makes me wonder whether we'd like to improve the security sometime in the future and configure the Steps to authenticate to the ml-pipeline endpoint with an SA token. |
PR is here: #2795 |
Thank you for the PR @kromanow94. I adjusted the tests slightly to trigger all KFP tests, checked the outputs and merged it. Feel free to reopen if that is not enough. |
I can confirm I had the exact same issue in the latest Kubeflow 1.9 rc2. Thanks @kromanow94
F0718 09:04:11.900602 20 main.go:79] KFP driver: driver.Container(pipelineName=pipeline, runID=96a02a1a-a08a-4fbf-8d5d-169bcc305326, task="cat-hosts", component="comp-kfp-busybox", dagExecutionID=4, componentSpec, KubernetesExecutorConfig) failed: failure while getting executionCache: failed to list tasks: rpc error: code = PermissionDenied desc = RBAC: access denied
time="2024-07-18T09:04:12.435Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2024-07-18T09:04:12.435Z" level=error msg="cannot save parameter /tmp/outputs/pod-spec-patch" argo=true error="open /tmp/outputs/pod-spec-patch: no such file or directory"
time="2024-07-18T09:04:12.435Z" level=error msg="cannot save parameter /tmp/outputs/cached-decision" argo=true error="open /tmp/outputs/cached-decision: no such file or directory"
time="2024-07-18T09:04:12.435Z" level=info msg="/tmp/outputs/condition -> /var/run/argo/outputs/parameters//tmp/outputs/condition" argo=true |
Version
1.9
Describe your issue
I am using RC2 of Kubeflow and trying to create a Pipeline run. The pipeline fails, and I see the following in the logs of the driver pod of the first step:
Looks like someone (Istio most probably) is denying a request from the driver to get the executionCache.
Steps to reproduce the issue