nvidia-container-cli: mount error: failed to add device rules, permission denied #209

Open
sergeimonakhov opened this issue Apr 19, 2023 · 1 comment


sergeimonakhov commented Apr 19, 2023

Hi,

I tried to pass a GPU through to a container using nvidia-container-toolkit v1.13.0 on Ubuntu 22.04 (cgroup v2) with Linux kernel 6.1.24, and got this error:

nvidia-container-cli: mount error: failed to add device rules: unable to generate new device filter program with no existing programs: unable to create new device filters program: load program: permission denied:
0: R1=ctx(off=0,imm=0) R10=fp0
0: (69) r2 = *(u16 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=65535,var_off=(0x0; 0xffff))
1: (61) r3 = *(u32 *)(r1 +0)          ; R1=ctx(off=0,imm=0) R3_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (74) w3 >>= 16                     ; R3_w=scalar(umax=65535,var_off=(0x0; 0xffff))
3: (61) r4 = *(u32 *)(r1 +4)          ; R1=ctx(off=0,imm=0) R4_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
4: (61) r5 = *(u32 *)(r1 +8)          ; R1=ctx(off=0,imm=0) R5_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
5: (55) if r2 != 0x2 goto pc+7        ; R2_w=2
6: (bc) w2 = w3                       ; R2_w=scalar(umax=65535,var_off=(0x0; 0xffff)) R3_w=scalar(umax=65535,var_off=(0x0; 0xffff))
7: (54) w2 &= 6                       ; R2_w=scalar(umax=6,var_off=(0x0; 0x6))
8: (15) if r2 == 0x0 goto pc+4        ; R2_w=scalar(umax=6,var_off=(0x0; 0x6))
9: (55) if r4 != 0xc3 goto pc+3       ; R4=195
10: (55) if r5 != 0xff goto pc+2 13: R1=ctx(off=0,imm=0) R2=scalar(umax=6,var_off=(0x0; 0x6)) R3=scalar(umax=65535,var_off=(0x0; 0xffff)) R4=195 R5=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
13: (95) exit
R0 !read_ok
processed 14 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
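For anyone reading the verifier listing above, here is a rough Python model of the control flow it shows. This is only a sketch: `trace_filter` is a hypothetical name, the constants are the `BPF_DEVCG_*` values from the kernel's `linux/bpf.h`, and a little-endian host is assumed for the 16-bit load at offset 0. Instructions 11 and 12 do not appear in the log, so the model only reports which instruction index is reached and does not guess what they do.

```python
# Constants from the Linux UAPI header linux/bpf.h.
BPF_DEVCG_DEV_CHAR = 2
BPF_DEVCG_ACC_READ = 2
BPF_DEVCG_ACC_WRITE = 4


def trace_filter(access_type: int, major: int, minor: int) -> int:
    """Return the instruction index the listed program reaches: 11 or 13.

    The three arguments mirror the u32 fields read from the context at
    offsets 0, 4 and 8 in the listing.
    """
    dev_type = access_type & 0xFFFF        # insn 0: u16 load at ctx+0 (little-endian assumed)
    access = (access_type >> 16) & 0xFFFF  # insns 1-2: u32 load at ctx+0, then >> 16
    if dev_type != BPF_DEVCG_DEV_CHAR:     # insn 5: goto pc+7 -> 13
        return 13
    rw_mask = BPF_DEVCG_ACC_READ | BPF_DEVCG_ACC_WRITE
    if (access & rw_mask) == 0:            # insns 6-8: goto pc+4 -> 13
        return 13
    if major != 0xC3:                      # insn 9: goto pc+3 -> 13
        return 13
    if minor != 0xFF:                      # insn 10: goto pc+2 -> 13
        return 13
    return 11                              # fall through to insns not shown in the log


if __name__ == "__main__":
    read_char = (BPF_DEVCG_ACC_READ << 16) | BPF_DEVCG_DEV_CHAR
    print(trace_filter(read_char, 195, 255))
    print(trace_filter(read_char, 195, 0))
```

Every non-matching path lands on instruction 13, which is a bare `exit`. The log does not show any write to R0 before that point, and the verifier reports `R0 !read_ok` right there, so the rejection appears to be about the return register being unset on that path. Major 195 with minor 255 is consistent with `/dev/nvidiactl`.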

The problem does not occur on a kernel older than 6.x, or after switching to cgroup v1.
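To check whether a host falls into the affected combination described here (kernel 6.x together with cgroup v2), something like the following sketch may help. The helper names `cgroup_version` and `kernel_major` are made up for illustration. The check relies on the `cgroup.controllers` file, which exists at the root of a cgroup v2 unified hierarchy, and it assumes the hierarchy is mounted at `/sys/fs/cgroup`. On hybrid setups it reports 1.

```python
import os
import platform


def cgroup_version(root: str = "/sys/fs/cgroup") -> int:
    """Return 2 if `root` looks like a cgroup v2 unified hierarchy, else 1."""
    marker = os.path.join(root, "cgroup.controllers")
    return 2 if os.path.isfile(marker) else 1


def kernel_major(release: str) -> int:
    """Extract the major version from a release string such as '6.1.24'."""
    return int(release.split(".", 1)[0])


if __name__ == "__main__":
    release = platform.release()
    version = cgroup_version()
    print(f"kernel {release}, cgroup v{version}")
    if kernel_major(release) >= 6 and version == 2:
        print("matches the combination reported in this issue")
```

This only identifies the environment; it does not confirm that the device filter program will be rejected on a given host.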

Member

elezar commented Apr 24, 2023

Could this be related to #176 and the hardening around eBPF programs? Does the workaround suggested there also work?
