This repository has been archived by the owner on Nov 2, 2021. It is now read-only.

dcgm-exporter can't run #212

Open
JohanOu opened this issue Sep 17, 2021 · 2 comments

Comments

@JohanOu

JohanOu commented Sep 17, 2021

dcgm-exporter fails to run. The container log shows:
root@octopus-worker1:/home/practice# docker logs d674af870ff5
Starting NVIDIA host engine...
Got error 11 while waiting for SIGUSR1 from child process.
Collecting metrics at /run/prometheus/dcgm.prom every 1000ms...
Stopping NVIDIA host engine...
Unable to terminate host engine, it may not be running.
/usr/local/bin/dcgm-exporter: line 141: kill: (12) - No such process
Done

How can I solve this? Thanks.

@yongqiangz

@JohanOu I have the same issue. Have you solved it?

@yongqiangz

In my case, the cause was that I had upgraded the GPU driver. Once I upgraded dcgm-exporter as well, it worked fine.
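
For anyone checking whether they are hitting the same thing, here is a rough sketch. It assumes the two messages from the log in the original post are the failure signature (that is my reading of this thread, not documented dcgm-exporter behavior) and that `nvidia-smi` and `docker` are on the PATH. The function names are hypothetical helpers, not part of dcgm-exporter.

```python
import shutil
import subprocess
from typing import Optional

# Messages copied from the log in this issue. Treating them as a failure
# signature is an assumption, not documented dcgm-exporter behavior.
FAILURE_MARKERS = (
    "while waiting for SIGUSR1 from child process",
    "Unable to terminate host engine",
)


def host_engine_failed(log_text: str) -> bool:
    """Return True if any marker message from the issue appears in the log."""
    return any(marker in log_text for marker in FAILURE_MARKERS)


def _first_line(cmd: list) -> Optional[str]:
    """Run a command and return the first line of stdout, or None on failure."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if result.returncode == 0 and lines else None


def driver_version() -> Optional[str]:
    """Installed GPU driver version as reported by nvidia-smi."""
    return _first_line(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"]
    )


def container_image(container_id: str) -> Optional[str]:
    """Image (including tag) that the given container was started from."""
    return _first_line(
        ["docker", "inspect", "--format", "{{.Config.Image}}", container_id]
    )
```

Feed the output of `docker logs <container>` to `host_engine_failed`, then compare `driver_version()` with the tag returned by `container_image(...)`. If the driver was upgraded recently and the image is older, pulling a newer dcgm-exporter image is the fix that worked for me.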
