As a workaround, I used the rocm/dev-ubuntu-20.04 Docker image, installed rccl via apt, and then installed tensorflow-rocm via pip. Here are some successful jobs executing this approach:
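For readers who want to try the same workaround, here is a minimal sketch of what such a job could look like in .gitlab-ci.yml. The job name, the python3-pip package, and the smoke-test command are assumptions for illustration, not taken from the actual pipeline. It also assumes the ROCm apt repository is already configured in the base image.

```yaml
# .gitlab-ci.yml (sketch only; job name and smoke test are hypothetical)
test:
  # Smaller base image, used instead of rocm/tensorflow:rocm4.0-tf2.4-dev
  image: rocm/dev-ubuntu-20.04
  before_script:
    - apt-get update
    # rccl via apt, as in the workaround; python3-pip is an assumption
    # in case the base image does not ship pip
    - apt-get install -y --no-install-recommends rccl python3-pip
    # tensorflow-rocm via pip, as in the workaround
    - pip3 install tensorflow-rocm
  script:
    # Hypothetical smoke test: check that TensorFlow imports
    - python3 -c "import tensorflow as tf; print(tf.__version__)"
```

Installing the packages in every job costs some run time, but it avoids pulling the larger prebuilt image on the shared runners.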
Since upgrading to rocm/tensorflow:rocm4.0-tf2.4-dev, my pipeline jobs on GitLab.com fail:
https://gitlab.com/pfasdr/code/decoder/-/jobs/937693433
https://gitlab.com/pfasdr/code/decoder/-/jobs/937693435
The relevant error message is:
As the documentation states, the Linux shared runners on GitLab.com use Google Cloud N1 machines:
https://docs.gitlab.com/ee/user/gitlab_com/#linux-shared-runners
These have only 3.75 GB of memory and cannot download the Docker image, which is currently 5.39 GB:
https://cloud.google.com/compute/docs/machine-types#n1_machine_types
When I run the jobs on my local machine via a GitLab runner registered as a group runner, they execute as expected:
https://gitlab.com/pfasdr/code/decoder/-/jobs/937751331
https://gitlab.com/pfasdr/code/decoder/-/jobs/937746578
Obviously, running a GitLab runner on my own machine is cumbersome. To re-enable running in the cloud on GitLab CI, the image should be slimmed down further, to somewhat under 3.75 GB.