Fix several minimal bugs (#107)
* disable GPU for PS

* fix the bugs of sedding yaml

* revert the change
cheyang authored and k8s-ci-robot committed Jan 18, 2019
1 parent 5826ff9 commit 7cb79fe
Showing 4 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion charts/tfjob/CHANGELOG.md
@@ -61,4 +61,4 @@

### 0.15.0

-* Fix hostnetwork issue which is introduced by ENI
+* Fix hostnetwork issue which is introduced by ENI
2 changes: 1 addition & 1 deletion cmd/arena/commands/trainer_tensorflow.go
@@ -453,7 +453,7 @@ func hasCondition(status tfv1alpha2.TFJobStatus, condType tfv1alpha2.TFJobCondit
}

func checkStatus(status tfv1alpha2.TFJobStatus) tfv1alpha2.TFJobConditionType {
-	t := tfv1alpha2.TFJobConditionType("Unknown")
+	t := tfv1alpha2.TFJobConditionType("Pending")
	for _, condition := range status.Conditions {
		if condition.Status == v1.ConditionTrue {
			t = condition.Type
2 changes: 1 addition & 1 deletion docs/userguide/9-top-job-gpu-metric.md
@@ -12,7 +12,7 @@ kubectl apply -f kubernetes-artifacts/prometheus/prometheus.yaml

```
# change gpu export nodeSelector to aliyun label
-sed 's|accelerator/nvidia_gpu|aliyun.accelerator/nvidia_count|g' kubernetes-artifacts/prometheus/gpu-expoter.yaml
+sed -i 's|accelerator/nvidia_gpu|aliyun.accelerator/nvidia_count|g' kubernetes-artifacts/prometheus/gpu-expoter.yaml
```

* If your cluster is not ACK cluster, you need to label your GPU node:
2 changes: 1 addition & 1 deletion run_arena.sh
@@ -47,7 +47,7 @@ fi

if [ "$usePrometheus" == "true" ]; then
if [ "$platform" == "ack" ]; then
-sed 's|accelerator/nvidia_gpu|aliyun.accelerator/nvidia_count|g' /root/kubernetes-artifacts/prometheus/gpu-expoter.yaml
+sed -i 's|accelerator/nvidia_gpu|aliyun.accelerator/nvidia_count|g' /root/kubernetes-artifacts/prometheus/gpu-expoter.yaml
fi
if ! kubectl get serviceaccount --all-namespaces | grep prometheus; then
kubectl apply -f /root/kubernetes-artifacts/prometheus/gpu-expoter.yaml
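
Note on the two `sed` hunks: without `-i`, `sed` writes the substituted text to standard output and leaves the file on disk untouched, so a later `kubectl apply -f` on that file would still see the original `accelerator/nvidia_gpu` label. The following is a minimal sketch of that difference, not part of the commit. It assumes GNU sed (BSD/macOS sed expects `-i ''`) and uses a hypothetical scratch file in place of `gpu-expoter.yaml`.

```shell
#!/bin/sh
# Hypothetical scratch manifest standing in for gpu-expoter.yaml.
tmpdir=$(mktemp -d)
manifest="$tmpdir/exporter.yaml"
printf 'nodeSelector:\n  accelerator/nvidia_gpu: "true"\n' > "$manifest"

# Without -i: the rewritten text goes to stdout; the file is not modified.
sed 's|accelerator/nvidia_gpu|aliyun.accelerator/nvidia_count|g' "$manifest" > /dev/null
before=$(grep -c 'aliyun.accelerator/nvidia_count' "$manifest" || true)

# With -i (GNU sed assumed): the file itself is edited in place.
sed -i 's|accelerator/nvidia_gpu|aliyun.accelerator/nvidia_count|g' "$manifest"
after=$(grep -c 'aliyun.accelerator/nvidia_count' "$manifest" || true)

echo "matches without -i: $before"
echo "matches with -i: $after"
rm -rf "$tmpdir"
```

Under these assumptions the script reports 0 matching lines after the plain `sed` call and 1 after `sed -i`, which is the behavior the `run_arena.sh` change relies on before applying the manifest.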