Merge pull request #533 from bjwswang/charts
feat: add kuberay operator and configure ray clusters in arcadia
Showing 25 changed files with 51,116 additions and 3 deletions.
@@ -0,0 +1,63 @@
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: raycluster-kuberay
  namespace: kuberay-system
spec:
  headGroupSpec:
    rayStartParams:
      dashboard-host: 0.0.0.0
    template:
      metadata:
        labels:
          app.kubernetes.io/instance: raycluster
          app.kubernetes.io/name: kuberay
      spec:
        containers:
        - image: kubeagi/ray-ml:2.9.0-py39-vllm
          name: ray-head
          resources:
            limits:
              cpu: "1"
              memory: 2G
              nvidia.com/gpu: 1
            requests:
              cpu: "1"
              memory: 2G
              nvidia.com/gpu: 1
          volumeMounts:
          - mountPath: /tmp/ray
            name: log-volume
        volumes:
        - emptyDir: {}
          name: log-volume
  workerGroupSpecs:
  - groupName: workergroup
    replicas: 0
    minReplicas: 0
    maxReplicas: 5
    rayStartParams: {}
    template:
      metadata:
        labels:
          app.kubernetes.io/instance: raycluster
          app.kubernetes.io/name: kuberay
      spec:
        containers:
        - image: kubeagi/ray-ml:2.9.0-py39-vllm
          name: ray-worker
          resources:
            limits:
              cpu: "1"
              memory: 1G
              nvidia.com/gpu: 1
            requests:
              cpu: "1"
              memory: 1G
              nvidia.com/gpu: 1
          volumeMounts:
          - mountPath: /tmp/ray
            name: log-volume
        volumes:
        - emptyDir: {}
          name: log-volume
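
Once a manifest like the one above has been applied, its Pods can be found through the labels it sets on both the head and worker templates. The following is a minimal sketch, assuming `kubectl` is already configured for the target cluster. The helper name `ray_pods` is hypothetical; the default namespace and label value are taken from the manifest.

```sh
# List Pods carrying the instance label that the manifest sets on both
# the head group and the worker group templates.
# Hypothetical helper; assumes kubectl points at the right cluster.
ray_pods() {
  namespace="${1:-kuberay-system}"
  instance="${2:-raycluster}"
  kubectl get pods \
    --namespace "$namespace" \
    --selector "app.kubernetes.io/instance=${instance}"
}
```

Calling `ray_pods` with no arguments uses the values from the manifest. Because the worker group starts with `replicas: 0`, only the head Pod would be expected at first.
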
@@ -0,0 +1,22 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
@@ -0,0 +1,6 @@
apiVersion: v2
description: A Helm chart for Kubernetes
name: kuberay-operator
version: 1.0.0
icon: https://github.com/ray-project/ray/raw/master/doc/source/images/ray_header_logo.png
type: application
@@ -0,0 +1,117 @@
# KubeRay Operator

This document provides instructions to install both the CRDs (RayCluster, RayJob, RayService) and the KubeRay operator with a Helm chart.

## Helm

Make sure the version of Helm is v3+. Currently, [existing CI tests](https://github.com/ray-project/kuberay/blob/master/.github/workflows/helm-lint.yaml) are based on Helm v3.4.1 and v3.9.4.

```sh
helm version
```

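The v3+ requirement can also be checked in a script. The following is a minimal sketch, not part of the chart: the function name `helm_is_v3_plus` is hypothetical, and it assumes the version string has the form printed by `helm version --short` in Helm 3 (for example `v3.9.4+gdbc6d8e`).

```sh
# Succeed only if a Helm version string such as "v3.9.4" or
# "v3.9.4+gdbc6d8e" has a major version of 3 or higher.
helm_is_v3_plus() {
  version="${1#v}"       # drop the leading "v"
  major="${version%%.*}" # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;; # not a plain number: treat as unsupported
  esac
  [ "$major" -ge 3 ]
}

# Only query Helm when it is actually installed.
if command -v helm >/dev/null 2>&1; then
  if helm_is_v3_plus "$(helm version --short)"; then
    echo "Helm version is supported"
  else
    echo "Helm v3+ is required" >&2
  fi
fi
```

Any string that does not start with a numeric major version is rejected, so unexpected output is treated as unsupported rather than silently accepted.
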
## Install CRDs and KubeRay operator

* Install a stable version via Helm repository (only supports KubeRay v0.4.0+)
  ```sh
  helm repo add kuberay https://ray-project.github.io/kuberay-helm/

  # Install both CRDs and KubeRay operator v1.0.0.
  helm install kuberay-operator kuberay/kuberay-operator --version 1.0.0

  # Check the KubeRay operator Pod in `default` namespace
  kubectl get pods
  # NAME                                READY   STATUS    RESTARTS   AGE
  # kuberay-operator-6fcbb94f64-mbfnr   1/1     Running   0          17s
  ```

* Install the nightly version
  ```sh
  # Step 1: Clone the KubeRay repository
  git clone https://github.com/ray-project/kuberay.git
  # Step 2: Move to `helm-chart/kuberay-operator`
  cd kuberay/helm-chart/kuberay-operator
  # Step 3: Install the KubeRay operator
  helm install kuberay-operator .
  ```

* Install KubeRay operator without installing CRDs
  * In some cases, the installation of the CRDs and the installation of the operator may require different levels of admin permissions, so these two installations could be handled as different steps by different roles.
  * Use Helm's built-in `--skip-crds` flag to install the operator only. See [this document](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/) for more details.
  ```sh
  # Step 1: Install CRDs only (for cluster admin)
  kubectl create -k "github.com/ray-project/kuberay/manifests/cluster-scope-resources?ref=v1.0.0&timeout=90s"
  # Step 2: Install KubeRay operator only (for developer)
  helm install kuberay-operator kuberay/kuberay-operator --version 1.0.0 --skip-crds
  ```

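When the CRDs and the operator are installed by different roles, the developer's step can first confirm that the cluster admin's step has already been done. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the CRD resource names are inferred from the kinds listed at the top of this README and the `ray.io` API group, so verify them against your cluster.

```sh
# CRD names assumed from the kinds listed in this README
# (RayCluster, RayJob, RayService) and the ray.io API group.
RAY_CRDS="rayclusters.ray.io rayjobs.ray.io rayservices.ray.io"

# Succeed only if every CRD in RAY_CRDS exists in the cluster.
ray_crds_installed() {
  for crd in $RAY_CRDS; do
    if ! kubectl get crd "$crd" >/dev/null 2>&1; then
      echo "missing CRD: $crd" >&2
      return 1
    fi
  done
}

# Install the operator without CRDs, but only once the CRDs are present.
install_operator_only() {
  ray_crds_installed || return 1
  helm install kuberay-operator kuberay/kuberay-operator \
    --version 1.0.0 --skip-crds
}
```

Running `install_operator_only` stops with a message naming the first missing CRD instead of installing an operator that has nothing to watch.
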
## List the chart

To list the `kuberay-operator` release:

```sh
helm ls
# NAME                    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
# kuberay-operator        default         1               2023-09-22 02:57:17.306616331 +0000 UTC deployed        kuberay-operator-1.0.0
```

## Uninstall the Chart

```sh
# Uninstall the `kuberay-operator` release
helm uninstall kuberay-operator

# The operator Pod should be removed.
kubectl get pods
# No resources found in default namespace.
```

## Working with Argo CD

If you use [Argo CD](https://argoproj.github.io) to manage the operator, syncing fails with an error complaining that the CRDs are too long, as in [this issue](https://github.com/prometheus-operator/prometheus-operator/issues/4439).

The recommended solution is to split the operator into two Argo CD apps:

* The first app only installs the CRDs, using the `Replace=true` sync option:
  ```yaml
  apiVersion: argoproj.io/v1alpha1
  kind: Application
  metadata:
    name: ray-operator-crds
  spec:
    project: default
    source:
      repoURL: https://github.com/ray-project/kuberay
      targetRevision: v1.0.0-rc.0
      path: helm-chart/kuberay-operator/crds
    destination:
      server: https://kubernetes.default.svc
    syncPolicy:
      syncOptions:
      - Replace=true
  ...
  ```

* The second app installs the Helm chart with `skipCrds: true` (supported since Argo CD 2.3.0):
  ```yaml
  apiVersion: argoproj.io/v1alpha1
  kind: Application
  metadata:
    name: ray-operator
  spec:
    source:
      repoURL: https://github.com/ray-project/kuberay
      targetRevision: v1.0.0-rc.0
      path: helm-chart/kuberay-operator
      helm:
        skipCrds: true
    destination:
      server: https://kubernetes.default.svc
      namespace: ray-operator
    syncPolicy:
      syncOptions:
      - CreateNamespace=true
  ...
  ```
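
The "too long" error typically arises because client-side apply stores the whole object in the `kubectl.kubernetes.io/last-applied-configuration` annotation, and Kubernetes limits the combined size of an object's annotations to 262144 bytes. The sketch below flags CRD files that are likely to exceed that limit. It is only an approximation: the annotation holds JSON rather than the YAML on disk, the helper names are hypothetical, and the default path assumes a local clone of the KubeRay repository.

```sh
# Kubernetes caps the combined size of an object's annotations at 256 KiB.
ANNOTATION_LIMIT_BYTES=262144

# Succeed if the file is larger than the limit, i.e. likely too big to be
# stored whole in the last-applied-configuration annotation.
exceeds_annotation_limit() {
  size=$(( $(wc -c < "$1") )) # arithmetic expansion trims any padding
  [ "$size" -gt "$ANNOTATION_LIMIT_BYTES" ]
}

# Report oversized CRD files. The default path matches the chart layout
# referenced by the first Argo CD app and assumes a local clone.
report_large_crds() {
  for f in "${1:-helm-chart/kuberay-operator/crds}"/*.yaml; do
    [ -e "$f" ] || continue
    if exceeds_annotation_limit "$f"; then
      echo "too large for client-side apply: $f"
    fi
  done
}
```

Any file reported by `report_large_crds` is a candidate for the `Replace=true` app, since replacing an object does not write that annotation.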