Support to specify CUDA visible device id for model service #444

Closed
nkwangleiGIT opened this issue Dec 25, 2023 · 4 comments

@nkwangleiGIT (Contributor) commented Dec 25, 2023

CUDA_DEVICE_ORDER=PCI_BUS_ID
CUDA_VISIBLE_DEVICES="0,3" # specify which GPU(s) to be used

In a scenario where there are different types of GPUs on the same node (e.g. T4 and A100), we should support deploying a model with specified device IDs.

bjwswang self-assigned this Dec 26, 2023
@bjwswang (Collaborator) commented Dec 26, 2023

  1. For RunnerFastchat, we can pass the --gpus flag to the model worker command:
     https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/model_worker.py#L311
     But we must experiment to make sure this works with Kubernetes GPU scheduling (a sketch of the command follows this list).

  2. For RunnerFastchatVLLM, there is no such --gpus option. Based on the Ray docs (https://docs.ray.io/en/latest/ray-core/scheduling/accelerators.html#starting-ray-nodes-with-accelerators), it seems we can set CUDA_VISIBLE_DEVICES when starting/configuring the Ray node, so RunnerFastchatVLLM will only detect the devices exposed by that node (see the second sketch below).
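
A minimal sketch of item 1, with a placeholder model path (--gpus and --num-gpus are existing fastchat.serve.model_worker flags):

# run a FastChat model worker pinned to GPUs 0 and 3
python3 -m fastchat.serve.model_worker \
  --model-path lmsys/vicuna-7b-v1.5 \
  --gpus 0,3 \
  --num-gpus 2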
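
And a sketch of item 2, restricting which GPUs a Ray node exposes, per the linked Ray doc (the device ids are examples):

# start a Ray head node that only sees GPUs 0 and 3, so workloads
# scheduled on it (e.g. vLLM) detect just those two devices
CUDA_VISIBLE_DEVICES=0,3 ray start --head --num-gpus=2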

@bjwswang (Collaborator) commented Dec 26, 2023

Furthermore, if we allow users to specify GPUs, we need to give them a way to view the currently available GPUs. That is easy if the user has permission to access the host, but that is not appropriate for all users (a sketch follows).
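
A host-level sketch using a standard nvidia-smi query (in-cluster we would need to surface this through an API instead):

# list GPU index, model name, and total memory on the host
nvidia-smi --query-gpu=index,name,memory.total --format=csv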

This issue can be split into two tasks:

  • RunnerFastchat (single node)

  • RunnerFastchatVLLM (distributed)

nkwangleiGIT added this to the v0.2.0 milestone Dec 30, 2023
@nkwangleiGIT (Contributor, Author) commented:

Let me implement basic support as below:

  • Support configuring nodeSelector and CUDA visible devices when deploying a model via the API/ops-console (a sketch of such a pod spec follows this list).
  1. Without Ray support, it will deploy the model to the node(s) matching the nodeSelector and use the specified GPUs - like the single-node case above.
  2. With Ray support, it works the same but will construct a GPU pool if multiple nodes are involved - like the distributed case above.
     Ray support will be covered in Multiple gpus on different nodes #427.
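
A minimal sketch of what such a deployment could look like, assuming a hypothetical node label (gpu-type: a100) and a placeholder image rather than KubeAGI's actual resource names (note that CUDA_VISIBLE_DEVICES may be overridden when GPUs are requested through a device plugin):

# sketch only: pin a model-serving pod to matching nodes and expose GPUs 0 and 3
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: model-worker-example
spec:
  nodeSelector:
    gpu-type: a100                        # assumed node label
  containers:
  - name: worker
    image: example/fastchat-worker:latest # placeholder image
    env:
    - name: CUDA_DEVICE_ORDER
      value: "PCI_BUS_ID"
    - name: CUDA_VISIBLE_DEVICES
      value: "0,3"
EOF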

@nkwangleiGIT (Contributor, Author) commented:

I think we can support this now by following the doc below:
http://kubeagi.k8s.com.cn/docs/Configuration/gpu-and-node-affinity
