Add "resources" field to the computeClass definition #2380

dciangot · 2023-12-15T22:19:34Z

Adding a "resources" field to ProjectComputeClassInstance and, as a consequence, to ClusterComputeClassInstance would let cluster admins include GPUs and other custom hardware in the "offering" available to users.

Note from Darren: "One catch is that the resources can only be applied to the "main container" not any sidecars, as you really don't know which one to apply to."
The complete discussion is on Slack.
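The "main container only" rule from the note above can be sketched as follows. This is a minimal illustration, not the runtime's actual code: the function name `apply_class_resources`, the dict-based container layout, and the container names `app` and `sidecar` are all hypothetical.

```python
import copy


def apply_class_resources(containers, main_name, class_resources):
    """Return a copy of `containers` with the compute class resources
    merged into the main container only (hypothetical helper)."""
    result = copy.deepcopy(containers)
    for container in result:
        if container["name"] != main_name:
            continue  # sidecars are left untouched
        resources = container.setdefault("resources", {})
        for kind in ("requests", "limits"):
            resources.setdefault(kind, {}).update(class_resources.get(kind, {}))
    return result


# Hypothetical pod with one main container and one sidecar.
containers = [
    {"name": "app", "resources": {"requests": {"memory": "20M"}, "limits": {"memory": "20M"}}},
    {"name": "sidecar"},
]
class_resources = {
    "requests": {"gpu-vendor.example/example-gpu": "2"},
    "limits": {"gpu-vendor.example/example-gpu": "2"},
}

patched = apply_class_resources(containers, "app", class_resources)
print(patched[0]["resources"])
print("resources" in patched[1])
```

Only the container named as the main one receives the GPU request and limit; the sidecar keeps whatever it had, which avoids claiming the same device twice in one pod.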

Enabling admins to create computeclass with GPU and accelerators in general.
Signed-off-by: dciangot <[email protected]>

Add resource field to computeclass
Signed-off-by: dciangot <[email protected]>
Signed-off-by: Diego Ciangottini <[email protected]>
Co-authored-by: Diego Ciangottini <[email protected]>

sangee2004 · 2024-01-10T23:28:34Z

Tested with acorn version v0.10.0-rc2-9-g43dbcbf4+43dbcbf4.

I was able to create a computeclass with a resources field containing requests and limits, using the following YAML:

kind: ClusterComputeClass
apiVersion: admin.acorn.io/v1
default: false
metadata:
  name: cc-res
description: Large compute for linux on arm64
cpuScaler: 0.75
supportedRegions:
  - local
memory:
  default: 20M
  min: 10M
  max: 100M
resources:
  limits:
    gpu-vendor.example/example-gpu: 2
  requests:
    gpu-vendor.example/example-gpu: 2
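The YAML above sets the same value for the GPU under both limits and requests. Kubernetes does not allow extended resources to be overcommitted, so when both are given they have to match. A minimal pre-flight check for that rule might look like the sketch below; the helper name `mismatched_extended_resources` is hypothetical, and the comparison is a plain string comparison rather than full Kubernetes quantity parsing.

```python
# Resources that Kubernetes itself defines; anything else is treated as extended here.
STANDARD_RESOURCES = {"cpu", "memory", "ephemeral-storage"}


def mismatched_extended_resources(resources):
    """Return names of extended resources whose request and limit differ."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    mismatched = []
    for name in sorted(set(requests) & set(limits)):
        if name in STANDARD_RESOURCES or name.startswith("hugepages-"):
            continue  # only extended resources are checked in this sketch
        if str(requests[name]) != str(limits[name]):
            mismatched.append(name)
    return mismatched


class_resources = {
    "limits": {"gpu-vendor.example/example-gpu": 2},
    "requests": {"gpu-vendor.example/example-gpu": 2},
}
print(mismatched_extended_resources(class_resources))  # prints []
```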

When deploying an app using this compute class, I can see that the pods have the resources set in the container spec as follows:

"resources": {
    "limits": {
        "gpu-vendor.example/example-gpu": "2",
        "memory": "20M"
    },
    "requests": {
        "cpu": "14m",
        "gpu-vendor.example/example-gpu": "2",
        "memory": "20M"
    }
}
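The thread does not explain where the 14m CPU request comes from. One reading that reproduces it is that the CPU request equals the memory in GiB multiplied by cpuScaler, rounded up to whole millicores: 20M is 20,000,000 bytes, about 0.0186 GiB, and 0.0186 × 0.75 ≈ 0.01397 cores, which rounds up to 14m. The sketch below works through that arithmetic and merges in the class resources. Both the formula and the helper names are assumptions made for illustration, not confirmed runtime behaviour.

```python
import math

GIB = 2**30


def parse_memory(quantity):
    """Parse the small subset of Kubernetes quantities used here (M, Mi, G, Gi)."""
    units = {"Mi": 2**20, "Gi": 2**30, "M": 10**6, "G": 10**9}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def container_resources(memory, cpu_scaler, class_resources):
    """Build a container resources block (assumed formula, see the text above)."""
    mem_bytes = parse_memory(memory)
    # Assumption: CPU = memory in GiB * cpuScaler, rounded up to millicores.
    cpu_milli = math.ceil(mem_bytes / GIB * cpu_scaler * 1000)
    requests = {"cpu": f"{cpu_milli}m", "memory": memory}
    limits = {"memory": memory}
    # Extra class resources are added to both sections as strings.
    requests.update({k: str(v) for k, v in class_resources.get("requests", {}).items()})
    limits.update({k: str(v) for k, v in class_resources.get("limits", {}).items()})
    return {"limits": limits, "requests": requests}


class_resources = {
    "limits": {"gpu-vendor.example/example-gpu": 2},
    "requests": {"gpu-vendor.example/example-gpu": 2},
}
result = container_resources("20M", 0.75, class_resources)
print(result["requests"]["cpu"])  # prints 14m
```

With the default memory of 20M and cpuScaler of 0.75 from the compute class, the result matches the resources block observed on the pod, including the absence of a CPU limit.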

The app deployment itself fails in my case with the following error. This is expected, since my environment has no nodes that support GPUs.

0/1 nodes are available: 1 Insufficient gpu-vendor.example/example-gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod
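The "Insufficient" message means that the only node does not advertise enough of the requested extended resource. A deliberately simplified version of that fit check is sketched below. It ignores requests from pods already running on the node, and the node's allocatable values are hypothetical.

```python
def insufficient_resources(pod_requests, node_allocatable):
    """Return the requested resources that the node cannot satisfy."""
    return sorted(
        name
        for name, amount in pod_requests.items()
        if node_allocatable.get(name, 0) < amount
    )


# Pod requests from the test above: cpu in millicores, memory in bytes.
pod_requests = {
    "cpu": 14,
    "memory": 20_000_000,
    "gpu-vendor.example/example-gpu": 2,
}
# Hypothetical node without any GPU device plugin installed.
node_allocatable = {"cpu": 4000, "memory": 8 * 2**30}

print(insufficient_resources(pod_requests, node_allocatable))
# prints ['gpu-vendor.example/example-gpu']
```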

dciangot added a commit to dciangot/runtime that referenced this issue Dec 19, 2023

Fix acorn-io#2380 : introduce computeclass resources field

b75741a

Enabling admins to create computeclass with GPU and accelerators in general.

dciangot added a commit to dciangot/runtime that referenced this issue Dec 19, 2023

Fix acorn-io#2380 : introduce computeclass resources field

236d5d7

Enabling admins to create computeclass with GPU and accelerators in general. Signed-off-by: dciangot <[email protected]>
