This repository has been archived by the owner on Mar 16, 2024. It is now read-only.

Add "resources" field to the computeClass definition #2380

Open
dciangot opened this issue Dec 15, 2023 · 1 comment

Comments

@dciangot
Contributor

Introducing a "resources" field in ProjectComputeClassInstance and, consequently, in ClusterComputeClassInstance will let cluster admins add GPUs and other custom hardware to the "offering" available to users.

Note from Darren: "One catch is that the resources can only be applied to the "main container" not any sidecars, as you really don't know which one to apply to."
Complete discussion ref on slack
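
To make the catch in the note concrete, here is a minimal sketch of the intended behaviour. It is not Acorn's implementation: the function name `apply_extra_resources`, the container names, and the plain-dict layout are all hypothetical, chosen only to show the extra resources being merged into the main container while sidecars are left alone.

```python
import copy

GPU = "gpu-vendor.example/example-gpu"  # example resource name from this issue


def apply_extra_resources(containers, main_name, extra):
    """Return a copy of `containers` with `extra` merged into the main container only.

    Hypothetical helper: `containers` is a list of dicts shaped like
    Kubernetes container specs, `extra` holds "requests" and "limits".
    """
    patched = copy.deepcopy(containers)
    for container in patched:
        if container["name"] != main_name:
            continue  # sidecars keep whatever resources they already had
        resources = container.setdefault("resources", {})
        for section in ("requests", "limits"):
            resources.setdefault(section, {}).update(extra.get(section, {}))
    return patched


containers = [
    {
        "name": "app",  # assumed to be the "main container"
        "resources": {
            "requests": {"cpu": "14m", "memory": "20M"},
            "limits": {"memory": "20M"},
        },
    },
    {"name": "sidecar"},
]
extra = {"requests": {GPU: "2"}, "limits": {GPU: "2"}}
patched = apply_extra_resources(containers, "app", extra)
```

The sketch assumes the main container can be identified by name; how the runtime actually picks it is not stated in this issue.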

dciangot added a commit to dciangot/runtime that referenced this issue Dec 19, 2023
Enabling admins to create computeclass with GPU and accelerators in general.

Signed-off-by: dciangot <[email protected]>
tylerslaton pushed a commit that referenced this issue Jan 3, 2024
Add resource field to computeclass

Signed-off-by: dciangot <[email protected]>
Signed-off-by: Diego Ciangottini <[email protected]>
Co-authored-by: Diego Ciangottini <[email protected]>
@sangee2004
Contributor

Tested with Acorn version v0.10.0-rc2-9-g43dbcbf4+43dbcbf4.

  1. I was able to create a compute class with a resources field containing requests and limits, using the following YAML:
kind: ClusterComputeClass
apiVersion: admin.acorn.io/v1
default: false
metadata:
  name: cc-res
description: Large compute for linux on arm64
cpuScaler: 0.75
supportedRegions:
  - local
memory:
  default: 20M
  min: 10M
  max: 100M
resources:
  limits:
    gpu-vendor.example/example-gpu: 2
  requests:
    gpu-vendor.example/example-gpu: 2
  2. When deploying an app that uses this compute class, I can see that the pods have the resources set in the container spec as follows:
                        "resources": {
                            "limits": {
                                "gpu-vendor.example/example-gpu": "2",
                                "memory": "20M"
                            },
                            "requests": {
                                "cpu": "14m",
                                "gpu-vendor.example/example-gpu": "2",
                                "memory": "20M"
                            }
                        },
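
The manual check above could be scripted along these lines. This is only a sketch: `mismatched_resources` is a hypothetical helper, and it compares quantities as plain strings (the YAML value `2` shows up as `"2"` in the pod spec), so it would wrongly flag equivalent quantities written differently, such as `1` and `1000m`.

```python
import json

GPU = "gpu-vendor.example/example-gpu"


def mismatched_resources(expected, container):
    """List (section, name) pairs from `expected` that are missing or differ in `container`.

    Hypothetical helper; quantities are compared as strings, not parsed.
    """
    actual = container.get("resources", {})
    mismatches = []
    for section in ("limits", "requests"):
        for name, quantity in expected.get(section, {}).items():
            if actual.get(section, {}).get(name) != str(quantity):
                mismatches.append((section, name))
    return mismatches


# Values as declared in the compute class YAML (integers there).
expected = {"limits": {GPU: 2}, "requests": {GPU: 2}}

# Container excerpt as reported in the pod spec above.
container = json.loads("""
{
  "resources": {
    "limits": {"gpu-vendor.example/example-gpu": "2", "memory": "20M"},
    "requests": {"cpu": "14m", "gpu-vendor.example/example-gpu": "2", "memory": "20M"}
  }
}
""")

result = mismatched_resources(expected, container)
print(result)
```

An empty list means every resource declared in the compute class reached the container unchanged.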

The app deployment itself fails in my case with the following error, which is expected because my environment has no nodes that support GPUs.

0/1 nodes are available: 1 Insufficient gpu-vendor.example/example-gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod
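
The scheduler message can be reproduced in miniature with the sketch below. It is a rough approximation, not how the scheduler works: `nodes_that_could_fit` is a hypothetical helper that only looks at each node's `status.allocatable`, ignores resources already claimed by other pods, and assumes the extended resource is advertised as a whole number. The node names are made up.

```python
GPU = "gpu-vendor.example/example-gpu"


def nodes_that_could_fit(nodes, resource, amount):
    """Return names of nodes whose allocatable `resource` is at least `amount`.

    Hypothetical helper; `nodes` are dicts shaped like Kubernetes Node objects.
    """
    fitting = []
    for node in nodes:
        allocatable = node.get("status", {}).get("allocatable", {})
        # A node that does not advertise the resource is treated as having 0.
        if int(allocatable.get(resource, "0")) >= amount:
            fitting.append(node["metadata"]["name"])
    return fitting


# A single node without GPUs, similar to the test environment described above.
nodes = [
    {
        "metadata": {"name": "local-node"},
        "status": {"allocatable": {"cpu": "4", "memory": "8Gi"}},
    },
]
print(nodes_that_could_fit(nodes, GPU, 2))
```

With no node advertising the resource, the list is empty, which matches the "Insufficient gpu-vendor.example/example-gpu" result.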
