Adding nodeSelectors to GPUs #1331

SandraGH5 · 2021-08-12T19:37:35Z

SandraGH5
Aug 12, 2021
Collaborator

We have node pools with different GPU cards in our cluster and would like to choose a specific one for a task. For non-Flyte workloads, we do this by adding a nodeSelector label, gpu_type, to the pod and labeling our nodes accordingly. We do the same to schedule some pods on nodes with a faster CPU when we care about single-core performance.
We managed to add such nodeSelectors by using the Flytekit pod plugin, which transformed our task into a "Sidecar Task". However, we have now lost the ability to execute the task locally, which is one of our most-used features of Flyte:

Local execute is not currently supported for pod tasks

Is there any other way to add such a nodeSelector to a task or a different approach to achieve what we intend to do?
Any help is much appreciated.

-Stephan Gref

Answered by SandraGH5

Aug 12, 2021


SandraGH5 · 2021-08-12T20:27:21Z

SandraGH5
Aug 12, 2021
Collaborator Author

FlytePropeller lets you specify a toleration for the GPU resource: https://github.com/flyteorg/flyteplugins/blob/892f35eb8a0041969039e56b64a0467f17e6809c/go/tasks/pluginmachinery/flytek8s/config/config.go#L99
and it adds the appropriate resource to the pod so that it gets scheduled on a node with GPUs: https://github.com/flyteorg/flyteplugins/blob/892f35eb8a0041969039e56b64a0467f17e6809c/go/tasks/pluginmachinery/flytek8s/container_helper.go#L23

When you specify a GPU resource in the task resource requirements, it adds the correct resource requirement and the appropriate toleration to the pod so that it can be scheduled on the right node pool. Using multiple GPU node pools will likely require a sidecar task for now.

Specifying a GPU resource without using a sidecar task only works with a single node selector.

Since you have multiple cards, and presumably multiple task-specific node selectors, you could add a wrapper around the task decorator that applies the correct task type based on an environment variable. That way you can set RUN_IN_LOCAL_MODE=true or something equivalent, and the wrapper will create a regular Python task instead, which should run locally.

Just unset the environment variable during registration and you should be good.
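
A minimal sketch of such a wrapper, to make the idea concrete. It does not import flytekit: `plain_task` and `gpu_pod_task` are hypothetical stand-ins for whatever regular-task and pod-task decorators you use, and `gpu_task` is an illustrative name, not part of Flyte. Only RUN_IN_LOCAL_MODE and the gpu_type label come from this thread.

```python
import os
from typing import Callable


def plain_task(fn: Callable) -> Callable:
    """Hypothetical stand-in for a regular Python task decorator."""
    fn.task_kind = "python"
    return fn


def gpu_pod_task(gpu_type: str) -> Callable[[Callable], Callable]:
    """Hypothetical stand-in for a pod task decorator that carries a node selector."""
    def decorate(fn: Callable) -> Callable:
        fn.task_kind = "pod"
        fn.node_selector = {"gpu_type": gpu_type}
        return fn
    return decorate


def gpu_task(gpu_type: str) -> Callable[[Callable], Callable]:
    """Choose the task type when the function is decorated.

    If RUN_IN_LOCAL_MODE is "true", fall back to the plain task so the
    function can run locally; otherwise build the pod task.
    """
    def decorate(fn: Callable) -> Callable:
        if os.environ.get("RUN_IN_LOCAL_MODE", "").lower() == "true":
            return plain_task(fn)
        return gpu_pod_task(gpu_type)(fn)
    return decorate
```

Note that the environment variable is read at decoration time (module import), so it has to be set before the module that defines the tasks is imported, and unset before registration.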

Basically, you define a different task type, inject your GPU node selector info into the custom attribute of the task template, and parse and apply it in the FlytePropeller plugin when building the pod.
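
The data flow described above could be sketched roughly as follows. This is an illustration only: a real backend plugin for FlytePropeller is written in Go, and both function names and the layout of the custom payload are assumptions made for this example. Pods are modeled as plain dicts, and `nodeSelector` is the Kubernetes pod spec field name.

```python
import copy
from typing import Any, Dict


def build_custom(node_selector: Dict[str, str]) -> Dict[str, Any]:
    """Registration side: the payload to store in the task template's custom attribute.

    The {"nodeSelector": {...}} layout is an assumption for this sketch.
    """
    return {"nodeSelector": dict(node_selector)}


def apply_custom_to_pod(pod_spec: Dict[str, Any], custom: Dict[str, Any]) -> Dict[str, Any]:
    """Backend side: parse the custom payload and merge its selector into the pod spec.

    Returns a patched copy; the input pod spec is left unchanged.
    """
    patched = copy.deepcopy(pod_spec)
    selector = dict(patched.get("nodeSelector") or {})
    selector.update(custom.get("nodeSelector", {}))
    if selector:
        patched["nodeSelector"] = selector
    return patched
```

For example, `apply_custom_to_pod({"containers": [{"name": "main"}]}, build_custom({"gpu_type": "example-card"}))` yields a pod spec whose `nodeSelector` contains the gpu_type label.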

This is not included in the standard task because Flyte has the concept of pod tasks, separate from container tasks.
Container tasks are meant to be simple: they run in containers and may need specific resources. This keeps assumptions about the environment to a minimum and leaves room for many optimizations in the future.
They are also more portable. For example, on AWS you can execute these on ECS instead of k8s (or AWS Batch), and also execute them in alternate container execution environments.
The moment you decide to select specific node pools, you expose your backend topology to the user, which implies they know a lot about your system and creates an awkward coupling. This prevents easy migrations, updates, etc.
However, we did end up realizing that in some cases it is desirable to control the execution environment, and hence we support pod tasks.
As for local execution, I think we can support it for some pod tasks. Only some, because if you need multiple containers, volumes, ports, etc. (pod-specific things), then executing locally is hard. That said, we can support simplified local execution and fail the execution if any advanced features are used.

This just needs an implementation of the local_execute method:

task.py

        raise _user_exceptions.FlyteUserException("Local execute is not currently supported for pod tasks")
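
One way the simplified local execution described above could look, as a sketch under stated assumptions rather than the flytekit implementation. The pod spec is modeled as a plain dict, `local_execute` is a free function with a made-up signature (not flytekit's method), and `LocalExecutionUnsupported` and `check_simple_pod` are hypothetical names.

```python
from typing import Any, Callable, Dict


class LocalExecutionUnsupported(Exception):
    """Raised when a pod uses features that cannot be emulated locally."""


def check_simple_pod(pod_spec: Dict[str, Any]) -> None:
    """Reject pod specs that rely on pod-specific features."""
    containers = pod_spec.get("containers", [])
    if len(containers) > 1:
        raise LocalExecutionUnsupported("multiple containers are not supported locally")
    if pod_spec.get("volumes"):
        raise LocalExecutionUnsupported("volumes are not supported locally")
    for container in containers:
        if container.get("ports"):
            raise LocalExecutionUnsupported("container ports are not supported locally")


def local_execute(fn: Callable[..., Any], pod_spec: Dict[str, Any], **kwargs: Any) -> Any:
    """Run the task function in-process if the pod spec is simple enough.

    Scheduling hints such as nodeSelector are ignored, since they have
    no meaning for a local run.
    """
    check_simple_pod(pod_spec)
    return fn(**kwargs)
```

With this shape, a task whose pod only adds a nodeSelector runs locally like a regular function, while a pod with a second container fails fast with a clear error.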

See also GitHub Issue #1328:
[Plugin][Pod] Support local execution of simple pod tasks

@jeevb
@kumare3
