Decouple ray submitter, worker, and head resources #2924

Open: Sovietaced wants to merge 3 commits into base: master

@Sovietaced commented Nov 13, 2024

Tracking issue

Related to flyteorg/flyte#5666

Why are the changes needed?

These changes update the flytekit-ray package such that users can configure pod specs for worker and head workloads.

What changes were proposed in this pull request?

Update the version of flyteidl and plumb k8s pods through the worker and head configs.
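
As a rough illustration of what "plumbing a pod through the head and worker config" can look like, here is a minimal, self-contained sketch. `K8sPodSketch`, `NodeConfigSketch`, and `group_spec` are hypothetical names invented for this example; they are not the flytekit-ray or flyteidl API, and the dictionary layout is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class K8sPodSketch:
    """Hypothetical stand-in for a pod model: labels plus a raw pod spec."""

    labels: Dict[str, str] = field(default_factory=dict)
    pod_spec: Dict[str, Any] = field(default_factory=dict)


@dataclass
class NodeConfigSketch:
    """Hypothetical head or worker config; the pod is optional."""

    ray_start_params: Dict[str, str] = field(default_factory=dict)
    k8s_pod: Optional[K8sPodSketch] = None


def group_spec(cfg: Optional[NodeConfigSketch]) -> Optional[Dict[str, Any]]:
    """Return a plain-dict group spec, or None when no config was given."""
    if cfg is None:
        return None
    spec: Dict[str, Any] = {"ray_start_params": dict(cfg.ray_start_params)}
    # Only attach the pod when the user actually configured one.
    if cfg.k8s_pod is not None:
        spec["k8s_pod"] = {
            "metadata": {"labels": dict(cfg.k8s_pod.labels)},
            "pod_spec": cfg.k8s_pod.pod_spec,
        }
    return spec


head = NodeConfigSketch(
    ray_start_params={"num-cpus": "0"},
    k8s_pod=K8sPodSketch(
        labels={"team": "ml"},
        pod_spec={"containers": [{"name": "ray-head", "resources": {"requests": {"cpu": "2"}}}]},
    ),
)
head_spec = group_spec(head)
missing_spec = group_spec(None)
```

The `None` guard mirrors the existing `if cfg.head_node_config else None` pattern, so a task without a head or worker config serializes the same way as before.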

How was this patch tested?

See unit tests.

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

flyteorg/flyte#5933

@Sovietaced changed the title from "wip" to "Decouple ray submitter, worker, and head resources" Nov 13, 2024
codecov bot commented Nov 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.07%. Comparing base (3f0ab84) to head (2db498c).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2924      +/-   ##
==========================================
+ Coverage   76.33%   79.07%   +2.73%     
==========================================
  Files         199      199              
  Lines       20840    20840              
  Branches     2681     2681              
==========================================
+ Hits        15908    16479     +571     
+ Misses       4214     3622     -592     
- Partials      718      739      +21     


from flytekit.models import common as _common


class K8sObjectMetadata(_common.FlyteIdlEntity):

@Sovietaced (Author) commented Nov 13, 2024

I copied these from flytekit/models/task.py instead of reusing them. I'm happy to just reuse the other models, but I wasn't sure whether we would want to simplify these for the ray use case or keep them decoupled so there are no unintended regressions.
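
To make the trade-off concrete, here is a minimal sketch of what a decoupled, ray-only metadata model could look like. `MetadataSketch` is a hypothetical class written for this illustration, and the assumption that the model carries only labels and annotations is mine; it does not subclass `FlyteIdlEntity` or reproduce the code in this PR.

```python
from typing import Dict, Optional


class MetadataSketch:
    """Hypothetical, self-contained metadata model with labels and annotations."""

    def __init__(
        self,
        labels: Optional[Dict[str, str]] = None,
        annotations: Optional[Dict[str, str]] = None,
    ):
        # Copy the inputs so later caller-side mutation does not leak in.
        self._labels = dict(labels or {})
        self._annotations = dict(annotations or {})

    @property
    def labels(self) -> Dict[str, str]:
        return dict(self._labels)

    @property
    def annotations(self) -> Dict[str, str]:
        return dict(self._annotations)

    def to_dict(self) -> Dict[str, Dict[str, str]]:
        return {"labels": self.labels, "annotations": self.annotations}

    @classmethod
    def from_dict(cls, data: Dict[str, Dict[str, str]]) -> "MetadataSketch":
        return cls(labels=data.get("labels"), annotations=data.get("annotations"))


original = MetadataSketch(labels={"app": "ray"}, annotations={"owner": "ml-platform"})
restored = MetadataSketch.from_dict(original.to_dict())
```

A copy like this can change independently of the task models, at the cost of maintaining two near-identical classes.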

Signed-off-by: Jason Parraga <[email protected]>

  @property
  def ray_start_params(self):
      """
-     The ray start params of worker node group.
+     The ray start params of head node group.

Author commented:

lil typo

self,
metadata: K8sObjectMetadata = None,
pod_spec: typing.Dict[str, typing.Any] = None,
data_config: typing.Optional[DataLoadingConfig] = None,

Contributor commented:

Drop this. Also, why add this K8s pod? We don't need that, as the Ray config will simply float like JSON. So just use the K8s pod object?

Contributor commented:

We can keep the k8s pod too; I see that you are actually setting the pod properties.
This way we could also use a pod template?

Author commented:

Yeah, we had a discussion about this on the backend change here: flyteorg/flyte#5933 (comment)

It started as just adding support for resources, but we realized it would be more flexible if we added support for something similar to a pod template, since that is ultimately what the KubeRay contract is.
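
A small sketch of why a pod-template-like input is more flexible than resources alone: pod-level fields such as `nodeSelector` and `tolerations` live outside a container's resource requests, so they need a fuller pod spec to express. The `merge_pod_spec` helper below is hypothetical and only illustrates one possible overlay strategy; it is not how Flyte or KubeRay actually merges pod templates.

```python
import copy
from typing import Any, Dict


def merge_pod_spec(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay `override` on `base`.

    Nested dicts are merged key by key; every other value, lists included,
    is replaced wholesale. Neither input is mutated.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_pod_spec(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical platform defaults for worker pods.
default_worker = {
    "nodeSelector": {"pool": "cpu", "zone": "a"},
    "tolerations": [],
}
# Hypothetical user-supplied overrides for a GPU worker group.
user_worker = {
    "nodeSelector": {"pool": "gpu"},
    "tolerations": [{"key": "nvidia.com/gpu", "operator": "Exists"}],
}
worker_spec = merge_pod_spec(default_worker, user_worker)
```

Replacing lists wholesale keeps the helper simple; a real implementation would need to decide how to merge containers and tolerations by name or key.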

@@ -89,7 +92,9 @@ def get_custom(self, settings: SerializationSettings) -> Optional[Dict[str, Any]
  ray_job = RayJob(
      ray_cluster=RayCluster(
          head_group_spec=(
-             HeadGroupSpec(cfg.head_node_config.ray_start_params) if cfg.head_node_config else None
+             HeadGroupSpec(cfg.head_node_config.ray_start_params, cfg.head_node_config.k8s_pod)

Contributor commented:

IMO this is not great.
cc @EngHabu @pingsutw
I would have loved for us to model it more like JSON, so that modifying it would be faster and would not need a protobuf change.

But I see what you are doing now.
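
For context on the JSON-style alternative mentioned here, the following is a minimal sketch using only the standard library. The helper names and the `head`/`pod_spec` layout are assumptions made for illustration; nothing here reflects the actual flyteidl schema.

```python
import json
from typing import Any, Dict


def serialize_custom(config: Dict[str, Any]) -> str:
    """Serialize a free-form config; keys are sorted so output is deterministic."""
    return json.dumps(config, sort_keys=True)


def deserialize_custom(payload: str) -> Dict[str, Any]:
    """Parse the payload back into a plain dict without any schema check."""
    return json.loads(payload)


# Hypothetical first version of the config.
v1 = {"head": {"ray_start_params": {"num-cpus": "0"}}}

# A later version adds a field; no schema or generated code has to change.
v2 = {
    "head": {
        "ray_start_params": {"num-cpus": "0"},
        "pod_spec": {"nodeSelector": {"pool": "gpu"}},
    }
}

round_tripped = deserialize_custom(serialize_custom(v2))
```

The flexibility comes with a trade-off: a free-form payload is not validated at serialization time, so a misspelled key would pass through silently, whereas a typed message would reject it.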

Signed-off-by: Jason Parraga <[email protected]>