-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decouple ray submitter, worker, and head resources #2924
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Jason Parraga <[email protected]>
5e8306a
to
c621359
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2924 +/- ##
==========================================
+ Coverage 76.33% 79.07% +2.73%
==========================================
Files 199 199
Lines 20840 20840
Branches 2681 2681
==========================================
+ Hits 15908 16479 +571
+ Misses 4214 3622 -592
- Partials 718 739 +21 ☔ View full report in Codecov by Sentry. |
|
||
from flytekit.models import common as _common | ||
|
||
|
||
class K8sObjectMetadata(_common.FlyteIdlEntity): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took these from flytekit/models/task.py
instead of reusing them. I'm happy to just reuse the other models but I wasn't sure if we would want to simplify these for the ray use case or ensure they are decoupled so there are no unintended regressions.
Signed-off-by: Jason Parraga <[email protected]>
|
||
@property | ||
def ray_start_params(self): | ||
""" | ||
The ray start params of worker node group. | ||
The ray start params of head node group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lil typo
self, | ||
metadata: K8sObjectMetadata = None, | ||
pod_spec: typing.Dict[str, typing.Any] = None, | ||
data_config: typing.Optional[DataLoadingConfig] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop this. Also why add this K8s pod. We dont need that, as Ray config will simply float like a json. So just use K8s pod object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can keep the k8s pod too, i see that you are actually setting the pod properties.
This way we could also use pod template?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we had a discussion about this on the backend change here: flyteorg/flyte#5933 (comment)
It started as just adding support for resources but we realized it would be more flexible if we added support for something similar to a pod template since that is ultimately what the kuberay contract is.
@@ -89,7 +92,9 @@ def get_custom(self, settings: SerializationSettings) -> Optional[Dict[str, Any] | |||
ray_job = RayJob( | |||
ray_cluster=RayCluster( | |||
head_group_spec=( | |||
HeadGroupSpec(cfg.head_node_config.ray_start_params) if cfg.head_node_config else None | |||
HeadGroupSpec(cfg.head_node_config.ray_start_params, cfg.head_node_config.k8s_pod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Jason Parraga <[email protected]>
Tracking issue
Related to flyteorg/flyte#5666
Why are the changes needed?
These changes update the flytekit-ray package such that users can configure pod specs for worker and head workloads.
What changes were proposed in this pull request?
Updating the version of flyteidl and plumb k8s pods through worker and head config.
How was this patch tested?
See unit tests.
Check all the applicable boxes
Related PRs
flyteorg/flyte#5933