Hello, I'm new to Flyte, and at the moment I'm trying to figure out whether Flyte is the right tool for my use-case. I would like to describe my use-case, and I hope to get enough information to make a decision. My environment is…
Use-Case
posted by @flashpixx in Slack
@flashpixx, firstly, thank you for considering Flyte and for such a detailed question. I will try to answer it by going through your subquestions and then diving into details for anything else.
This is perfectly valid and what Flyte is designed for. FlyteConsole uses the Flyte control-plane APIs to visualize an execution. Everything in Flyte is API-first and controlled by an API: all registrations are done through the API, all executions can be invoked from the API, and all visualization of execution status is driven by the API. These APIs are pretty stable, and we do not make backwards-incompatible changes, barring a major version release. Even when we do release…
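As a rough illustration, a thin client built directly on the control-plane API could look like the sketch below. It uses flytekit's FlyteRemote; the project and domain names are placeholders, and it assumes your Flyte config (endpoint, auth) is discoverable via the standard config file.

```python
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

# Connect to the Flyte control plane using standard config discovery
# (e.g. ~/.flyte/config.yaml or FLYTECTL_CONFIG).
remote = FlyteRemote(
    config=Config.auto(),
    default_project="flytesnacks",   # placeholder project
    default_domain="development",    # placeholder domain
)

# List recent executions and print their status - the same data FlyteConsole renders.
for execution in remote.recent_executions(limit=5):
    print(execution.id.name, execution.closure.phase)
```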
You should be able to build a UI on top of the API; in fact, some of our users, like Striveworks and Latch.bio, have already done that.
You can always customize the amount of CPU/memory required for a task.
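For example, a sketch of per-task resource requests and limits (the values are placeholders to tune for your cluster):

```python
from flytekit import Resources, task

# Request 2 CPUs / 4Gi for scheduling, cap at 4 CPUs / 16Gi.
@task(requests=Resources(cpu="2", mem="4Gi"), limits=Resources(cpu="4", mem="16Gi"))
def heavy_transform(n: int) -> int:
    # CPU/memory-intensive work goes here
    return n * 2
```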
Flyte has a type system that is well suited to unstructured datasets.
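For instance, unstructured data can flow between tasks as FlyteFile/FlyteDirectory values; a minimal sketch (the file-picking logic is just illustrative):

```python
import os

from flytekit import task
from flytekit.types.directory import FlyteDirectory
from flytekit.types.file import FlyteFile

# Flyte offloads the bytes to blob storage between tasks and hands the next
# task a reference that can be downloaded on demand.
@task
def pick_largest(raw: FlyteDirectory) -> FlyteFile:
    local_dir = raw.download()  # materialize the directory locally
    files = [os.path.join(local_dir, f) for f in os.listdir(local_dir)]
    return FlyteFile(max(files, key=os.path.getsize))
```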
Absolutely; we actually have no-code workflows created on top of Flyte. You can create a set of tasks that can be easily stitched together into a workflow. The only thing your client needs to do is…
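One way to do that stitching programmatically is flytekit's imperative workflow API. A minimal sketch (t1 stands in for any task your client wants to wire up):

```python
from flytekit import Workflow, task

@task
def t1(a: str) -> str:
    return a.upper()

# Build the workflow programmatically instead of with the @workflow decorator,
# which is how a UI or API client can stitch tasks together on the fly.
wf = Workflow(name="my.imperative.workflow")
wf.add_workflow_input("in1", str)
node = wf.add_entity(t1, a=wf.inputs["in1"])
wf.add_workflow_output("out1", node.outputs["o0"])
```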
Once registered, a workflow is immutable; you can change it, but the change is only registered as a new version. A workflow+version combination is what gets launched, and we perform type checking etc. at the API entry point.
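So a client launches one specific, immutable version, roughly like this (a sketch; the names, version, and inputs are placeholders):

```python
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

remote = FlyteRemote(
    config=Config.auto(),
    default_project="flytesnacks",   # placeholder project
    default_domain="development",    # placeholder domain
)

# Fetch an immutable workflow+version combination and launch it;
# the inputs are type-checked against the registered interface.
wf = remote.fetch_workflow(name="my.imperative.workflow", version="v1")
execution = remote.execute(wf, inputs={"in1": "hello"})
```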
This is what most tasks in Flyte do. Depending on the size, you can run it as a single-node job; depending on the size of your cluster and the maximum machine configuration, you can have one pod run with almost 1 TB of RAM and 60+ cores (check the large machine sizes on AWS/GCP). Flytekit (the Python SDK) offers a simplified API for data handling, even for Schema types (backed by Parquet).
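For example, a single large task returning a Schema-typed table might look like the sketch below (the resource numbers are placeholders; it assumes pandas is available in the task image):

```python
import pandas as pd
from flytekit import Resources, task
from flytekit.types.schema import FlyteSchema

# A single-node task with a large memory request; the returned DataFrame is
# serialized to Parquet and offloaded to blob storage as a Schema type.
@task(requests=Resources(cpu="16", mem="128Gi"))
def build_table(rows: int) -> FlyteSchema:
    return pd.DataFrame({"id": range(rows), "value": [r * 0.5 for r in range(rows)]})
```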
This is what Flyte is designed for. Every execution is completely isolated; in fact, every task can use a different version of Spark and a different version of Sedona, because executions are containerized. If you use Spark on Kubernetes, this is even more seamless and Flyte will manage everything for you. These are called ephemeral Spark clusters, and the version of Spark is actually defined by the user, since the user's container is converted into a Spark runner.
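A sketch of an ephemeral Spark cluster per task execution via the Spark plugin (assumes flytekitplugins-spark is installed and the Spark backend plugin is enabled; the conf values are placeholders):

```python
import flytekit
from flytekit import task
from flytekitplugins.spark import Spark

# Each execution of this task gets its own ephemeral Spark cluster on Kubernetes,
# running whatever Spark version is baked into this task's container image.
@task(
    task_config=Spark(
        spark_conf={
            "spark.executor.instances": "4",
            "spark.executor.memory": "4g",
            "spark.driver.memory": "2g",
        }
    )
)
def count_rows(path: str) -> int:
    sess = flytekit.current_context().spark_session
    return sess.read.parquet(path).count()
```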
By default, Flyte starts in multi-namespace mode, where each project-domain combination gets its own Kubernetes namespace.
Absolutely
Raw container tasks manage the data marshalling etc. for you, with no need to install Python, Java, etc. in your container.
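A sketch of a raw container task (the image and command are illustrative placeholders; Flyte exchanges inputs/outputs through the mounted data directories):

```python
from flytekit import ContainerTask, kwtypes

# The container only needs the script it runs - no Python/Java/Flyte SDK inside.
ellipse_area = ContainerTask(
    name="ellipse-area",
    image="ghcr.io/flyteorg/rawcontainers-shell:v2",  # illustrative image
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(a=float, b=float),
    outputs=kwtypes(area=float),
    command=["./calculate-ellipse-area.sh", "/var/inputs", "/var/outputs"],
)
```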