[FEAT] DataFrame.__iter__() and .iter_partitions() (#1062)
Implements iteration, via streaming execution, over DataFrames:

- DataFrame.__iter__() returns an iterator of rows. Each row is a pydict of the form `{"colname": value }`.
- DataFrame.iter_partitions() returns an iterator of partitions. Each partition is a `daft.Table` object.

Execution semantics:

- Results are returned as soon as they become available.
- Current behaviours (not technical restrictions, we can change these if we want):
  - PyRunner: Execution pauses between calls to `iterator.next()`.
  - RayRunner: Execution continues in the background.

Implementation details:

- Adds new interfaces to Runner:
  - `run_iter() -> Iterator[PartitionT]` and
  - `run_iter_tables() -> Iterator[Table]`
  - in addition to the existing `Runner.run() -> PartitionCacheEntry`.

  This isn't super clean - ideally we go through a single point of abstraction (PartitionCache) for translating between PartitionT and Table. But we may rewrite a lot of this soon anyway, and for now it is a bit dangerous to shoehorn single-partition behaviour into a PartitionSet.
- `run_iter()` is now the new narrow waist. All execution, even `df.collect()`, now happens through streaming execution.

---------

Co-authored-by: Xiayue Charles Lin <[email protected]>
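The "narrow waist" idea and the PyRunner-style pause between `next()` calls can be illustrated with a small sketch. It is not Daft's implementation: `ToyRunner`, `Partition`, and `iter_rows` are hypothetical names invented here, and a partition is modelled as a plain dict of column lists rather than a `daft.Table`. The sketch only assumes what the commit message states: row iteration and full materialization both go through one streaming `run_iter()` entry point.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for a partition: column name -> list of values.
Partition = Dict[str, List[object]]


class ToyRunner:
    """Illustrative runner only; not Daft's actual Runner class."""

    def __init__(self, partitions: List[Partition]) -> None:
        self._partitions = partitions
        self.log: List[str] = []  # records when each partition is produced

    def run_iter(self) -> Iterator[Partition]:
        # Generator-based streaming: the work for partition i happens only
        # when the consumer asks for it, so execution pauses between next() calls.
        for i, part in enumerate(self._partitions):
            self.log.append(f"computed partition {i}")
            yield part

    def run(self) -> List[Partition]:
        # Full materialization goes through the same streaming path.
        return list(self.run_iter())


def iter_rows(runner: ToyRunner) -> Iterator[Dict[str, object]]:
    # Flatten each partition into row dicts of the form {"colname": value}.
    for part in runner.run_iter():
        columns = list(part.keys())
        num_rows = len(part[columns[0]]) if columns else 0
        for r in range(num_rows):
            yield {name: part[name][r] for name in columns}


runner = ToyRunner([{"a": [1, 2], "b": ["x", "y"]}, {"a": [3], "b": ["z"]}])
rows = iter_rows(runner)
first = next(rows)                  # only partition 0 has been produced so far
log_after_first = list(runner.log)  # snapshot of the log at this point
remaining = list(rows)              # draining the iterator produces partition 1
```

Because `run_iter()` is a generator, nothing beyond the first partition is computed until the consumer pulls more rows, which mirrors the pausing behaviour described for PyRunner. A runner that keeps executing in the background, as described for RayRunner, would instead need to produce partitions independently of the consumer (for example via a worker feeding a queue); that variant is not shown here.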
1 parent 1c41ad8, commit eb7e2c8
Showing 5 changed files with 251 additions and 22 deletions.