This task uses pandas.DataFrame.groupby internally to apply the groupers defined by #72 and split an input dataframe into groups for processing.
It may be easiest to do this together with #72 since these features are closely linked.
Depending on the setting (local, cloud) and/or workflow configuration, this task may emit either the actual dataframe groups or references (URLs) to serialized groups (e.g. hive-partitioned parquet). The latter is demonstrated in #45.
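A minimal sketch of the two output modes, assuming the groupers are plain column names and the serialized form is one parquet file per hive-style directory. The names split_groups, hive_path, emit_groups, and the data.parquet file name are hypothetical and not taken from #72 or #45. Writing parquet additionally assumes pyarrow or fastparquet is installed.

```python
from pathlib import Path

import pandas as pd


def split_groups(df, groupers):
    """Split df into {key_tuple: group_df} via pandas groupby."""
    return {
        # Normalize keys to tuples; older pandas yields scalars for one grouper.
        (key if isinstance(key, tuple) else (key,)): group
        for key, group in df.groupby(groupers)
    }


def hive_path(root, groupers, key):
    """Build a hive-style path such as root/col=value/data.parquet."""
    parts = [f"{col}={val}" for col, val in zip(groupers, key)]
    return Path(root).joinpath(*parts, "data.parquet")


def emit_groups(df, groupers, root=None):
    """Return in-memory groups, or paths to serialized groups if root is set."""
    groups = split_groups(df, groupers)
    if root is None:
        return groups
    refs = {}
    for key, group in groups.items():
        path = hive_path(root, groupers, key)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Partition columns are encoded in the path, so drop them from the file.
        group.drop(columns=groupers).to_parquet(path)
        refs[key] = path.as_posix()
    return refs


# In-memory mode: no files are written.
df = pd.DataFrame({"species": ["a", "a", "b"], "value": [1, 2, 3]})
groups = emit_groups(df, ["species"])
```

Passing a root directory (for example emit_groups(df, ["species"], root="out")) would instead return a mapping from group key to file path. A cloud setting would presumably swap the local path for a bucket URL.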
We should also leave room for the eventual scenario in which the data may live in a backend that supports SQL queries (BigQuery, DuckDB, etc.), in which case the groupby may not need to happen in-memory. This is not relevant to our immediate concerns but, if there's a way to do so that doesn't take much extra time, it may be useful to make a nod to this idea somewhere in this implementation (a NotImplemented code path, etc.) so we keep it in mind.
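One possible shape for that placeholder, as a sketch only: a backend switch in which the in-memory pandas path works and the SQL path raises NotImplementedError. The function name split_groups_by_backend and the backend values "pandas" and "sql" are hypothetical.

```python
import pandas as pd


def split_groups_by_backend(data, groupers, backend="pandas"):
    """Split data into groups; only the in-memory pandas path is implemented."""
    if backend == "pandas":
        return [group for _, group in data.groupby(groupers)]
    if backend == "sql":
        # Placeholder for backends such as BigQuery or DuckDB, where the
        # grouping could be pushed down as a query instead of run in memory.
        raise NotImplementedError("SQL-backed groupby is not implemented yet")
    raise ValueError(f"Unknown backend: {backend!r}")


df = pd.DataFrame({"species": ["a", "a", "b"], "value": [1, 2, 3]})
in_memory_groups = split_groups_by_backend(df, ["species"])
```

Keeping the switch in the signature costs little now and marks where a query-based implementation would plug in later.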