PERF: Add explicit query compiler method for len/shape checks #7397
Labels
Interfaces and abstractions
Issues with Modin's QueryCompiler, Algebra, or BaseIO objects
new feature/request 💬
Requests and pull requests for new features
Performance 🚀
Performance related issues and pull requests.
Is your feature request related to a problem? Please describe.
Currently, calling `len(pd.DataFrame(...))` will materialize the frame's index and compute its length. Some storage formats (including pandas, via the `PandasDataFrame` object) have more efficient ways, or built-in caching mechanisms, for computing the dimensions of a frame. Adding an explicit query compiler method (`get_axis_len(axis: [0, 1]) -> int`) would let us take advantage of this. Accordingly, calls to `len(self.index)` in frontend code should be replaced with `len(self)`, and calls to `len(self.columns)` with `self._query_compiler.get_axis_len(1)`
to avoid unnecessary materialization.
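
To make the request concrete, here is a minimal, self-contained sketch of the idea. It does not use Modin's real classes: `ToyPartitionedFrame`, `ToyQueryCompiler`, and `ToyDataFrame` are hypothetical stand-ins, and the cached `row_lengths` / `column_widths` lists are an assumption about what a storage format might already track. The only name taken from this issue is `get_axis_len`.

```python
from typing import List


class ToyPartitionedFrame:
    """Hypothetical storage-format frame that caches per-partition sizes."""

    def __init__(self, row_lengths: List[int], column_widths: List[int]):
        self.row_lengths = row_lengths
        self.column_widths = column_widths
        # Counts how often the (simulated) expensive index build happens.
        self.index_materializations = 0

    @property
    def index(self) -> range:
        # Stand-in for materializing the full index.
        self.index_materializations += 1
        return range(sum(self.row_lengths))


class ToyQueryCompiler:
    """Hypothetical query compiler exposing the proposed method."""

    def __init__(self, frame: ToyPartitionedFrame):
        self.frame = frame

    def get_axis_len(self, axis: int) -> int:
        # Answer from cached partition sizes; the index is never built.
        if axis not in (0, 1):
            raise ValueError(f"axis must be 0 or 1, got {axis!r}")
        sizes = self.frame.row_lengths if axis == 0 else self.frame.column_widths
        return sum(sizes)


class ToyDataFrame:
    """Hypothetical frontend object that delegates length checks."""

    def __init__(self, query_compiler: ToyQueryCompiler):
        self._query_compiler = query_compiler

    @property
    def index(self) -> range:
        return self._query_compiler.frame.index

    def __len__(self) -> int:
        # Proposed path: no index materialization.
        return self._query_compiler.get_axis_len(0)
```

With this shape, `len(df)` is answered from cached sizes, while `len(df.index)` still pays for the index build, which is the difference the frontend replacements above are meant to exploit.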