diff --git a/docs/user-guide/lazy/execution.md b/docs/user-guide/lazy/execution.md
index 894c181ef..47ce80235 100644
--- a/docs/user-guide/lazy/execution.md
+++ b/docs/user-guide/lazy/execution.md
@@ -38,6 +38,10 @@ Above we see that from the 10 million rows there are 14,029 rows that match our
 
 With the default `collect` method Polars processes all of your data as one batch. This means that all the data has to fit into your available memory at the point of peak memory usage in your query.
 
+!!! warning "Reusing `LazyFrame` objects"
+
+    Remember that a `LazyFrame` is a query plan, i.e. a promise of computation. This means that every time you reuse it in separate downstream queries after it is defined, it is computed all over again. If you define an operation on a `LazyFrame` that doesn't maintain row order (such as a `groupby`), then the order may also change every time it is run. To avoid this, pass `maintain_order=True` to such operations.
+
 ### Execution on larger-than-memory data
 
 If your data requires more memory than you have available Polars may be able to process the data in batches using *streaming* mode. To use streaming mode you simply pass the `streaming=True` argument to `collect`