[CH] AQE cannot coalesce partitions when Exchange hash partitioning contains non-attribute expressions #3486
Comments
@exmy Thanks for noticing this issue. Could you explain how much this issue affects performance?
cc @PHILO-HE
We've recently noticed that this issue also prevents AQE's OptimizeSkewedJoin rule from taking effect, which has a significant impact on performance. If OptimizeSkewedJoin is disabled due to this issue:
@exmy Thanks for the profiling. On which query can we reproduce the performance result? Does that mean we shouldn't add an extra project before the exchange?
Sorry, this issue is specific to the CH backend. It's not present in the Velox backend.
The Velox backend doesn't have this issue, because it adds a pre-project to calculate the hash value but doesn't change the shuffle hash expressions in the ShuffleExchange operator. I have tested it. @rui-mo
@exmy Thanks for checking!
Backend
CH (ClickHouse)
Bug description
A pre-project operator is added before the exchange operator when hash partitioning involves non-attribute expressions. As a result, AQE cannot coalesce shuffle partitions.
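To make the difference between the two backends concrete, here is a toy model in plain Python. It is only a sketch and does not use Spark or Gluten code. The expression `id + 1`, the column name `_pre_0`, and both function names are hypothetical. The model assumes, following the comments in this thread, that one strategy replaces the exchange's hash keys with the pre-projected attribute, while the other computes the value in a pre-project but leaves the declared hash expressions unchanged.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]
ORIGINAL_KEY = "id + 1"   # hypothetical non-attribute hash expression
PRE_COLUMN = "_pre_0"     # hypothetical name of the pre-projected column


def pre_project(rows: List[Row]) -> List[Row]:
    """Evaluate the expression once and append it as an extra column."""
    return [{**row, PRE_COLUMN: row["id"] + 1} for row in rows]


def plan_rewriting_keys(rows: List[Row], n: int) -> Tuple[Tuple[str, ...], List[int]]:
    """Strategy A: the exchange declares the pre-projected attribute as its key."""
    projected = pre_project(rows)
    declared = (PRE_COLUMN,)
    return declared, [r[PRE_COLUMN] % n for r in projected]


def plan_keeping_keys(rows: List[Row], n: int) -> Tuple[Tuple[str, ...], List[int]]:
    """Strategy B: the value is pre-computed, but the exchange still
    declares the original expression as its key."""
    projected = pre_project(rows)
    declared = (ORIGINAL_KEY,)
    return declared, [r[PRE_COLUMN] % n for r in projected]


rows = [{"id": i} for i in range(8)]
rewritten_keys, rewritten_parts = plan_rewriting_keys(rows, 4)
kept_keys, kept_parts = plan_keeping_keys(rows, 4)

# Rows land in the same partitions under both strategies ...
print(rewritten_parts == kept_parts)
# ... but only strategy B still exposes the original expression.
print(rewritten_keys, kept_keys)
```

In this model the data placement is identical, and the only observable difference is what the exchange reports as its partitioning keys. Whether that difference is exactly what blocks AQE in the CH backend is not stated in this issue, so treat the model as an illustration rather than a diagnosis.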
Spark version
None
Spark configurations
No response
System information
No response
Relevant logs
No response