[fix](mtmv) Fix select literal result wrongly in group by when use materialized view #38958

Merged: 6 commits, Aug 20, 2024
@@ -217,7 +217,7 @@ protected LogicalAggregate<Plan> doRewriteQueryByView(
LogicalAggregate<Plan> queryAggregate = queryTopPlanAndAggPair.value();
List<Expression> queryGroupByExpressions = queryAggregate.getGroupByExpressions();
// handle the case where the query top plan does not use all group by expressions of the query bottom aggregate
- if (queryGroupByExpressions.size() != queryTopPlanGroupBySet.size()) {
+ if (needCompensateGroupBy(queryTopPlanGroupBySet, queryGroupByExpressions)) {
for (Expression expression : queryGroupByExpressions) {
if (queryTopPlanGroupBySet.contains(expression)) {
continue;
@@ -266,6 +266,42 @@ protected LogicalAggregate<Plan> doRewriteQueryByView(
return new LogicalAggregate<>(finalGroupExpressions, finalOutputExpressions, tempRewritedPlan);
}

/**
* Handle the case where the query top plan does not use all group by expressions of the
* query bottom aggregate.
* For example, mv is select o_orderdate from orders group by o_orderdate;
* query is select 1 from orders group by o_orderdate.
* Or mv is select o_orderdate from orders group by o_orderdate;
* query is select o_orderdate from orders group by o_orderdate, o_orderkey.
* If the slots used by the query top project do not cover the slots used by the group by of the
* query bottom aggregate, the group by must be compensated to make sure the data is right.
*
* @param queryGroupByExpressions group by expressions of the query bottom aggregate,
*     such as o_orderdate, o_orderkey
* @param queryTopProject expressions used by the query top project, such as o_orderdate
* @return true if the group by needs to be compensated, false otherwise
*/
private static boolean needCompensateGroupBy(Set<? extends Expression> queryTopProject,
List<Expression> queryGroupByExpressions) {
Set<Expression> queryGroupByExpressionSet = new HashSet<>(queryGroupByExpressions);
if (queryGroupByExpressionSet.size() != queryTopProject.size()) {
return true;
}
Set<NamedExpression> queryTopPlanGroupByUseNamedExpressions = new HashSet<>();
Set<NamedExpression> queryGroupByUseNamedExpressions = new HashSet<>();
for (Expression expr : queryTopProject) {
queryTopPlanGroupByUseNamedExpressions.addAll(expr.collect(NamedExpression.class::isInstance));
}
for (Expression expr : queryGroupByExpressionSet) {
queryGroupByUseNamedExpressions.addAll(expr.collect(NamedExpression.class::isInstance));
}
// if the slots used by the query top project do not cover the slots used by the
// query bottom aggregate group by, the group by should be compensated.
return !queryTopPlanGroupByUseNamedExpressions.containsAll(queryGroupByUseNamedExpressions);
}

/**
* Try to rewrite query expression by view, contains both group by dimension and aggregate function
*/
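The coverage check in `needCompensateGroupBy` above can be tried in isolation. What follows is a minimal sketch and not the Doris implementation: each expression is modeled as a plain `Set<String>` of the column names it references, which stands in for `expr.collect(NamedExpression.class::isInstance)`. The class name `GroupByCompensationSketch` and the helper `expr` are hypothetical. It also assumes that a literal-only select such as `select 1` contributes an empty top-plan set.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GroupByCompensationSketch {

    // Hypothetical stand-in for an Expression: the set of slot names it references.
    static Set<String> expr(String... slots) {
        return new HashSet<>(Arrays.asList(slots));
    }

    // Same two steps as the method in the diff: compare sizes, then check slot coverage.
    static boolean needCompensateGroupBy(Set<Set<String>> queryTopProject,
            List<Set<String>> queryGroupByExpressions) {
        Set<Set<String>> groupBySet = new HashSet<>(queryGroupByExpressions);
        if (groupBySet.size() != queryTopProject.size()) {
            return true;
        }
        Set<String> topSlots = new HashSet<>();
        Set<String> groupBySlots = new HashSet<>();
        for (Set<String> e : queryTopProject) {
            topSlots.addAll(e);
        }
        for (Set<String> e : groupBySet) {
            groupBySlots.addAll(e);
        }
        // Compensate when the top plan slots do not cover the group by slots.
        return !topSlots.containsAll(groupBySlots);
    }

    public static void main(String[] args) {
        // select 1 from orders group by o_orderdate (assumed: empty top-plan set)
        System.out.println(needCompensateGroupBy(
                Collections.emptySet(),
                Collections.singletonList(expr("o_orderdate")))); // true

        // select o_orderdate from orders group by o_orderdate, o_orderkey
        System.out.println(needCompensateGroupBy(
                Collections.singleton(expr("o_orderdate")),
                Arrays.asList(expr("o_orderdate"), expr("o_orderkey")))); // true

        // select o_orderdate from orders group by o_orderdate
        System.out.println(needCompensateGroupBy(
                Collections.singleton(expr("o_orderdate")),
                Collections.singletonList(expr("o_orderdate")))); // false

        // Same sizes but different slots: the case the old size-only check missed
        System.out.println(needCompensateGroupBy(
                Collections.singleton(expr("o_orderdate")),
                Collections.singletonList(expr("o_orderkey")))); // true
    }
}
```

The last call is the one that separates the new check from the old size comparison: both sets hold one expression, so only the slot coverage test reports that compensation is needed.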
@@ -435,7 +471,12 @@ private static boolean isGroupByEqualsAfterEqualFilterEliminate(

/**
* Check whether the group by is equal after group by elimination by functional dependency
- * Such as query group by expression is (l_orderdate#1, l_supperkey#2)
+ * Such as query is select l_orderdate, l_supperkey, count(*) from table group by l_orderdate, l_supperkey;
+ * materialized view is select l_orderdate, l_supperkey, l_partkey, count(*) from table
+ * group by l_orderdate, l_supperkey, l_partkey;
+ * The check verifies whether the extra l_partkey can be eliminated by functional dependency.
+ * The processing steps and data are as follows:
+ * query group by expression is (l_orderdate#1, l_supperkey#2)
* materialized view group by expression is (l_orderdate#4, l_supperkey#5, l_partkey#6)
* materialized view expression mapping is
* {l_orderdate#4:l_orderdate#10, l_supperkey#5:l_supperkey#11, l_partkey#6:l_partkey#12}
@@ -102,10 +102,10 @@ PhysicalResultSink
--hashAgg[GLOBAL]
----hashAgg[LOCAL]
------hashJoin[INNER_JOIN] hashCondition=((t1.l_orderkey = orders.o_orderkey) and (t1.l_shipdate = orders.o_orderdate)) otherCondition=()
--------filter((orders.o_orderdate = '2023-12-09') and (orders.o_shippriority = 1) and (orders.o_totalprice = 11.50))
----------PhysicalOlapScan[orders]
--------filter((t1.l_shipdate = '2023-12-09'))
----------PhysicalOlapScan[lineitem]
--------filter((orders.o_orderdate = '2023-12-09') and (orders.o_shippriority = 1) and (orders.o_totalprice = 11.50))
----------PhysicalOlapScan[orders]

-- !query7_1_after --
yy 4 11.50 11.50 11.50 1