-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[improvement](partial-update) make partial-update on agg table use less memory #32200
base: master
Are you sure you want to change the base?
Conversation
Thank you for your contribution to Apache Doris. |
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TeamCity be ut coverage result: |
@@ -1486,6 +1488,9 @@ public void setEnableLeftZigZag(boolean enableLeftZigZag) { | |||
@VariableMgr.VarAttr(name = ENABLE_UNIQUE_KEY_PARTIAL_UPDATE, needForward = true) | |||
public boolean enableUniqueKeyPartialUpdate = false; | |||
|
|||
@VariableMgr.VarAttr(name = ENABLE_AGG_KEY_PARTIAL_UPDATE, needForward = true) | |||
public boolean enableAggregateKeyPartialUpdate = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's different between enableUniqueKeyPartialUpdate
and enableAggregateKeyPartialUpdate
? why not use one variable to control them?
In LogicalPlanBuilder
and BindSink
, we cannot distinguish them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For users, this is a new feature that should be controlled by a separate variable. If only use one variable to control them, may be the variable ENABLE_UNIQUE_KEY_PARTIAL_UPDATE should be change to ENABLE_PARTIAL_UPDATE, our documentation needs to be updated accordingly, and it may cause compatibility issues for users as their codes also need to be changed accordingly.
clang-tidy review says "All clean, LGTM! 👍" |
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TeamCity be ut coverage result: |
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 37909 ms
|
TPC-DS: Total hot run time: 181604 ms
|
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
|
TeamCity be ut coverage result: |
clang-tidy review says "All clean, LGTM! 👍" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add regression cases for your patch
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
@zhannngchen update the patch, take a look please |
TPC-H: Total hot run time: 37847 ms
|
TPC-DS: Total hot run time: 182329 ms
|
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 38826 ms
|
TPC-DS: Total hot run time: 183192 ms
|
ClickBench: Total hot run time: 30.42 s
|
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
|
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 38566 ms
|
TPC-DS: Total hot run time: 181770 ms
|
ClickBench: Total hot run time: 29.27 s
|
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
|
...ion-test/suites/unique_with_mow_p0/partial_update/test_agg_partial_update_insert_into.groovy
Show resolved
Hide resolved
clang-tidy review says "All clean, LGTM! 👍" |
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TeamCity be ut coverage result: |
clang-tidy review says "All clean, LGTM! 👍" |
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 40477 ms
|
TPC-DS: Total hot run time: 172832 ms
|
ClickBench: Total hot run time: 30.9 s
|
Proposed changes
This pr make fe only send the specific column to be when using partial_columns= true header in streamload or set enable_agg_key_partial_update=true when using insert into on agg table, which can use less memory, in my test, it can reduce about 60% memory when insert 1000 columns table to 5000 columns table.
Issue Number: close #xxx
Further comments
If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...