[improvement](statistics)Reduce partition column sample BE memory consumption. (#41203) #41359
Thank you for your contribution to Apache Doris. Since 2024-03-18, the documentation has been moved to doris-website.
run buildall
TPC-H: Total hot run time: 49003 ms
TPC-DS: Total hot run time: 212359 ms
ClickBench: Total hot run time: 30.47 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
e4a4b2f to 79700f3
run buildall
TPC-H: Total hot run time: 48916 ms
TPC-DS: Total hot run time: 211911 ms
ClickBench: Total hot run time: 31.3 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
run external
…sumption. (apache#41203)
For string-type columns, use xxhash_64 to map each column value to a 64-bit integer, then calculate the NDV (number of distinct values) from the integer hash values. This reduces the memory cost of sample analyze and improves its performance. For example, for the l_comment column of the TPC-H 100G lineitem table, the memory needed to calculate the NDV drops from 22 GB to 8 GB.
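A minimal sketch of the idea in the commit message, not the actual Doris implementation: distinct values are counted over fixed-width 64-bit hashes instead of over the full strings, so the set being built holds 8-byte integers rather than long text values. The helper names (`hash64`, `ndv_from_strings`, `ndv_from_hashes`) and the sample data are hypothetical, and BLAKE2b truncated to 8 bytes is used only as a stand-in because xxhash_64 is not in the Python standard library.

```python
import hashlib


def hash64(value: str) -> int:
    """Stand-in 64-bit hash (assumption: BLAKE2b truncated to 8 bytes).

    This is NOT xxhash_64; it only plays the same role of mapping a
    string to a fixed-width 64-bit integer.
    """
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def ndv_from_strings(values):
    """Exact NDV: the set holds every distinct string in full."""
    return len(set(values))


def ndv_from_hashes(values):
    """NDV over hashes: the set holds one 64-bit integer per distinct value."""
    return len({hash64(v) for v in values})


# Hypothetical sample: 5000 rows drawn from 1000 distinct long-ish strings.
sample = [f"comment text number {i % 1000} with some padding" for i in range(5000)]

exact = ndv_from_strings(sample)
hashed = ndv_from_hashes(sample)
print(exact, hashed)
```

Counting over hashes can undercount slightly if two different strings collide, but with a 64-bit hash the chance of a collision is very small for realistic sample sizes, which is the trade-off that makes the memory saving worthwhile.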
79700f3 to 2790511
run buildall
TPC-H: Total hot run time: 49176 ms
TPC-DS: Total hot run time: 211951 ms
ClickBench: Total hot run time: 30.9 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
run p0
backport: #41203