[fix](hash_join) uninited hash table probe caused by short circuit #32901

Merged: 1 commit merged into apache:branch-2.0 on Mar 27, 2024


mrhhsg
Member

@mrhhsg mrhhsg commented Mar 27, 2024

Proposed changes

F0325 16:25:38.738495 375719 vhash_join_node.cpp:659] FATAL: uninited hash table probe
*** Check failure stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:417
 1# 0x00007F990B380B50 in /lib64/libc.so.6
 2# __GI_raise in /lib64/libc.so.6
 3# __GI_abort in /lib64/libc.so.6
 4# 0x0000559E725FC5B9 in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
 5# 0x0000559E725F1BCD in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
 6# google::LogMessage::SendToLog() in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
 7# google::LogMessage::Flush() in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
 8# google::LogMessageFatal::~LogMessageFatal() in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
 9# _ZNSt8__detail9__variant17__gen_vtable_implINS0_12_Multi_arrayIPFNS0_21__deduce_visit_resultIvEEOZN5doris10vectorized12HashJoinNode4pullEPNS5_12RuntimeStateEPNS6_5BlockEPbE3$_0RSt7variantIJSt9monostateNS6_26SerializedHashTableContextINS6_10RowRefListEEENS6_27PrimaryTypeHashTableContextIhSI_EENSK_ItSI_EENSK_IjSI_EENSK_ImSI_EENSK_INS6_7UInt128ESI_EENSK_INS6_7UInt256ESI_EENS6_24FixedKeyHashTableContextImLb1ESI_EENST_ImLb0ESI_EENST_ISP_Lb1ESI_EENST_ISP_Lb0ESI_EENST_ISR_Lb1ESI_EENST_ISR_Lb0ESI_EENSH_INS6_18RowRefListWithFlagEEENSK_IhS10_EENSK_ItS10_EENSK_IjS10_EENSK_ImS10_EENSK_ISP_S10_EENSK_ISR_S10_EENST_ImLb1ES10_EENST_ImLb0ES10_EENST_ISP_Lb1ES10_EENST_ISP_Lb0ES10_EENST_ISR_Lb1ES10_EENST_ISR_Lb0ES10_EENSH_INS6_19RowRefListWithFlagsEEENSK_IhS1E_EENSK_ItS1E_EENSK_IjS1E_EENSK_ImS1E_EENSK_ISP_S1E_EENSK_ISR_S1E_EENST_ImLb1ES1E_EENST_ImLb0ES1E_EENST_ISP_Lb1ES1E_EENST_ISP_Lb0ES1E_EENST_ISR_Lb1ES1E_EENST_ISR_Lb0ES1E_EEEERSF_IJSG_NS6_21ProcessHashTableProbeILi0EEENS1U_ILi2EEENS1U_ILi8EEENS1U_ILi1EEENS1U_ILi4EEENS1U_ILi3EEENS1U_ILi5EEENS1U_ILi7EEENS1U_ILi9EEENS1U_ILi10EEEEEOSF_IJSt17integral_constantIbLb0EES27_IbLb1EEEES2B_EJEEESt16integer_sequenceImJLm1ELm0ELm1ELm1EEEE14__visit_invokeESE_S1T_S26_S2B_S2B_ at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1013
10# doris::vectorized::HashJoinNode::pull(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/join/vhash_join_node.cpp:628
11# doris::vectorized::HashJoinNode::get_next(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/vec/exec/join/vhash_join_node.cpp:796
12# std::_Function_handler<doris::Status (doris::RuntimeState*, doris::vectorized::Block*, bool*), std::_Bind<doris::Status (doris::ExecNode::*(doris::ExecNode*, std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(doris::RuntimeState*, doris::vectorized::Block*, bool*)> >::_M_invoke(std::_Any_data const&, doris::RuntimeState*&&, doris::vectorized::Block*&&, bool*&&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
13# doris::ExecNode::get_next_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*, std::function<doris::Status (doris::RuntimeState*, doris::vectorized::Block*, bool*)> const&, bool) at /root/doris/be/src/exec/exec_node.cpp:590
14# doris::PlanFragmentExecutor::get_vectorized_internal(doris::vectorized::Block*, bool*) at /root/doris/be/src/runtime/plan_fragment_executor.cpp:353
15# doris::PlanFragmentExecutor::open_vectorized_internal() in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
16# doris::PlanFragmentExecutor::open() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:262
17# doris::FragmentExecState::execute() at /root/doris/be/src/runtime/fragment_mgr.cpp:265
18# doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:538
19# std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
20# doris::ThreadPool::dispatch_thread() in /root/doris/apache-doris-2.0-bin-x64/be/lib/doris_be
21# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:499
22# start_thread in /lib64/libpthread.so.0
23# __clone in /lib64/libc.so.6
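Frame 9 of the trace is a `std::visit` dispatch inside `HashJoinNode::pull`, and the variants it visits list `std::monostate` as their first alternative. The fatal message therefore appears to come from visiting a probe variant that was never assigned a real probe object. The following is a minimal sketch of that failure mode and of one plausible guard. Every name in it (`JoinSketch`, `Int32Probe`, `finish_build`, `pull_unchecked`) is hypothetical, and it is neither the Doris implementation nor the change merged in this PR.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a typed hash table probe helper.
struct Int32Probe {
    std::vector<int> build_keys;
    bool contains(int key) const {
        for (int k : build_keys) {
            if (k == key) return true;
        }
        return false;
    }
};

// std::monostate plays the role of "probe not initialised yet".
using ProbeVariant = std::variant<std::monostate, Int32Probe>;

struct JoinSketch {
    ProbeVariant probe;          // starts out holding std::monostate
    bool short_circuit = false;  // assumed flag: the join can finish early

    // Assumption for illustration: when the short circuit is taken,
    // the probe helper is never constructed.
    void finish_build(std::vector<int> keys, bool can_short_circuit) {
        short_circuit = can_short_circuit;
        if (short_circuit) return;
        probe = Int32Probe{std::move(keys)};
    }

    // Unguarded path: visiting the monostate alternative is an error.
    // The sketch throws where the real code logs a fatal message.
    bool pull_unchecked(int key) const {
        return std::visit(
            [&](const auto& p) -> bool {
                using T = std::decay_t<decltype(p)>;
                if constexpr (std::is_same_v<T, std::monostate>) {
                    throw std::logic_error("uninited hash table probe");
                } else {
                    return p.contains(key);
                }
            },
            probe);
    }

    // Guarded path: return early (no result) before touching the variant.
    std::optional<bool> pull(int key) const {
        if (short_circuit) return std::nullopt;
        return pull_unchecked(key);
    }
};
```

In this sketch, calling `pull_unchecked` after a short-circuited build throws, while `pull` returns an empty `std::optional` without visiting the variant at all.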

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@mrhhsg
Member Author

mrhhsg commented Mar 27, 2024

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.80% (8046/21285)
Line Coverage: 29.47% (65714/223016)
Region Coverage: 28.93% (33815/116903)
Branch Coverage: 24.78% (17365/70066)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d96c0af9e72d0eae79999d837b664974c90b93d7_d96c0af9e72d0eae79999d837b664974c90b93d7/report/index.html

@doris-robot

TPC-H: Total hot run time: 50314 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d96c0af9e72d0eae79999d837b664974c90b93d7, data reload: false

------ Round 1 ----------------------------------
query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
q1	17638	4458	4417	4417
q2	2075	152	150	150
q3	10404	1941	1940	1940
q4	10178	1299	1330	1299
q5	8433	3949	4006	3949
q6	231	123	123	123
q7	2061	1593	1602	1593
q8	9308	2756	2744	2744
q9	11037	10661	10530	10530
q10	8662	3505	3535	3505
q11	418	241	244	241
q12	466	302	305	302
q13	18338	3987	4045	3987
q14	351	333	333	333
q15	499	451	462	451
q16	682	594	591	591
q17	1146	1006	998	998
q18	7279	6844	6771	6771
q19	1695	1581	1570	1570
q20	548	309	326	309
q21	4518	4142	4116	4116
q22	498	402	395	395
Total cold run time: 116465 ms
Total hot run time: 50314 ms
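The bot does not label the columns, but the arithmetic is consistent with each row holding one cold run followed by two hot runs and the faster of the two. The cold total is the sum of the first timing column, and the hot total is the sum of the per-query minimum of the hot runs. Here is a minimal sketch of that calculation, assuming this column reading; the `Row`, `Totals`, `summarize`, and `sample_rows` names are illustrative and not part of the Doris tooling.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One benchmark row: a cold run followed by two hot runs (milliseconds).
struct Row {
    std::string query;
    long cold_ms;
    long hot1_ms;
    long hot2_ms;
};

struct Totals {
    long cold_ms = 0;
    long hot_ms = 0;
};

// Cold total sums the cold runs; hot total sums the faster hot run per query.
Totals summarize(const std::vector<Row>& rows) {
    Totals t;
    for (const Row& r : rows) {
        t.cold_ms += r.cold_ms;
        t.hot_ms += std::min(r.hot1_ms, r.hot2_ms);
    }
    return t;
}

// Three rows copied from Round 1 above (q1, q2, q6).
std::vector<Row> sample_rows() {
    return {
        {"q1", 17638, 4458, 4417},
        {"q2", 2075, 152, 150},
        {"q6", 231, 123, 123},
    };
}
```

For the three sample rows, `summarize(sample_rows())` gives a cold total of 19944 ms and a hot total of 4690 ms. Applied to all 22 rows of Round 1, the same rule gives the reported 116465 ms and 50314 ms.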

----- Round 2, with runtime_filter_mode=off -----
query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
q1	4355	4308	4288	4288
q2	315	226	222	222
q3	4153	4120	4116	4116
q4	2763	2739	2764	2739
q5	7286	7201	7227	7201
q6	239	125	116	116
q7	3258	2853	2834	2834
q8	4348	4435	4442	4435
q9	17144	17078	17103	17078
q10	4260	4265	4282	4265
q11	745	681	671	671
q12	1017	853	849	849
q13	6827	3748	3720	3720
q14	449	410	432	410
q15	505	460	450	450
q16	748	735	691	691
q17	3781	3745	3836	3745
q18	8912	8764	8739	8739
q19	1719	1709	1653	1653
q20	2408	2197	2137	2137
q21	8400	8561	8487	8487
q22	1014	956	952	952
Total cold run time: 84646 ms
Total hot run time: 79798 ms

@doris-robot

TPC-DS: Total hot run time: 200644 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d96c0af9e72d0eae79999d837b664974c90b93d7, data reload: false

query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
query1	923	399	377	377
query2	6520	2159	2055	2055
query3	6918	201	205	201
query4	20185	18114	18027	18027
query5	19715	6497	6515	6497
query6	267	211	225	211
query7	4179	295	296	295
query8	262	267	252	252
query9	3174	2707	2645	2645
query10	411	290	311	290
query11	11251	10705	10629	10629
query12	115	79	70	70
query13	5587	630	626	626
query14	17674	13301	13434	13301
query15	356	227	240	227
query16	6471	267	255	255
query17	1741	1451	858	858
query18	2326	409	411	409
query19	199	141	148	141
query20	73	77	73	73
query21	186	90	94	90
query22	5172	4908	4955	4908
query23	32607	31787	31845	31787
query24	6984	6520	6485	6485
query25	519	438	408	408
query26	523	160	152	152
query27	1901	299	301	299
query28	6058	2261	2250	2250
query29	2891	2796	2856	2796
query30	238	159	165	159
query31	893	735	770	735
query32	67	61	60	60
query33	402	263	243	243
query34	866	471	486	471
query35	1109	940	883	883
query36	1326	1253	1107	1107
query37	85	62	60	60
query38	3065	2867	2921	2867
query39	1366	1310	1322	1310
query40	192	92	93	92
query41	34	32	31	31
query42	80	81	87	81
query43	658	638	633	633
query44	1122	702	720	702
query45	238	228	225	225
query46	1226	985	984	984
query47	1738	1676	1799	1676
query48	977	686	671	671
query49	611	360	350	350
query50	870	647	601	601
query51	4798	4665	4704	4665
query52	81	77	75	75
query53	446	324	334	324
query54	2632	2465	2450	2450
query55	87	72	77	72
query56	211	197	199	197
query57	1345	1075	1098	1075
query58	207	197	189	189
query59	3573	3250	3469	3250
query60	203	180	204	180
query61	82	78	81	78
query62	893	513	538	513
query63	481	349	338	338
query64	2348	1485	1445	1445
query65	3622	3579	3543	3543
query66	820	372	358	358
query67	16220	15250	16498	15250
query68	8989	669	676	669
query69	558	331	359	331
query70	1570	1482	1306	1306
query71	424	290	307	290
query72	6414	3387	3388	3387
query73	738	315	317	315
query74	6211	5919	5843	5843
query75	5361	3675	3751	3675
query76	5607	1153	1239	1153
query77	949	247	245	245
query78	12559	11347	11478	11347
query79	10196	664	651	651
query80	1451	378	382	378
query81	488	233	237	233
query82	1695	100	95	95
query83	170	141	125	125
query84	257	66	68	66
query85	845	276	274	274
query86	325	283	294	283
query87	3199	2986	3024	2986
query88	5112	2289	2296	2289
query89	477	282	298	282
query90	1971	210	204	204
query91	154	116	126	116
query92	60	52	56	52
query93	7170	553	583	553
query94	692	199	204	199
query95	1107	1062	1044	1044
query96	629	331	322	322
query97	6430	6372	6315	6315
query98	185	166	167	166
query99	2954	928	893	893
Total cold run time: 315347 ms
Total hot run time: 200644 ms

@doris-robot

ClickBench: Total hot run time: 30.32 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d96c0af9e72d0eae79999d837b664974c90b93d7, data reload: false

query	cold_s	hot1_s	hot2_s
query1	0.02	0.02	0.02
query2	0.07	0.02	0.02
query3	0.24	0.05	0.04
query4	1.79	0.08	0.08
query5	0.52	0.52	0.53
query6	1.24	0.63	0.61
query7	0.02	0.01	0.00
query8	0.03	0.02	0.03
query9	0.51	0.47	0.48
query10	0.54	0.53	0.52
query11	0.12	0.09	0.08
query12	0.11	0.08	0.08
query13	0.62	0.61	0.62
query14	0.78	0.79	0.76
query15	0.79	0.76	0.76
query16	0.36	0.35	0.36
query17	1.01	1.00	0.96
query18	0.21	0.27	0.24
query19	1.93	1.76	1.83
query20	0.01	0.01	0.01
query21	15.47	0.58	0.55
query22	2.08	2.43	1.51
query23	17.33	1.00	0.98
query24	3.83	1.02	2.91
query25	0.38	0.05	0.08
query26	0.65	0.16	0.15
query27	0.04	0.03	0.04
query28	7.35	0.72	0.70
query29	12.61	2.10	2.36
query30	0.56	0.53	0.55
query31	2.82	0.39	0.38
query32	3.37	0.50	0.49
query33	3.10	3.07	3.07
query34	15.26	4.78	4.79
query35	4.84	4.85	4.82
query36	1.04	1.00	1.03
query37	0.06	0.04	0.05
query38	0.04	0.02	0.03
query39	0.02	0.01	0.01
query40	0.16	0.14	0.14
query41	0.07	0.01	0.01
query42	0.02	0.02	0.01
query43	0.03	0.02	0.02
Total cold run time: 102.05 s
Total hot run time: 30.32 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit d96c0af9e72d0eae79999d837b664974c90b93d7 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       20.9 seconds inserted 10000000 Rows, about 478K ops/s

@yiguolei yiguolei merged commit d933046 into apache:branch-2.0 Mar 27, 2024
23 of 27 checks passed
3 participants