Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix](counters) fix MemoryUsage and PeakMemoryUsage counters of some operators #41602

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jacktengg
Copy link
Contributor

Proposed changes

Issue Number: close #xxx

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@jacktengg
Copy link
Contributor Author

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -425,8 +424,7 @@ PipelineXSinkLocalStateBase::PipelineXSinkLocalStateBase(DataSinkOperatorXBase*
}

PipelineXLocalStateBase::PipelineXLocalStateBase(RuntimeState* state, OperatorXBase* parent)
: _num_rows_returned(0),
_rows_returned_counter(nullptr),
: _rows_returned_counter(nullptr),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: member initializer for '_rows_returned_counter' is redundant [modernize-use-default-member-init]

Suggested change
: _rows_returned_counter(nullptr),
: ,

@jacktengg jacktengg force-pushed the fix-1009-counters branch 2 times, most recently from 69f6048 to 3fe3a78 Compare October 9, 2024 08:57
@jacktengg
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.27% (9631/25839)
Line Coverage: 28.67% (79885/278658)
Region Coverage: 28.10% (41285/146945)
Branch Coverage: 24.72% (21034/85098)
Coverage Report: http://coverage.selectdb-in.cc/coverage/3fe3a78a697e6c02916f02d62d419cc25696226f_3fe3a78a697e6c02916f02d62d419cc25696226f/report/index.html

data.get_buffer_size_in_bytes() -
Base::_shared_state->mem_usage_record.used_in_state);
Base::_shared_state->mem_usage_record.used_in_state;
Base::_mem_tracker->consume(arena_memory_usage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base 里的这个memtracker 还有用么?

Copy link
Contributor Author

@jacktengg jacktengg Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

有了MemoryUsageMemoryUsagePeak这两个counter,这个_mem_tracker感觉可以删掉了。

local_state._memory_used_counter->set(local_state._mem_tracker->consumption());
local_state._peak_memory_usage_counter->set(
local_state._mem_tracker->peak_consumption());
COUNTER_SET(local_state._build_blocks_memory_usage, (int64_t)new_block_mem_usage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

至少跟580 行一致

@@ -386,6 +389,8 @@ Status ExchangeSinkBuffer::_send_rpc(InstanceLoId id) {
}
}
if (request.block_holder->get_block()) {
_parent->memory_used_counter()->update(
-request.block_holder->get_block()->ByteSizeLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broadcast 的时候,是shared ptr,一个block 在多个channel 共享的,你这么统计,结果很大。
先别统计broadcast的了。

@@ -166,6 +166,7 @@ Status ExchangeSinkBuffer::add_block(TransmitInfo&& request) {
RETURN_IF_ERROR(
BeExecVersionManager::check_be_exec_version(request.block->be_exec_version()));
}
_parent->memory_used_counter()->update(request.block->ByteSizeLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我们 exchange sink operator:

  1. 先只统计非broad cast 方式的内存,统计方式跟现在这个PR的代码一样。
  2. channel 内保存的那一个block的内存。 broadcast方式的正好没有。
  3. 对于broadcast 方式的内存,可能是通过那个holder,构造和析构函数里计算,或者直接使用holder limiter的那个值。

@jacktengg jacktengg force-pushed the fix-1009-counters branch 3 times, most recently from 0b594f4 to 1255d5a Compare October 9, 2024 17:45
@jacktengg
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.29% (9636/25838)
Line Coverage: 28.67% (79883/278608)
Region Coverage: 28.08% (41312/147110)
Branch Coverage: 24.70% (21037/85154)
Coverage Report: http://coverage.selectdb-in.cc/coverage/1255d5a76e37f24d0201d9497666cf97c1bc00c4_1255d5a76e37f24d0201d9497666cf97c1bc00c4/report/index.html

@jacktengg jacktengg force-pushed the fix-1009-counters branch 2 times, most recently from 275528a to ac364bb Compare October 10, 2024 01:28
@jacktengg
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.29% (9636/25838)
Line Coverage: 28.69% (79921/278609)
Region Coverage: 28.08% (41314/147111)
Branch Coverage: 24.71% (21041/85154)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ac364bb5ecbb954f335bb8c10cb0b54d590120a9_ac364bb5ecbb954f335bb8c10cb0b54d590120a9/report/index.html

@jacktengg
Copy link
Contributor Author

run p0

Base::_mem_tracker->consume(arena_memory_usage);
_serialize_key_arena_memory_usage->add(arena_memory_usage);
Base::_shared_state->mem_usage_record.used_in_arena = _agg_arena_pool->size();
int64_t arena_memory_usage = _agg_arena_pool->size();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lost agg hash table?

@@ -109,7 +109,7 @@ class AggSinkLocalState : public PipelineXSinkLocalState<AggSharedState> {
RuntimeProfile::Counter* _max_row_size_counter = nullptr;
RuntimeProfile::Counter* _hash_table_memory_usage = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This counter is not updated?

@@ -458,8 +455,11 @@ Status ExchangeSinkOperatorX::sink(RuntimeState* state, vectorized::Block* block
auto rows = block->rows();
{
SCOPED_TIMER(local_state._split_block_hash_compute_timer);
RETURN_IF_ERROR(
local_state._partitioner->do_partitioning(state, block, _mem_tracker.get()));
RETURN_IF_ERROR(local_state._partitioner->do_partitioning(state, block, nullptr));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why add a nullptr not remove this parameter?

@@ -345,9 +345,8 @@ void PipelineXLocalStateBase::reached_limit(vectorized::Block* block, bool* eos)
});

if (auto rows = block->rows()) {
_num_rows_returned += rows;
COUNTER_UPDATE(_rows_returned_counter, rows);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里统一给加了,然后每个operator 内部也在加,难道没有错吗?

@@ -460,18 +460,16 @@ void AggLocalState::do_agg_limit(vectorized::Block* block, bool* eos) {
if (_shared_state->do_sort_limit && _shared_state->do_limit_filter(block, block->rows())) {
vectorized::Block::filter_block_internal(block, _shared_state->need_computes);
if (auto rows = block->rows()) {
_num_rows_returned += rows;
add_num_rows_returned(rows);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里也加,然后base 里也加?

@@ -161,7 +161,7 @@ Status PartitionedHashJoinSinkLocalState::_revoke_unpartitioned_block(RuntimeSta

{
SCOPED_TIMER(_partition_timer);
(void)_partitioner->do_partitioning(state, &sub_block, _mem_tracker.get());
(void)_partitioner->do_partitioning(state, &sub_block, nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的nullptr 有啥意义吗?

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jacktengg
Copy link
Contributor Author

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 10, 2024
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.40% (9659/25827)
Line Coverage: 28.67% (80126/279478)
Region Coverage: 28.09% (41425/147451)
Branch Coverage: 24.70% (21099/85418)
Coverage Report: http://coverage.selectdb-in.cc/coverage/f4bd3d8ff1bfbe778bd3aa4facf3cb924bae4669_f4bd3d8ff1bfbe778bd3aa4facf3cb924bae4669/report/index.html

@jacktengg
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.41% (9661/25827)
Line Coverage: 28.68% (80168/279478)
Region Coverage: 28.10% (41439/147449)
Branch Coverage: 24.71% (21109/85416)
Coverage Report: http://coverage.selectdb-in.cc/coverage/f4bd3d8ff1bfbe778bd3aa4facf3cb924bae4669_f4bd3d8ff1bfbe778bd3aa4facf3cb924bae4669/report/index.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants