Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runaway: Add processed_keys and request_ru to check #55147

Merged
merged 16 commits into from
Sep 24, 2024

Conversation

HuSharp
Copy link
Contributor

@HuSharp HuSharp commented Aug 2, 2024

What problem does this PR solve?

Issue Number: ref #54434, merged after pingcap/kvproto#1261 (review)

Problem Summary:

What changed and how does it work?

supports PROCESSED_KEYS and REQUEST_UNIT to check runaway

Explain

Can check doc https://github.com/pingcap/docs/pull/18858/files

Processed_keys: The number of keys that Coprocessor has processed. Compared with total_keys, processed_keys does not include the old versions of MVCC. A great difference between processed_keys and total_keys indicates that many old versions exist.

can check in Process_keys of slow query log

REQUEST_UNIT: The number of request units produced by the query. It's the total number of Request_unit_read and Request_unit_write.

can check in REQUEST_UNIT of statement fields desciption

Display

mysql> ALTER RESOURCE GROUP default QUERY_LIMIT=(EXEC_ELAPSED='2s', PROCESSED_KEYS=100, ACTION=KILL,  WATCH=EXACT DURATION='10m');
Query OK, 0 rows affected (0.06 sec)

mysql> SELECT * FROM information_schema.resource_groups;
+---------+------------+----------+-----------+--------------------------------------------------------------------------------+------------+
| NAME    | RU_PER_SEC | PRIORITY | BURSTABLE | QUERY_LIMIT                                                                    | BACKGROUND |
+---------+------------+----------+-----------+--------------------------------------------------------------------------------+------------+
| default | UNLIMITED  | MEDIUM   | YES       | EXEC_ELAPSED='2s', PROCESSED_KEYS=100, ACTION=KILL, WATCH=EXACT DURATION='10m0s' | NULL       |
+---------+------------+----------+-----------+--------------------------------------------------------------------------------+------------+
1 row in set (0.00 sec)

can check in log

[ERROR] [domain.go:1908] ["LoadSysVarCacheLoop failed"] [error="[executor:8253]Query execution was interrupted, identified as runaway query"]

check query result

mysql> select * from mysql.tidb_runaway_watch limit 3 \G;
*************************** 1. row ***************************
                 id: 2
resource_group_name: default
         start_time: 2024-09-06 08:54:01.886568
           end_time: 2024-09-06 09:04:01.886568
              watch: 1
         watch_text: SELECT variable_name, variable_value FROM mysql.global_variables
             source: 127.0.0.1:4000
             action: 3
  switch_group_name:
               rule: ProcessedKeys = 666(10)

mysql> SELECT * FROM mysql.tidb_RUNAWAY_queries limit 10 \G;
*************************** 1. row ***************************
resource_group_name: default
               time: 2024-09-06 17:17:42
         match_type: watch
             action: kill
       original_sql: SELECT variable_name, variable_value FROM mysql.global_variables
        plan_digest: e5003a13344e26b749329d8e96cae94e844a11bb7e027bd3ea5c2a39f7c07ee7
        tidb_server: 127.0.0.1:4000
               rule: ProcessedKeys = 666(10)

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Documentation

  • Contains experimental features

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

runaway: Add processed_keys and request_ru to check

Copy link

ti-chi-bot bot commented Aug 2, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 2, 2024
Copy link

tiprow bot commented Aug 2, 2024

Hi @HuSharp. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 12, 2024
@HuSharp HuSharp changed the title [DNM] runaway: add more params [DNM] runaway: Add processed_keys and request_ru to check Aug 16, 2024
@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Aug 16, 2024
@HuSharp HuSharp changed the title [DNM] runaway: Add processed_keys and request_ru to check runaway: Add processed_keys and request_ru to check Aug 16, 2024
@ti-chi-bot ti-chi-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 16, 2024
Copy link

ti-chi-bot bot commented Aug 16, 2024

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/needs-linked-issue labels Aug 16, 2024
@HuSharp
Copy link
Contributor Author

HuSharp commented Aug 21, 2024

/test all

Copy link

tiprow bot commented Aug 21, 2024

@HuSharp: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link

codecov bot commented Aug 21, 2024

Codecov Report

Attention: Patch coverage is 82.90909% with 47 lines in your changes missing coverage. Please review.

Project coverage is 75.8774%. Comparing base (e87affb) to head (30156c0).
Report is 1 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #55147        +/-   ##
================================================
+ Coverage   73.4116%   75.8774%   +2.4658%     
================================================
  Files          1601       1653        +52     
  Lines        444010     454562     +10552     
================================================
+ Hits         325955     344910     +18955     
+ Misses        98086      88121      -9965     
- Partials      19969      21531      +1562     
Flag Coverage Δ
integration 51.9031% <16.0000%> (?)
unit 72.5323% <82.9090%> (+0.0565%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9478% <ø> (ø)
parser ∅ <ø> (∅)
br 62.9171% <ø> (+17.0309%) ⬆️

@HuSharp
Copy link
Contributor Author

HuSharp commented Aug 22, 2024

/test all

Copy link

tiprow bot commented Aug 22, 2024

@HuSharp: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@HuSharp HuSharp marked this pull request as ready for review August 23, 2024 02:12
@ti-chi-bot ti-chi-bot bot removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 23, 2024
@HuSharp
Copy link
Contributor Author

HuSharp commented Sep 2, 2024

/retest

Signed-off-by: husharp <[email protected]>
@glorv
Copy link
Contributor

glorv commented Sep 14, 2024

/retest

@hawkingrei
Copy link
Member

/test all

@nolouch
Copy link
Member

nolouch commented Sep 19, 2024

/retest-required

Signed-off-by: husharp <[email protected]>
@HuSharp
Copy link
Contributor Author

HuSharp commented Sep 19, 2024

@glorv @JmPotato PTAL, thx~

@HuSharp
Copy link
Contributor Author

HuSharp commented Sep 19, 2024

/retest-required

pkg/parser/go.mod Show resolved Hide resolved
@@ -222,6 +222,8 @@ func SetDirectResourceGroupSettings(groupInfo *model.ResourceGroupInfo, opt *ast
case ast.ResourceGroupRunaway:
if len(opt.RunawayOptionList) == 0 {
resourceGroupSettings.Runaway = nil
} else {
resourceGroupSettings.Runaway = &model.ResourceGroupRunawaySettings{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part changed the behavior, for the old code, if resourceGroupSettings.Runaway is not nil, it won't be overwritten in SetDirectResourceGroupRunawayOption, but now it will be reset to an empty new one if len(opt.RunawayOptionList) > 0. Will this be a problem?

Copy link
Contributor Author

@HuSharp HuSharp Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, fixed. Yes, the new behavior has been discussed with PM.

@@ -915,6 +918,10 @@ func (sender *copIteratorTaskSender) run(connID uint64) {
if sender.respChan != nil {
close(sender.respChan)
}
if checker != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to check since ResetTotalProcessedKeys will do it also.

Copy link
Contributor Author

@HuSharp HuSharp Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct's checker field is of type RunawayChecker, which is an interface, but it hasn't been initialized with a instance. When calling checker.ResetTotalProcessedKeys(), the value of checker is nil, leading to a runtime error: invalid memory address or nil pointer dereference.

You can check this playground

Instead of the following approach, maybe we can judge in ResetTotalProcessedKeys?

if checker != nil {
    if rcChecker, ok := checker.(*Checker); ok && rcChecker != nil {
        rcChecker.ResetTotalProcessedKeys()
	}

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Sep 20, 2024
Copy link

ti-chi-bot bot commented Sep 20, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-09-12 09:52:49.453778086 +0000 UTC m=+522839.194202023: ☑️ agreed by nolouch.
  • 2024-09-20 07:22:20.585327261 +0000 UTC m=+1205010.325751200: ☑️ agreed by glorv.

@glorv
Copy link
Contributor

glorv commented Sep 20, 2024

@tangenta PTAL

@HuSharp
Copy link
Contributor Author

HuSharp commented Sep 20, 2024

/retest-required

Signed-off-by: husharp <[email protected]>
@glorv
Copy link
Contributor

glorv commented Sep 23, 2024

/retest

Signed-off-by: husharp <[email protected]>
Signed-off-by: husharp <[email protected]>
Copy link

@yudongusa yudongusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

ti-chi-bot bot commented Sep 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: glorv, nolouch, tangenta, yudongusa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Sep 24, 2024
@HuSharp
Copy link
Contributor Author

HuSharp commented Sep 24, 2024

/retest-required

@ti-chi-bot ti-chi-bot bot merged commit 19caf52 into pingcap:master Sep 24, 2024
25 checks passed
@HuSharp HuSharp deleted the add_more_param_in_runaway branch September 24, 2024 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants