chore(ci): simplify the flask simple benchmark suite #8902

brettlangdon · 2024-04-08T20:11:27Z

This change aims to simplify the Flask simple benchmark suite by using the Flask test client instead of using gunicorn to spin up a subprocess server + requests to make http requests to the server.

The primary goal was to simplify the code/coordination needed for the test, and to make the test suite faster.

The downside is we are moving away from a theoretical "end user experience" latest measurement to more of a "worse case" since we are removing network and server latency from the equation. However, removing these pieces should give us more stable numbers since there are less moving pieces.

If we choose to adopt this new approach then the existing historical trends/measurements will no longer be comparable.

Checklist

Change(s) are motivated and described in the PR description
Testing strategy is described if automated tests are not included in the PR
Risks are described (performance impact, potential for breakage, maintainability)
Change is maintainable (easy to change, telemetry, documentation)
Library release note guidelines are followed or label changelog/no-changelog is set
Documentation is included (in-code, generated user docs, public corp docs)
Backport labels are set (if applicable)
If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

Title is accurate
All changes are related to the pull request's stated goal
Description motivates each change
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Change is maintainable (easy to change, telemetry, documentation)
Release note makes sense to a user of the library
Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

benchmarks/flask_simple/scenario.py

brettlangdon · 2024-04-08T20:26:17Z

This benchmark is just hanging in CI, because there isn't a trace agent available it is just spamming with "cannot send traces/profiles..." messages.

pr-commenter · 2024-04-08T21:19:38Z

Benchmarks

Benchmark execution time: 2024-04-30 18:12:59

Comparing candidate commit 3a91539 in PR branch brettlangdon/simplify.flask_simple with baseline commit 94cd2ee in branch main.

Found 29 performance improvements and 8 performance regressions! Performance is the same for 172 metrics, 9 unstable metrics.

scenario:coreapiscenario-context_with_data_listeners_and_all_listeners

🟩 max_rss_usage [-604.847KB; -529.745KB] or [-2.831%; -2.480%]

scenario:coreapiscenario-context_with_data_no_listeners

🟩 max_rss_usage [-759.330KB; -682.462KB] or [-3.556%; -3.196%]

scenario:coreapiscenario-core_dispatch_listeners

🟩 max_rss_usage [-719.782KB; -652.788KB] or [-3.376%; -3.062%]

scenario:coreapiscenario-core_dispatch_listeners_and_all_listeners

🟩 max_rss_usage [-701.343KB; -617.569KB] or [-3.291%; -2.898%]

scenario:coreapiscenario-core_dispatch_only_all_listeners

🟩 max_rss_usage [-764.968KB; -685.835KB] or [-3.583%; -3.213%]

scenario:coreapiscenario-core_dispatch_with_results_no_listeners

🟩 max_rss_usage [-692.239KB; -611.927KB] or [-3.249%; -2.872%]

scenario:coreapiscenario-core_dispatch_with_results_only_all_listeners

🟩 max_rss_usage [-662.199KB; -584.214KB] or [-3.095%; -2.731%]

scenario:coreapiscenario-get_item_exists

🟩 max_rss_usage [-672.146KB; -588.193KB] or [-3.155%; -2.761%]

scenario:coreapiscenario-get_item_missing

🟩 max_rss_usage [-686.483KB; -613.178KB] or [-3.222%; -2.878%]

scenario:coreapiscenario-set_item

🟩 max_rss_usage [-707.224KB; -628.481KB] or [-3.321%; -2.951%]

scenario:flasksimple-appsec-get

🟩 execution_time [-3.283ms; -3.236ms] or [-50.088%; -49.369%]
🟩 max_rss_usage [-19.399MB; -19.344MB] or [-36.407%; -36.305%]

scenario:flasksimple-appsec-post

🟩 execution_time [-2.717ms; -2.701ms] or [-48.909%; -48.613%]
🟩 max_rss_usage [-19.374MB; -19.306MB] or [-36.351%; -36.224%]

scenario:flasksimple-appsec-telemetry

🟩 execution_time [-3.260ms; -3.207ms] or [-49.877%; -49.061%]
🟩 max_rss_usage [-19.447MB; -19.392MB] or [-36.488%; -36.385%]

scenario:flasksimple-baseline

🟩 execution_time [-3.011ms; -2.996ms] or [-63.906%; -63.577%]
🟩 max_rss_usage [-14.133MB; -14.065MB] or [-30.774%; -30.627%]

scenario:flasksimple-debugger

🟩 execution_time [-3.393ms; -3.367ms] or [-65.934%; -65.428%]
🟩 max_rss_usage [-14.575MB; -14.516MB] or [-31.402%; -31.275%]

scenario:flasksimple-iast-get

🟩 execution_time [-2.995ms; -2.987ms] or [-63.770%; -63.597%]
🟩 max_rss_usage [-14.093MB; -14.022MB] or [-30.694%; -30.539%]

scenario:flasksimple-profiler

🟩 execution_time [-2.801ms; -2.790ms] or [-59.806%; -59.567%]
🟩 max_rss_usage [-11.381MB; -11.317MB] or [-24.778%; -24.638%]

scenario:flasksimple-tracer

🟩 execution_time [-3.273ms; -3.223ms] or [-50.029%; -49.255%]
🟩 max_rss_usage [-19.385MB; -19.320MB] or [-36.381%; -36.258%]

scenario:httppropagationinject-with_all

🟥 max_rss_usage [+725.461KB; +779.819KB] or [+3.531%; +3.796%]

scenario:httppropagationinject-with_priority_and_origin

🟩 max_rss_usage [-692.823KB; -635.101KB] or [-3.260%; -2.988%]

scenario:httppropagationinject-with_tags_invalid

🟥 max_rss_usage [+434.923KB; +589.896KB] or [+2.101%; +2.850%]

scenario:httppropagationinject-with_tags_max_size

🟩 max_rss_usage [-1.065MB; -0.960MB] or [-5.001%; -4.508%]

scenario:otelspan-start-finish

🟩 max_rss_usage [-772.751KB; -612.926KB] or [-3.431%; -2.722%]

scenario:sethttpmeta-no-collectipvariant

🟥 max_rss_usage [+716.935KB; +792.031KB] or [+3.435%; +3.794%]

scenario:sethttpmeta-obfuscation-no-query

🟥 max_rss_usage [+746.481KB; +817.372KB] or [+3.579%; +3.919%]

scenario:sethttpmeta-useragentvariant_not_exists_1

🟥 max_rss_usage [+698.485KB; +777.303KB] or [+3.345%; +3.723%]

scenario:tracer-large

🟥 max_rss_usage [+680.951KB; +758.793KB] or [+3.133%; +3.491%]

scenario:tracer-medium

🟥 max_rss_usage [+545.185KB; +620.946KB] or [+2.629%; +2.994%]

scenario:tracer-small

🟥 max_rss_usage [+674.065KB; +753.391KB] or [+3.269%; +3.653%]

brettlangdon · 2024-04-08T22:53:44Z

Benchmarks

Benchmark execution time: 2024-04-08 21:19:36

Comparing candidate commit 4b760b7 in PR branch brettlangdon/simplify.flask_simple with baseline commit 698021b in branch main.

Found 19 performance improvements and 1 performance regressions! Performance is the same for 181 metrics, 9 unstable metrics.

scenario:flasksimple-appsec-get

🟩 execution_time [-3.211ms; -3.157ms] or [-49.415%; -48.588%]

🟩 max_rss_usage [-19.409MB; -19.286MB] or [-36.203%; -35.974%]

scenario:flasksimple-appsec-post

🟩 execution_time [-2.677ms; -2.662ms] or [-48.522%; -48.241%]

🟩 max_rss_usage [-19.476MB; -19.413MB] or [-36.243%; -36.126%]

scenario:flasksimple-appsec-telemetry

🟩 execution_time [-3.220ms; -3.174ms] or [-49.410%; -48.715%]

🟩 max_rss_usage [-19.337MB; -19.217MB] or [-36.099%; -35.876%]

scenario:flasksimple-baseline

🟩 execution_time [-3.043ms; -3.030ms] or [-64.005%; -63.724%]

🟩 max_rss_usage [-13.806MB; -13.727MB] or [-29.846%; -29.676%]

scenario:flasksimple-debugger

🟩 execution_time [-4.268ms; -4.124ms] or [-67.675%; -65.397%]

🟩 max_rss_usage [-14.654MB; -14.586MB] or [-31.054%; -30.910%]

scenario:flasksimple-iast-get

🟩 execution_time [-3.035ms; -3.023ms] or [-63.961%; -63.712%]

🟩 max_rss_usage [-13.825MB; -13.754MB] or [-29.890%; -29.736%]

scenario:flasksimple-profiler

🟩 execution_time [-2.866ms; -2.853ms] or [-60.313%; -60.047%]

🟩 max_rss_usage [-11.129MB; -11.051MB] or [-24.072%; -23.903%]

scenario:flasksimple-tracer

🟩 execution_time [-2.978ms; -2.965ms] or [-47.389%; -47.173%]

🟩 max_rss_usage [-19.368MB; -19.245MB] or [-36.162%; -35.932%]

scenario:httppropagationextract-all_styles_all_headers

🟩 max_rss_usage [-799.792KB; -476.931KB] or [-3.695%; -2.203%]

scenario:httppropagationextract-large_header_no_matches

🟩 max_rss_usage [-706.669KB; -455.367KB] or [-3.251%; -2.095%]

scenario:httppropagationextract-none_propagation_style

🟩 max_rss_usage [-722.020KB; -622.696KB] or [-3.301%; -2.846%]

scenario:httppropagationextract-wsgi_invalid_trace_id_header

🟥 max_rss_usage [+571.452KB; +839.620KB] or [+2.745%; +4.033%]

This can mostly be ignored, we cannot compare this against the previous... nothing actually got faster other than the test itself.

The results from our benchmarking CI job is very different from what I get locally:

CI:

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
flasksimple-tracer	execution_time	3.272ms	3.313ms ± 0.018ms	3.314ms ± 0.013ms	3.326ms	3.340ms	3.350ms	3.352ms	1.17%	-0.028	-0.402	0.53%	0.002ms	20	60
flasksimple-baseline	execution_time	1.707ms	1.718ms ± 0.005ms	1.718ms ± 0.003ms	1.722ms	1.724ms	1.730ms	1.731ms	0.77%	-0.031	-0.058	0.31%	0.001ms	20	60
flasksimple-debugger	execution_time	1.864ms	2.110ms ± 0.282ms	1.949ms ± 0.070ms	2.502ms	2.508ms	2.514ms	2.517ms	29.11%	0.679	-1.499	13.27%	0.036ms	20	60
flasksimple-iast-get	execution_time	1.707ms	1.716ms ± 0.005ms	1.715ms ± 0.002ms	1.717ms	1.726ms	1.727ms	1.727ms	0.70%	0.731	-0.224	0.30%	0.001ms	20	60
flasksimple-profiler	execution_time	1.878ms	1.892ms ± 0.006ms	1.892ms ± 0.004ms	1.897ms	1.903ms	1.905ms	1.905ms	0.70%	0.144	-0.433	0.32%	0.001ms	20	60
flasksimple-appsec-get	execution_time	3.272ms	3.314ms ± 0.018ms	3.313ms ± 0.014ms	3.329ms	3.341ms	3.350ms	3.354ms	1.23%	-0.025	-0.507	0.53%	0.002ms	20	60
flasksimple-appsec-post	execution_time	2.818ms	2.848ms ± 0.014ms	2.848ms ± 0.010ms	2.858ms	2.872ms	2.878ms	2.879ms	1.07%	0.007	-0.535	0.50%	0.002ms	20	60
flasksimple-appsec-telemetry	execution_time	3.288ms	3.319ms ± 0.016ms	3.319ms ± 0.012ms	3.330ms	3.346ms	3.356ms	3.357ms	1.15%	0.242	-0.622	0.49%	0.002ms	20	60

Local:

❯ pyperf show 2.9.0.dev35+g698021b1b/results.json
flasksimple-baseline: Mean +- std dev: 552 us +- 14 us
flasksimple-tracer: Mean +- std dev: 1.12 ms +- 0.03 ms
flasksimple-profiler: Mean +- std dev: 684 us +- 13 us
flasksimple-debugger: Mean +- std dev: 564 us +- 17 us
flasksimple-iast-get: Mean +- std dev: 555 us +- 18 us
flasksimple-appsec-get: Mean +- std dev: 1.12 ms +- 0.02 ms
flasksimple-appsec-post: Mean +- std dev: 1.02 ms +- 0.02 ms
flasksimple-appsec-telemetry: Mean +- std dev: 1.11 ms +- 0.03 ms

emmettbutler · 2024-04-09T14:23:35Z

@brettlangdon let me know when this is ready for review. As far as I can tell from the scrollback, you're still working on it?

…unner (#8917) ## Checklist - [ ] Change(s) are motivated and described in the PR description - [ ] Testing strategy is described if automated tests are not included in the PR - [ ] Risks are described (performance impact, potential for breakage, maintainability) - [ ] Change is maintainable (easy to change, telemetry, documentation) - [ ] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set - [ ] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)) - [ ] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) - [ ] If this PR changes the public interface, I've notified `@DataDog/apm-tees`. ## Reviewer Checklist - [ ] Title is accurate - [ ] All changes are related to the pull request's stated goal - [ ] Description motivates each change - [ ] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - [ ] Testing strategy adequately addresses listed risks - [ ] Change is maintainable (easy to change, telemetry, documentation) - [ ] Release note makes sense to a user of the library - [ ] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - [ ] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

….flask_simple

brettlangdon · 2024-05-01T11:42:34Z

@emmettbutler this is now ready for review.

…ead of gunicorn (#9253) Follow up from #8902 to also convert the `flask_sqli` benchmark to using the Flask test client instead of gunicorn. This cuts the benchmark runtime from 25min to 5min. We should be testing the same code paths, but we avoid the need to spin up a subprocess/server and make network requests to it. We also refactored some common bits from both `flask_simple` and `flask_sqli` to ensure we are configuring them the same way. This will look like a performance improvement, but it isn't. It is the test itself getting faster. ## Checklist - [x] Change(s) are motivated and described in the PR description - [x] Testing strategy is described if automated tests are not included in the PR - [x] Risks are described (performance impact, potential for breakage, maintainability) - [x] Change is maintainable (easy to change, telemetry, documentation) - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)) - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) - [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`. ## Reviewer Checklist - [ ] Title is accurate - [ ] All changes are related to the pull request's stated goal - [ ] Description motivates each change - [ ] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - [ ] Testing strategy adequately addresses listed risks - [ ] Change is maintainable (easy to change, telemetry, documentation) - [ ] Release note makes sense to a user of the library - [ ] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - [ ] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

brettlangdon added 2 commits April 8, 2024 11:25

prototype: make the flask_simple benchmark more simple

22e8eb6

fix issue with flask.request span not closing

40d52d3

brettlangdon added the changelog/no-changelog A changelog entry is not required for this PR. label Apr 8, 2024

brettlangdon requested a review from a team as a code owner April 8, 2024 20:11

brettlangdon requested a review from erikayasuda April 8, 2024 20:11

brettlangdon added 2 commits April 8, 2024 16:11

Merge branch 'main' into brettlangdon/simplify.flask_simple

7866500

fix formatting

4b760b7

brettlangdon enabled auto-merge (squash) April 8, 2024 20:15

brettlangdon requested review from a team and juanjux and removed request for a team April 8, 2024 20:17

github-advanced-security bot found potential problems Apr 8, 2024

View reviewed changes

benchmarks/flask_simple/scenario.py Dismissed Show dismissed Hide dismissed

brettlangdon disabled auto-merge April 8, 2024 20:25

emmettbutler self-requested a review April 8, 2024 20:26

brettlangdon added 3 commits April 9, 2024 09:16

try running an agent in gitlab

d3da08f

try specifying services another way

8a44f40

use unix socket url

867ec8b

brettlangdon marked this pull request as draft April 9, 2024 15:03

ddyurchenko and others added 4 commits April 30, 2024 12:29

Merge remote-tracking branch 'origin/main' into brettlangdon/simplify…

4c9e30a

….flask_simple

whitespace cleanup

26bdeab

Merge remote-tracking branch 'origin/main' into brettlangdon/simplify…

3a91539

….flask_simple

brettlangdon marked this pull request as ready for review May 1, 2024 11:40

Merge branch 'main' into brettlangdon/simplify.flask_simple

bae7000

emmettbutler approved these changes May 2, 2024

View reviewed changes

brettlangdon enabled auto-merge (squash) May 3, 2024 14:01

Merge branch 'main' into brettlangdon/simplify.flask_simple

faea270

brettlangdon merged commit d837ff5 into main May 3, 2024
45 checks passed

brettlangdon deleted the brettlangdon/simplify.flask_simple branch May 3, 2024 14:04

brettlangdon mentioned this pull request May 13, 2024

chore(ci): convert flask_sqli benchmark to use Flask test client instead of gunicorn #9253

Merged

17 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(ci): simplify the flask simple benchmark suite #8902

chore(ci): simplify the flask simple benchmark suite #8902

brettlangdon commented Apr 8, 2024 •

edited by emmettbutler

Loading

brettlangdon commented Apr 8, 2024

pr-commenter bot commented Apr 8, 2024 •

edited

Loading

brettlangdon commented Apr 8, 2024

Benchmarks

scenario:flasksimple-appsec-get

scenario:flasksimple-appsec-post

scenario:flasksimple-appsec-telemetry

scenario:flasksimple-baseline

scenario:flasksimple-debugger

scenario:flasksimple-iast-get

scenario:flasksimple-profiler

scenario:flasksimple-tracer

scenario:httppropagationextract-all_styles_all_headers

scenario:httppropagationextract-large_header_no_matches

scenario:httppropagationextract-none_propagation_style

scenario:httppropagationextract-wsgi_invalid_trace_id_header

emmettbutler commented Apr 9, 2024

brettlangdon commented May 1, 2024

chore(ci): simplify the flask simple benchmark suite #8902

chore(ci): simplify the flask simple benchmark suite #8902

Conversation

brettlangdon commented Apr 8, 2024 • edited by emmettbutler Loading

Checklist

Reviewer Checklist

brettlangdon commented Apr 8, 2024

pr-commenter bot commented Apr 8, 2024 • edited Loading

Benchmarks

scenario:coreapiscenario-context_with_data_listeners_and_all_listeners

scenario:coreapiscenario-context_with_data_no_listeners

scenario:coreapiscenario-core_dispatch_listeners

scenario:coreapiscenario-core_dispatch_listeners_and_all_listeners

scenario:coreapiscenario-core_dispatch_only_all_listeners

scenario:coreapiscenario-core_dispatch_with_results_no_listeners

scenario:coreapiscenario-core_dispatch_with_results_only_all_listeners

scenario:coreapiscenario-get_item_exists

scenario:coreapiscenario-get_item_missing

scenario:coreapiscenario-set_item

scenario:flasksimple-appsec-get

scenario:flasksimple-appsec-post

scenario:flasksimple-appsec-telemetry

scenario:flasksimple-baseline

scenario:flasksimple-debugger

scenario:flasksimple-iast-get

scenario:flasksimple-profiler

scenario:flasksimple-tracer

scenario:httppropagationinject-with_all

scenario:httppropagationinject-with_priority_and_origin

scenario:httppropagationinject-with_tags_invalid

scenario:httppropagationinject-with_tags_max_size

scenario:otelspan-start-finish

scenario:sethttpmeta-no-collectipvariant

scenario:sethttpmeta-obfuscation-no-query

scenario:sethttpmeta-useragentvariant_not_exists_1

scenario:tracer-large

scenario:tracer-medium

scenario:tracer-small

brettlangdon commented Apr 8, 2024

Benchmarks

scenario:flasksimple-appsec-get

scenario:flasksimple-appsec-post

scenario:flasksimple-appsec-telemetry

scenario:flasksimple-baseline

scenario:flasksimple-debugger

scenario:flasksimple-iast-get

scenario:flasksimple-profiler

scenario:flasksimple-tracer

scenario:httppropagationextract-all_styles_all_headers

scenario:httppropagationextract-large_header_no_matches

scenario:httppropagationextract-none_propagation_style

scenario:httppropagationextract-wsgi_invalid_trace_id_header

emmettbutler commented Apr 9, 2024

brettlangdon commented May 1, 2024

brettlangdon commented Apr 8, 2024 •

edited by emmettbutler

Loading

pr-commenter bot commented Apr 8, 2024 •

edited

Loading