Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(ci): simplify the flask simple benchmark suite #8902

Merged
merged 13 commits into from
May 3, 2024

Conversation

brettlangdon
Copy link
Member

@brettlangdon brettlangdon commented Apr 8, 2024

This change aims to simplify the Flask simple benchmark suite by using the Flask test client instead of using gunicorn to spin up a subprocess server + requests to make http requests to the server.

The primary goal was to simplify the code/coordination needed for the test, and to make the test suite faster.

The downside is we are moving away from a theoretical "end user experience" latest measurement to more of a "worse case" since we are removing network and server latency from the equation. However, removing these pieces should give us more stable numbers since there are less moving pieces.

If we choose to adopt this new approach then the existing historical trends/measurements will no longer be comparable.

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@brettlangdon brettlangdon added the changelog/no-changelog A changelog entry is not required for this PR. label Apr 8, 2024
@brettlangdon brettlangdon requested a review from a team as a code owner April 8, 2024 20:11
@brettlangdon brettlangdon enabled auto-merge (squash) April 8, 2024 20:15
@brettlangdon brettlangdon requested review from a team and juanjux and removed request for a team April 8, 2024 20:17
benchmarks/flask_simple/scenario.py Dismissed Show dismissed Hide dismissed
@brettlangdon
Copy link
Member Author

This benchmark is just hanging in CI, because there isn't a trace agent available it is just spamming with "cannot send traces/profiles..." messages.

@emmettbutler emmettbutler self-requested a review April 8, 2024 20:26
@pr-commenter
Copy link

pr-commenter bot commented Apr 8, 2024

Benchmarks

Benchmark execution time: 2024-04-30 18:12:59

Comparing candidate commit 3a91539 in PR branch brettlangdon/simplify.flask_simple with baseline commit 94cd2ee in branch main.

Found 29 performance improvements and 8 performance regressions! Performance is the same for 172 metrics, 9 unstable metrics.

scenario:coreapiscenario-context_with_data_listeners_and_all_listeners

  • 🟩 max_rss_usage [-604.847KB; -529.745KB] or [-2.831%; -2.480%]

scenario:coreapiscenario-context_with_data_no_listeners

  • 🟩 max_rss_usage [-759.330KB; -682.462KB] or [-3.556%; -3.196%]

scenario:coreapiscenario-core_dispatch_listeners

  • 🟩 max_rss_usage [-719.782KB; -652.788KB] or [-3.376%; -3.062%]

scenario:coreapiscenario-core_dispatch_listeners_and_all_listeners

  • 🟩 max_rss_usage [-701.343KB; -617.569KB] or [-3.291%; -2.898%]

scenario:coreapiscenario-core_dispatch_only_all_listeners

  • 🟩 max_rss_usage [-764.968KB; -685.835KB] or [-3.583%; -3.213%]

scenario:coreapiscenario-core_dispatch_with_results_no_listeners

  • 🟩 max_rss_usage [-692.239KB; -611.927KB] or [-3.249%; -2.872%]

scenario:coreapiscenario-core_dispatch_with_results_only_all_listeners

  • 🟩 max_rss_usage [-662.199KB; -584.214KB] or [-3.095%; -2.731%]

scenario:coreapiscenario-get_item_exists

  • 🟩 max_rss_usage [-672.146KB; -588.193KB] or [-3.155%; -2.761%]

scenario:coreapiscenario-get_item_missing

  • 🟩 max_rss_usage [-686.483KB; -613.178KB] or [-3.222%; -2.878%]

scenario:coreapiscenario-set_item

  • 🟩 max_rss_usage [-707.224KB; -628.481KB] or [-3.321%; -2.951%]

scenario:flasksimple-appsec-get

  • 🟩 execution_time [-3.283ms; -3.236ms] or [-50.088%; -49.369%]
  • 🟩 max_rss_usage [-19.399MB; -19.344MB] or [-36.407%; -36.305%]

scenario:flasksimple-appsec-post

  • 🟩 execution_time [-2.717ms; -2.701ms] or [-48.909%; -48.613%]
  • 🟩 max_rss_usage [-19.374MB; -19.306MB] or [-36.351%; -36.224%]

scenario:flasksimple-appsec-telemetry

  • 🟩 execution_time [-3.260ms; -3.207ms] or [-49.877%; -49.061%]
  • 🟩 max_rss_usage [-19.447MB; -19.392MB] or [-36.488%; -36.385%]

scenario:flasksimple-baseline

  • 🟩 execution_time [-3.011ms; -2.996ms] or [-63.906%; -63.577%]
  • 🟩 max_rss_usage [-14.133MB; -14.065MB] or [-30.774%; -30.627%]

scenario:flasksimple-debugger

  • 🟩 execution_time [-3.393ms; -3.367ms] or [-65.934%; -65.428%]
  • 🟩 max_rss_usage [-14.575MB; -14.516MB] or [-31.402%; -31.275%]

scenario:flasksimple-iast-get

  • 🟩 execution_time [-2.995ms; -2.987ms] or [-63.770%; -63.597%]
  • 🟩 max_rss_usage [-14.093MB; -14.022MB] or [-30.694%; -30.539%]

scenario:flasksimple-profiler

  • 🟩 execution_time [-2.801ms; -2.790ms] or [-59.806%; -59.567%]
  • 🟩 max_rss_usage [-11.381MB; -11.317MB] or [-24.778%; -24.638%]

scenario:flasksimple-tracer

  • 🟩 execution_time [-3.273ms; -3.223ms] or [-50.029%; -49.255%]
  • 🟩 max_rss_usage [-19.385MB; -19.320MB] or [-36.381%; -36.258%]

scenario:httppropagationinject-with_all

  • 🟥 max_rss_usage [+725.461KB; +779.819KB] or [+3.531%; +3.796%]

scenario:httppropagationinject-with_priority_and_origin

  • 🟩 max_rss_usage [-692.823KB; -635.101KB] or [-3.260%; -2.988%]

scenario:httppropagationinject-with_tags_invalid

  • 🟥 max_rss_usage [+434.923KB; +589.896KB] or [+2.101%; +2.850%]

scenario:httppropagationinject-with_tags_max_size

  • 🟩 max_rss_usage [-1.065MB; -0.960MB] or [-5.001%; -4.508%]

scenario:otelspan-start-finish

  • 🟩 max_rss_usage [-772.751KB; -612.926KB] or [-3.431%; -2.722%]

scenario:sethttpmeta-no-collectipvariant

  • 🟥 max_rss_usage [+716.935KB; +792.031KB] or [+3.435%; +3.794%]

scenario:sethttpmeta-obfuscation-no-query

  • 🟥 max_rss_usage [+746.481KB; +817.372KB] or [+3.579%; +3.919%]

scenario:sethttpmeta-useragentvariant_not_exists_1

  • 🟥 max_rss_usage [+698.485KB; +777.303KB] or [+3.345%; +3.723%]

scenario:tracer-large

  • 🟥 max_rss_usage [+680.951KB; +758.793KB] or [+3.133%; +3.491%]

scenario:tracer-medium

  • 🟥 max_rss_usage [+545.185KB; +620.946KB] or [+2.629%; +2.994%]

scenario:tracer-small

  • 🟥 max_rss_usage [+674.065KB; +753.391KB] or [+3.269%; +3.653%]

@brettlangdon
Copy link
Member Author

Benchmarks

Benchmark execution time: 2024-04-08 21:19:36

Comparing candidate commit 4b760b7 in PR branch brettlangdon/simplify.flask_simple with baseline commit 698021b in branch main.

Found 19 performance improvements and 1 performance regressions! Performance is the same for 181 metrics, 9 unstable metrics.

scenario:flasksimple-appsec-get

  • 🟩 execution_time [-3.211ms; -3.157ms] or [-49.415%; -48.588%]
  • 🟩 max_rss_usage [-19.409MB; -19.286MB] or [-36.203%; -35.974%]

scenario:flasksimple-appsec-post

  • 🟩 execution_time [-2.677ms; -2.662ms] or [-48.522%; -48.241%]
  • 🟩 max_rss_usage [-19.476MB; -19.413MB] or [-36.243%; -36.126%]

scenario:flasksimple-appsec-telemetry

  • 🟩 execution_time [-3.220ms; -3.174ms] or [-49.410%; -48.715%]
  • 🟩 max_rss_usage [-19.337MB; -19.217MB] or [-36.099%; -35.876%]

scenario:flasksimple-baseline

  • 🟩 execution_time [-3.043ms; -3.030ms] or [-64.005%; -63.724%]
  • 🟩 max_rss_usage [-13.806MB; -13.727MB] or [-29.846%; -29.676%]

scenario:flasksimple-debugger

  • 🟩 execution_time [-4.268ms; -4.124ms] or [-67.675%; -65.397%]
  • 🟩 max_rss_usage [-14.654MB; -14.586MB] or [-31.054%; -30.910%]

scenario:flasksimple-iast-get

  • 🟩 execution_time [-3.035ms; -3.023ms] or [-63.961%; -63.712%]
  • 🟩 max_rss_usage [-13.825MB; -13.754MB] or [-29.890%; -29.736%]

scenario:flasksimple-profiler

  • 🟩 execution_time [-2.866ms; -2.853ms] or [-60.313%; -60.047%]
  • 🟩 max_rss_usage [-11.129MB; -11.051MB] or [-24.072%; -23.903%]

scenario:flasksimple-tracer

  • 🟩 execution_time [-2.978ms; -2.965ms] or [-47.389%; -47.173%]
  • 🟩 max_rss_usage [-19.368MB; -19.245MB] or [-36.162%; -35.932%]

scenario:httppropagationextract-all_styles_all_headers

  • 🟩 max_rss_usage [-799.792KB; -476.931KB] or [-3.695%; -2.203%]

scenario:httppropagationextract-large_header_no_matches

  • 🟩 max_rss_usage [-706.669KB; -455.367KB] or [-3.251%; -2.095%]

scenario:httppropagationextract-none_propagation_style

  • 🟩 max_rss_usage [-722.020KB; -622.696KB] or [-3.301%; -2.846%]

scenario:httppropagationextract-wsgi_invalid_trace_id_header

  • 🟥 max_rss_usage [+571.452KB; +839.620KB] or [+2.745%; +4.033%]

This can mostly be ignored, we cannot compare this against the previous... nothing actually got faster other than the test itself.

The results from our benchmarking CI job is very different from what I get locally:

CI:

scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
flasksimple-tracer execution_time 3.272ms 3.313ms ± 0.018ms 3.314ms ± 0.013ms 3.326ms 3.340ms 3.350ms 3.352ms 1.17% -0.028 -0.402 0.53% 0.002ms 20 60
flasksimple-baseline execution_time 1.707ms 1.718ms ± 0.005ms 1.718ms ± 0.003ms 1.722ms 1.724ms 1.730ms 1.731ms 0.77% -0.031 -0.058 0.31% 0.001ms 20 60
flasksimple-debugger execution_time 1.864ms 2.110ms ± 0.282ms 1.949ms ± 0.070ms 2.502ms 2.508ms 2.514ms 2.517ms 29.11% 0.679 -1.499 13.27% 0.036ms 20 60
flasksimple-iast-get execution_time 1.707ms 1.716ms ± 0.005ms 1.715ms ± 0.002ms 1.717ms 1.726ms 1.727ms 1.727ms 0.70% 0.731 -0.224 0.30% 0.001ms 20 60
flasksimple-profiler execution_time 1.878ms 1.892ms ± 0.006ms 1.892ms ± 0.004ms 1.897ms 1.903ms 1.905ms 1.905ms 0.70% 0.144 -0.433 0.32% 0.001ms 20 60
flasksimple-appsec-get execution_time 3.272ms 3.314ms ± 0.018ms 3.313ms ± 0.014ms 3.329ms 3.341ms 3.350ms 3.354ms 1.23% -0.025 -0.507 0.53% 0.002ms 20 60
flasksimple-appsec-post execution_time 2.818ms 2.848ms ± 0.014ms 2.848ms ± 0.010ms 2.858ms 2.872ms 2.878ms 2.879ms 1.07% 0.007 -0.535 0.50% 0.002ms 20 60
flasksimple-appsec-telemetry execution_time 3.288ms 3.319ms ± 0.016ms 3.319ms ± 0.012ms 3.330ms 3.346ms 3.356ms 3.357ms 1.15% 0.242 -0.622 0.49% 0.002ms 20 60

Local:

❯ pyperf show 2.9.0.dev35+g698021b1b/results.json
flasksimple-baseline: Mean +- std dev: 552 us +- 14 us
flasksimple-tracer: Mean +- std dev: 1.12 ms +- 0.03 ms
flasksimple-profiler: Mean +- std dev: 684 us +- 13 us
flasksimple-debugger: Mean +- std dev: 564 us +- 17 us
flasksimple-iast-get: Mean +- std dev: 555 us +- 18 us
flasksimple-appsec-get: Mean +- std dev: 1.12 ms +- 0.02 ms
flasksimple-appsec-post: Mean +- std dev: 1.02 ms +- 0.02 ms
flasksimple-appsec-telemetry: Mean +- std dev: 1.11 ms +- 0.03 ms

@emmettbutler
Copy link
Collaborator

@brettlangdon let me know when this is ready for review. As far as I can tell from the scrollback, you're still working on it?

@brettlangdon brettlangdon marked this pull request as draft April 9, 2024 15:03
ddyurchenko and others added 4 commits April 30, 2024 12:29
…unner (#8917)

## Checklist

- [ ] Change(s) are motivated and described in the PR description
- [ ] Testing strategy is described if automated tests are not included
in the PR
- [ ] Risks are described (performance impact, potential for breakage,
maintainability)
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [ ] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [ ] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [ ] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
@brettlangdon brettlangdon marked this pull request as ready for review May 1, 2024 11:40
@brettlangdon
Copy link
Member Author

@emmettbutler this is now ready for review.

@brettlangdon brettlangdon enabled auto-merge (squash) May 3, 2024 14:01
@brettlangdon brettlangdon merged commit d837ff5 into main May 3, 2024
45 checks passed
@brettlangdon brettlangdon deleted the brettlangdon/simplify.flask_simple branch May 3, 2024 14:04
brettlangdon added a commit that referenced this pull request May 14, 2024
…ead of gunicorn (#9253)

Follow up from #8902 to also convert the `flask_sqli` benchmark to using
the Flask test client instead of gunicorn.

This cuts the benchmark runtime from 25min to 5min.

We should be testing the same code paths, but we avoid the need to spin
up a subprocess/server and make network requests to it.

We also refactored some common bits from both `flask_simple` and
`flask_sqli` to ensure we are configuring them the same way.

This will look like a performance improvement, but it isn't. It is the
test itself getting faster.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/no-changelog A changelog entry is not required for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants