chore(ci): cleanup benchmarking scripts and add profiling capability (#9419)

The main change here is adding `viztracer` and a way to generate profiles from benchmark scenario runs.

List of changes:

- Add a `PROFILE_BENCHMARKS=1` env var that triggers generating viztracer profiles for each scenario and puts the results in the artifacts directory
- Support supplying `.` as a ddtrace version to install the local version mounted to `/src/` in the benchmark container
- Run the benchmark container with `--network host` to allow connecting to a local trace agent for the scenarios which rely on an agent (`flask_simple`, etc)
- Make sure the latest version of `pip` is present (not *that* important, I just saw a version upgrade notice, so figured it doesn't hurt)

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included in the PR
- [x] Risks are described (performance impact, potential for breakage, maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
DataDog · May 29, 2024 · 110f4e4
1 parent 69d1d3c
commit 110f4e4

Showing 6 changed files with 106 additions and 6 deletions.
diff --git a/benchmarks/Dockerfile b/benchmarks/Dockerfile
@@ -20,6 +20,7 @@ COPY --from=base /pyenv /pyenv
 ENV PYENV_ROOT "/pyenv"
 ENV PATH "$PYENV_ROOT/shims:$PYENV_ROOT/bin:/root/.cargo/bin/:$PATH"
 RUN pyenv global "$PYTHON_VERSION"
+RUN pip install -U pip
 
 ARG SCENARIO=base
 
@@ -59,6 +60,7 @@ COPY ./bm/ /app/bm/
 COPY ./${SCENARIO}/ /app/
 
 ENV SCENARIO=${SCENARIO}
+ENV PROFILE_BENCHMARKS=0
 
 ENTRYPOINT ["/app/entrypoint"]
 CMD ["/app/benchmark"]
diff --git a/benchmarks/README.rst b/benchmarks/README.rst
@@ -62,16 +62,72 @@ The scenario can be run using the built image to compare two versions of the lib
 
   scripts/perf-run-scenario <scenario> <version> <version> <artifacts>
 
-The version specifiers can reference published versions on PyPI or git
-repositories.
+The version specifiers can reference published versions on PyPI, git repositories, or ``.`` for your local version.
 
 Example::
 
+  # Compare PyPI versions 0.50.0 vs 0.51.0
   scripts/perf-run-scenario span ddtrace==0.50.0 ddtrace==0.51.0 ./artifacts/
+
+  # Compare PyPI version 0.50.0 vs your local changes
+  scripts/perf-run-scenario span ddtrace==0.50.0 . ./artifacts/
+
+  # Compare git branch 1.x vs git branch my-feature
   scripts/perf-run-scenario span Datadog/dd-trace-py@1.x Datadog/dd-trace-py@my-feature ./artifacts/
 
 
+Profiling
+~~~~~~~~~
+
+You may also generate profiling data from each scenario using `viztracer`_ by providing the ``PROFILE_BENCHMARKS=1`` environment variable.
+
+Example::
+
+  # Compare and profile PyPI version 2.8.4 against your local changes, and store the results in ./artifacts/
+  PROFILE_BENCHMARKS=1 scripts/perf-run-scenario span ddtrace==2.8.4 . ./artifacts/
+
+One ``viztracer`` output will be created for every scenario run in the artifacts directory.
+
+You can use the ``viztracer`` tooling to combine or inspect the resulting files locally.
+
+Some examples::
+
+  # Install viztracer
+  pip install -U viztracer
+
+  # Load a specific scenario in your browser
+  vizviewer artifacts/<run-id>/<scenario_name>/<version>/viztracer/<config_name>.json
+
+  # Load a flamegraph of a specific scenario
+  vizviewer --flamegraph artifacts/<run-id>/<scenario_name>/<version>/viztracer/<config_name>.json
+
+  # Combine all processes/threads into a single flamegraph
+  jq '{"traceEvents": [.traceEvents[] | .pid = "1" | .tid = "1"]}' <config_name>.json > combined.json
+  vizviewer --flamegraph combined.json
+
+Using the ``vizviewer`` UI you can inspect the profile/timeline from each process, as well as execute SQL, like the following::
+
+  SELECT IMPORT("experimental.slices");
+  SELECT
+    name,
+    count(*) as calls,
+    sum(dur) as total_duration,
+    avg(dur) as avg_duration,
+    min(dur) as min_duration,
+    max(dur) as max_duration
+  FROM experimental_slice_with_thread_and_process_info
+  WHERE name like '%/ddtrace/%'
+  group by name
+  having calls > 500
+  order by total_duration desc
+
+
+See `viztracer`_ documentation for more details.
+
 Scenarios
 ^^^^^^^^^
 
 .. include:: ../benchmarks/threading/README.rst
+
+
+.. _viztracer: https://viztracer.readthedocs.io/en/stable/basic_usage.html#display-report
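
Note on the README examples above: the `jq` one-liner and the SQL query can also be approximated offline in Python. The following is a minimal sketch and not part of this commit. It assumes the viztracer JSON follows the Chrome Trace Event layout (a top-level `traceEvents` list whose complete events have `ph == "X"`, a `name`, and a `dur`); the helper names `merge_tracks` and `summarize` are hypothetical.

```python
import json
from collections import defaultdict


def merge_tracks(trace):
    """Mirror the jq filter: put every event on pid "1" / tid "1".

    Returns a new dict; the input trace is left untouched.
    """
    events = [dict(event, pid="1", tid="1") for event in trace["traceEvents"]]
    return {"traceEvents": events}


def summarize(trace, needle="/ddtrace/", min_calls=500):
    """Approximate the SQL query: per-name call count and duration stats.

    ASSUMPTION: only complete ("X") events carry a usable `dur` field.
    Rows are kept when calls > min_calls and sorted by total duration.
    """
    durations = defaultdict(list)
    for event in trace["traceEvents"]:
        if event.get("ph") == "X" and needle in event.get("name", ""):
            durations[event["name"]].append(event["dur"])

    rows = [
        {
            "name": name,
            "calls": len(durs),
            "total_duration": sum(durs),
            "avg_duration": sum(durs) / len(durs),
            "min_duration": min(durs),
            "max_duration": max(durs),
        }
        for name, durs in durations.items()
        if len(durs) > min_calls
    ]
    return sorted(rows, key=lambda row: row["total_duration"], reverse=True)


def load_trace(path):
    """Read a viztracer JSON file from disk."""
    with open(path, "r") as fp:
        return json.load(fp)
```

A typical use would be `summarize(load_trace("<config_name>.json"))`, or writing `merge_tracks(...)` back out with `json.dump` before opening the result with `vizviewer --flamegraph`.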
diff --git a/benchmarks/base/requirements.txt b/benchmarks/base/requirements.txt
@@ -5,3 +5,4 @@ pyyaml
 attrs
 httpretty==1.1.4
 tenacity==8.0.0
+viztracer
diff --git a/benchmarks/base/run.py b/benchmarks/base/run.py
@@ -7,14 +7,34 @@
 import yaml
 
 
+SHOULD_PROFILE = os.environ.get("PROFILE_BENCHMARKS", "0") == "1"
+
+
 def read_config(path):
     with open(path, "r") as fp:
         return yaml.load(fp, Loader=yaml.FullLoader)
 
 
 def run(scenario_py, cname, cvars, output_dir):
-    cmd = [
-        "python",
+    if SHOULD_PROFILE:
+        # viztracer won't create the missing directory itself
+        viztracer_output_dir = os.path.join(output_dir, "viztracer")
+        os.makedirs(viztracer_output_dir, exist_ok=True)
+
+        cmd = [
+            "viztracer",
+            "--minimize_memory",
+            "--min_duration",
+            "5",
+            "--max_stack_depth",
+            "200",
+            "--output_file",
+            os.path.join(viztracer_output_dir, "{}.json".format(cname)),
+        ]
+    else:
+        cmd = ["python"]
+
+    cmd += [
         scenario_py,
         # necessary to copy PYTHONPATH for venvs
         "--copy-env",
@@ -26,6 +46,7 @@ def run(scenario_py, cname, cvars, output_dir):
     for cvarname, cvarval in cvars.items():
         cmd.append("--{}".format(cvarname))
         cmd.append(str(cvarval))
+
     proc = subprocess.Popen(cmd)
     proc.wait()
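
Note on the toggle in this hunk: profiling is enabled only when `PROFILE_BENCHMARKS` is exactly the string `"1"`, so values such as `true` or `yes` leave it off. A minimal sketch of that check and of the resulting output path follows; `should_profile` and `profile_path` are hypothetical helper names used only for illustration and do not exist in `run.py`.

```python
import os


def should_profile(environ=None):
    """Return True only when PROFILE_BENCHMARKS is exactly "1"."""
    environ = os.environ if environ is None else environ
    return environ.get("PROFILE_BENCHMARKS", "0") == "1"


def profile_path(output_dir, cname):
    """Build <output_dir>/viztracer/<cname>.json, as the hunk above does."""
    return os.path.join(output_dir, "viztracer", "{}.json".format(cname))
```

Passing the environment in as a plain dict keeps the check easy to exercise without mutating `os.environ`.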
 

diff --git a/scripts/gen_circleci_config.py b/scripts/gen_circleci_config.py
@@ -92,7 +92,9 @@ def gen_build_docs(template: dict) -> None:
     """Include the docs build step if the docs have changed."""
     from needs_testrun import pr_matches_patterns
 
-    if pr_matches_patterns({"docker", "docs/*", "ddtrace/*", "scripts/docs", "releasenotes/*"}):
+    if pr_matches_patterns(
+        {"docker", "docs/*", "ddtrace/*", "scripts/docs", "releasenotes/*", "benchmarks/README.rst"}
+    ):
         template["workflows"]["test"]["jobs"].append({"build_docs": template["requires_pre_check"]})
 
 

diff --git a/scripts/perf-run-scenario b/scripts/perf-run-scenario
@@ -6,9 +6,16 @@ SCRIPTNAME=$(basename $0)
 if [[ $# -lt 3 ]]; then
     cat << EOF
 Usage: ${SCRIPTNAME} <scenario> <version> <version> [artifacts]
+
+Versions can be specified in the following formats:
+    - "ddtrace==0.51.0" - to install a specific version from PyPI
+    - "Datadog/dd-trace-py@<ref>" - to install a specific version from GitHub
+    - "." - to install the current local version
+
 Examples:
     ${SCRIPTNAME} span ddtrace==0.51.0 ddtrace==0.50.0
     ${SCRIPTNAME} span Datadog/dd-trace-py@<ref> Datadog/dd-trace-py@<ref>
+    ${SCRIPTNAME} span ddtrace==2.8.4 .
     ${SCRIPTNAME} span ddtrace==0.51.0 ddtrace==0.50.0 ./artifacts/
 
 EOF
@@ -22,6 +29,11 @@ function expand_git_version {
     if [[ $version =~ $gitpattern ]]; then
         version="git+https://github.com/${version}"
     fi
+
+    # If the user provides "." they want the local version, which gets mapped to `/src/` in the container
+    if [[ $version == "." ]]; then
+        version="/src/"
+    fi
     echo $version
 }
 
@@ -48,12 +60,18 @@ if [[ -n ${ARTIFACTS} ]]; then
    ARTIFACTS="$(echo $ARTIFACTS | python -c 'import os,sys; print(os.path.abspath(sys.stdin.read()))')"
    mkdir -p ${ARTIFACTS}
    docker run -it --rm \
-          -v ${ARTIFACTS}:/artifacts/ \
+          --network host \
+          -v "${ARTIFACTS}:/artifacts/" \
+          -v "$(pwd):/src/" \
+          -e PROFILE_BENCHMARKS=${PROFILE_BENCHMARKS:-0} \
           -e DDTRACE_INSTALL_V1="$(expand_git_version $DDTRACE_V1)" \
           -e DDTRACE_INSTALL_V2="$(expand_git_version $DDTRACE_V2)" \
           $TAG
 else
    docker run -it --rm \
+          --network host \
+          -v "$(pwd):/src/" \
+          -e PROFILE_BENCHMARKS=${PROFILE_BENCHMARKS:-0} \
           -e DDTRACE_INSTALL_V1="$(expand_git_version $DDTRACE_V1)" \
           -e DDTRACE_INSTALL_V2="$(expand_git_version $DDTRACE_V2)" \
           $TAG
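
Note on the version specifiers handled by `expand_git_version`: the mapping can be sketched in Python as below. This is an illustration, not the script's implementation. The `gitpattern` regex is defined outside the hunks shown in this diff, so the `GIT_SPEC` pattern here is an assumption (an `owner/repo@ref` shape), and `expand_version` is a hypothetical name.

```python
import re

# ASSUMPTION: stands in for the script's `gitpattern`, which this diff does
# not show. It matches specifiers shaped like "owner/repo@ref".
GIT_SPEC = re.compile(r"^[\w.-]+/[\w.-]+@\S+$")


def expand_version(version):
    """Map a user-facing version specifier to what gets installed.

    - "owner/repo@ref" becomes a git+https GitHub URL
    - "." becomes "/src/", where the local checkout is mounted
    - anything else (e.g. "ddtrace==0.51.0") is passed through unchanged
    """
    if GIT_SPEC.match(version):
        return "git+https://github.com/" + version
    if version == ".":
        return "/src/"
    return version
```

The `.` case only works because the `docker run` invocations above mount the current directory at `/src/`, so the script is expected to be run from the repository root.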