feat(ci): regression detection overhaul #21567

pront · 2024-10-21T14:40:02Z

Summary

This PR introduces a new regression detection workflow that closely resembles the existing one, with a few notable changes. The new workflow:

Runs on a nightly schedule.
Supports two optional inputs, a baseline SHA and a comparison SHA (this will help when bisecting commits).
Completely removes all logic related to PRs. The expectation is to examine the nightly results in the ops channel or run ad-hoc workflows.

Test plan

I tested the input resolution and validation on a fork.
However, this still needs more E2E testing.
To avoid disruptions, I am creating this workflow under a different name, with the intention of replacing the existing workflow later.

pront · 2024-10-21T14:57:30Z

.github/workflows/regression_v2.yml

+#   - The comparison SHA:
+#     - If not specified, the current HEAD of origin/master is used.
+#
+# This workflow runs regression detection experiments, performing relative
+# evaluations of the baseline SHA and comparison SHA. The exact SHAs are determined
+# by how the workflow is triggered.
+#
+# The goal is to provide quick feedback on Vector's performance across a variety
+# of configurations, checking if throughput performance has degraded or become
+# more variable in the comparison SHA relative to the baseline SHA.
+#
+# Docker image tags are based on the resolved SHAs.
+
+name: Regression Detection Suite (new)
+
+on:
+  workflow_dispatch:
+    inputs:
+      baseline-sha:
+        description: "The SHA to use as the baseline (optional). If not provided, it defaults to the SHA from 24 hours ago."
+        required: false
+      comparison-sha:
+        description: "The SHA to use for comparison (optional). If not provided, it defaults to the current HEAD of the origin/master branch."
+        required: false
+  schedule:
+    - cron: '0 6 * * 1-5' # Runs at 6 AM UTC on weekdays (Monday to Friday)
+
+env:
+  SINGLE_MACHINE_PERFORMANCE_API: ${{ secrets.SINGLE_MACHINE_PERFORMANCE_API }}
+  SMP_WARMUP_SECONDS: 70 # default is 45 seconds
+
+jobs:
+
+  resolve-inputs:
+    runs-on: ubuntu-latest
+    outputs:
+      baseline-sha: ${{ steps.set_and_validate_shas.outputs.BASELINE_SHA }}
+      comparison-sha: ${{ steps.set_and_validate_shas.outputs.COMPARISON_SHA }}
+      baseline-tag: ${{ steps.set_and_validate_shas.outputs.BASELINE_TAG }}
+      comparison-tag: ${{ steps.set_and_validate_shas.outputs.COMPARISON_TAG }}
+      smp-version: ${{ steps.experimental-meta.outputs.SMP_CRATE_VERSION }}
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0  # need to pull repository history to find merge bases
+
+      - name: Set and Validate SHAs
+        id: set_and_validate_shas
+        run: |
+          # Set baseline SHA
+          if [ -z "${{ github.event.inputs.baseline-sha }}" ]; then
+            BASELINE_SHA=$(git rev-list -n 1 --before="24 hours ago" origin/master)
+            echo "Using baseline SHA from 24 hours ago: ${BASELINE_SHA}"
+          else
+            BASELINE_SHA="${{ github.event.inputs.baseline-sha }}"
+            echo "Using provided baseline SHA: ${BASELINE_SHA}"
+          fi
+
+          # Validate baseline SHA
+          if [ -n "${BASELINE_SHA}" ] && git cat-file -e "${BASELINE_SHA}^{commit}"; then
+            echo "Baseline SHA is valid."
+          else
+            echo "Invalid baseline SHA: ${BASELINE_SHA}."
+            exit 1
+          fi
+
+          # Set comparison SHA
+          if [ -z "${{ github.event.inputs.comparison-sha }}" ]; then
+            COMPARISON_SHA=$(git rev-parse origin/master)
+            echo "Using current HEAD for comparison SHA: ${COMPARISON_SHA}"
+          else
+            COMPARISON_SHA="${{ github.event.inputs.comparison-sha }}"
+            echo "Using provided comparison SHA: ${COMPARISON_SHA}"
+          fi
+
+          # Validate comparison SHA
+          if [ -n "${COMPARISON_SHA}" ] && git cat-file -e "${COMPARISON_SHA}^{commit}"; then
+            echo "Comparison SHA is valid."
+          else
+            echo "Invalid comparison SHA: ${COMPARISON_SHA}."
+            exit 1
+          fi
+
+          # Set tags and export them
+          BASELINE_TAG="workflow_dispatch-${COMPARISON_SHA}-${BASELINE_SHA}"
+          COMPARISON_TAG="workflow_dispatch-${COMPARISON_SHA}-${COMPARISON_SHA}"
+
+          echo "BASELINE_SHA=${BASELINE_SHA}" >> $GITHUB_OUTPUT
+          echo "COMPARISON_SHA=${COMPARISON_SHA}" >> $GITHUB_OUTPUT
+
+          echo "BASELINE_TAG=${BASELINE_TAG}" >> $GITHUB_OUTPUT
+          echo "COMPARISON_TAG=${COMPARISON_TAG}" >> $GITHUB_OUTPUT
+
+      - name: Set SMP version
+        id: experimental-meta
+        run: |
+          export SMP_CRATE_VERSION="0.16.1"
+          echo "smp crate version: ${SMP_CRATE_VERSION}"
+          echo "SMP_CRATE_VERSION=${SMP_CRATE_VERSION}" >> $GITHUB_OUTPUT


Note for the reviewers:

Lines [1,109] are the main changes.

The job compute-metadata was merged into this one.

Anything related to pull_request and pull_request_review events was deleted.
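
For readers following the resolution logic, here is a minimal sketch of the same fallback-then-validate flow in a more compact form. It is not the code in this PR: the environment variable names `BASELINE_INPUT` and `COMPARISON_INPUT` and the helper `validate_sha` are hypothetical, and passing the inputs through `env` instead of interpolating them into the script is an assumption made for the sketch. The tag outputs are omitted for brevity.

```yaml
      - name: Set and validate SHAs (sketch)
        id: set_and_validate_shas
        env:
          # Hypothetical variable names; the PR interpolates the inputs directly.
          BASELINE_INPUT: ${{ github.event.inputs.baseline-sha }}
          COMPARISON_INPUT: ${{ github.event.inputs.comparison-sha }}
        run: |
          # Hypothetical helper: stop the step unless $2 names a commit in this checkout.
          validate_sha() {
            if [ -n "$2" ] && git cat-file -e "$2^{commit}"; then
              echo "$1 SHA is valid."
            else
              echo "Invalid $1 SHA: $2."
              exit 1
            fi
          }

          # Use the input when it is non-empty, otherwise fall back to the
          # defaults described in the workflow header.
          BASELINE_SHA="${BASELINE_INPUT:-$(git rev-list -n 1 --before="24 hours ago" origin/master)}"
          COMPARISON_SHA="${COMPARISON_INPUT:-$(git rev-parse origin/master)}"

          validate_sha "baseline" "${BASELINE_SHA}"
          validate_sha "comparison" "${COMPARISON_SHA}"

          echo "BASELINE_SHA=${BASELINE_SHA}" >> "$GITHUB_OUTPUT"
          echo "COMPARISON_SHA=${COMPARISON_SHA}" >> "$GITHUB_OUTPUT"
```

On a scheduled run both inputs are empty, so both defaults apply. If no commit on origin/master is older than 24 hours, `git rev-list` prints nothing and the baseline validation fails, which matches the behavior of the step in the diff.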

pront · 2024-10-21T15:02:12Z

.github/workflows/regression_v2.yml

+      - analyze-experiment
+    env:
+      FAILED: ${{ contains(needs.*.result, 'failure') }}
+    steps:


A lot of steps were deleted here since we no longer have a PR to update.
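
As an illustration of how the `FAILED` variable can drive the final job once the PR-update steps are gone, here is a minimal sketch. The job name `final-status`, the step, and the `if: always()` condition are assumptions, not taken from this PR; only the `needs` entry and the `FAILED` expression come from the diff above.

```yaml
  final-status:            # hypothetical job name
    runs-on: ubuntu-latest
    if: always()           # assumption: run even when an upstream job failed
    needs:
      - analyze-experiment
    env:
      FAILED: ${{ contains(needs.*.result, 'failure') }}
    steps:
      - name: Report overall result
        run: |
          # contains() yields a boolean, which reaches the shell as "true" or "false".
          if [ "${FAILED}" = "true" ]; then
            echo "At least one upstream job failed."
            exit 1
          fi
          echo "No upstream job reported a failure."
```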

pront · 2024-10-21T15:05:21Z

.github/workflows/regression_v2.yml

+          echo "SOURCE_CHANGED='${SOURCE_CHANGED}'"
+          echo "SOURCE_CHANGED=${SOURCE_CHANGED}" >> $GITHUB_OUTPUT
+
+  should-run-gate:


AFAIK there is no way to stop the whole workflow and mark it as skipped, so this job is used as a dependency for all following jobs.
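
The gating pattern described here can be sketched as follows. This is a simplified illustration under stated assumptions, not the workflow in this PR: the job names `check-source-changed` and `downstream-job`, the output name `source-changed`, and the placeholder step bodies are hypothetical. It relies on the documented GitHub Actions behavior that a job whose dependency was skipped is itself skipped unless it overrides that default with a status function such as `always()`.

```yaml
jobs:
  check-source-changed:      # hypothetical job that emits SOURCE_CHANGED
    runs-on: ubuntu-latest
    outputs:
      source-changed: ${{ steps.check.outputs.SOURCE_CHANGED }}
    steps:
      - id: check
        run: echo "SOURCE_CHANGED=true" >> "$GITHUB_OUTPUT"  # placeholder value

  should-run-gate:
    runs-on: ubuntu-latest
    needs: check-source-changed
    # When this condition is false the gate is skipped, not failed.
    if: needs.check-source-changed.outputs.source-changed == 'true'
    steps:
      - run: echo "Source changed; continuing."

  downstream-job:            # hypothetical; stands in for every later job
    runs-on: ubuntu-latest
    needs: should-run-gate   # skipped automatically when the gate is skipped
    steps:
      - run: echo "Runs only when the gate ran."
```

With this layout the run finishes without errors when nothing relevant changed, and each gated job shows up as skipped.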

pront · 2024-10-21T15:49:46Z

.github/workflows/regression_v2.yml

+
+jobs:
+
+  resolve-inputs:


Tested this here:

https://github.com/pront/vector/actions/runs/11443826581/job/31837368366 (automatic)

https://github.com/pront/vector/actions/runs/11443847360/job/31837437076 (specified shas)

datadog-vectordotdev · 2024-10-21T16:28:15Z

Datadog Report

Branch report: pront/regression-workflow-v2
Commit report: 4a9dff3
Test service: vector

✅ 0 Failed, 7 Passed, 0 Skipped, 25.44s Total Time

pront force-pushed the pront/regression-workflow-v2 branch 2 times, most recently from c9be068 to 1419824 Compare October 21, 2024 14:42

pront force-pushed the pront/regression-workflow-v2 branch from 1419824 to 84661ff Compare October 21, 2024 15:02

--no-edit

c00bad5

pront force-pushed the pront/regression-workflow-v2 branch from 84661ff to c00bad5 Compare October 21, 2024 15:50

pront requested a review from jszwedko October 21, 2024 15:54

pront marked this pull request as ready for review October 21, 2024 15:54

pront requested a review from a team as a code owner October 21, 2024 15:54

pront added the no-changelog Changes in this PR do not need user-facing explanations in the release changelog label Oct 21, 2024

remove trailing spaces

c2c0465
