adding package-benchmark sub-project #64

Merged: 7 commits, Oct 10, 2024

Contributor

@heckj heckj commented Oct 8, 2024

Description

Adds a package within the project to capture a set of benchmarks based on what's in FirebladeECSPerformanceTests/TypedFamilyPerformanceTests.swift, converted to leverage package-benchmark.

  • It doesn't remove any existing performance tests, only replicates a subset.
  • I added it as a "subproject" so that the package-benchmark dependency doesn't flow to any consumers of the package, keeping it constrained, while also leveraging a baseline with a more recent version of Swift (Swift 5.10).

There's no functional change or external API change with this update, it's more of a prep piece to allow me to more easily compare future changes.
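
For readers unfamiliar with the layout, a nested manifest for such a sub-project might look roughly like the sketch below. This is not the manifest from this PR. The package identity `ecs`, the product name `FirebladeECS`, the `1.4.0` version floor, the macOS 13 platform floor, and the target path are all assumptions for illustration; only the target name `ECSBenchmark` and the Swift 5.10 tools version are taken from this thread.

```swift
// swift-tools-version: 5.10
// Hypothetical Benchmarks/Package.swift: a sketch, not the file merged in this PR.
import PackageDescription

let package = Package(
    name: "Benchmarks",
    // Assumption: package-benchmark's documented minimum macOS platform.
    platforms: [.macOS(.v13)],
    dependencies: [
        // The parent package is referenced by path, so nothing here leaks
        // into the parent's own manifest or to its consumers.
        .package(path: "../"),
        // Assumption: version floor chosen for illustration only.
        .package(url: "https://github.com/ordo-one/package-benchmark", from: "1.4.0"),
    ],
    targets: [
        .executableTarget(
            name: "ECSBenchmark",
            dependencies: [
                // Assumption: the identity of a path dependency follows the
                // checkout directory name, so "ecs" may differ locally.
                .product(name: "FirebladeECS", package: "ecs"),
                .product(name: "Benchmark", package: "package-benchmark"),
            ],
            // Assumption: benchmark sources live under a Benchmarks/ folder.
            path: "Benchmarks/ECSBenchmark",
            plugins: [
                .plugin(name: "BenchmarkPlugin", package: "package-benchmark"),
            ]
        ),
    ]
)
```

Because the dependency on package-benchmark is declared only in this nested manifest, resolving the parent package never pulls it in.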

Testing

To use this new sub project:

  • check out the branch
  • cd Benchmarks
  • run the command: swift package benchmark

Further details are in the README within the sub project.

Checklist

  • I've read the Contribution Guidelines
  • I've followed the coding style of the rest of the project.
  • I've added tests covering all new code paths my change adds to the project (to the extent possible).
  • I've added benchmarks covering new functionality (if appropriate).
  • I've verified that my change does not break any existing tests or introduce unexpected benchmark regressions.
  • I've updated the documentation (if appropriate).

@heckj heckj requested a review from ctreffs as a code owner October 8, 2024 22:36
Contributor Author

heckj commented Oct 8, 2024

An example benchmark run of this update on my laptop (an M1 MacBook Pro):

Baseline 'Current_run'

Host 'MacBookPro' with 8 'arm64' processors with 16 GB memory, running:
Darwin Kernel Version 24.0.0: Tue Sep 24 23:36:26 PDT 2024; root:xnu-11215.1.12~1/RELEASE_ARM64_T8103

ECSBenchmark

TraitMatching

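| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 282 | 282 | 282 | 282 | 282 | 283 | 283 | 47 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 47 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 47 |
| Throughput (# / s) (#) | 49 | 49 | 48 | 45 | 44 | 42 | 42 | 47 |
| Time (total CPU) (ms) * | 21 | 21 | 21 | 22 | 23 | 24 | 24 | 47 |
| Time (wall clock) (ms) * | 21 | 21 | 21 | 22 | 23 | 24 | 24 | 47 |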

TypedFamilyEntities

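| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 281 | 282 | 282 | 282 | 282 | 282 | 282 | 49 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 49 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 49 |
| Throughput (# / s) (#) | 50 | 49 | 49 | 49 | 46 | 42 | 42 | 49 |
| Time (total CPU) (ms) * | 20 | 20 | 20 | 21 | 22 | 24 | 24 | 49 |
| Time (wall clock) (ms) * | 20 | 20 | 20 | 21 | 22 | 24 | 24 | 49 |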

TypedFamilyEntityFiveComponents

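| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 316 | 317 | 317 | 317 | 317 | 317 | 317 | 43 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 43 |
| Memory (resident peak) (M) | 14 | 15 | 15 | 15 | 15 | 15 | 15 | 43 |
| Throughput (# / s) (#) | 44 | 43 | 43 | 43 | 42 | 39 | 39 | 43 |
| Time (total CPU) (ms) * | 23 | 23 | 23 | 23 | 24 | 25 | 25 | 43 |
| Time (wall clock) (ms) * | 23 | 23 | 23 | 23 | 24 | 25 | 25 | 43 |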

TypedFamilyEntityFourComponents

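| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 308 | 308 | 309 | 309 | 309 | 309 | 309 | 43 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 43 |
| Memory (resident peak) (M) | 13 | 15 | 16 | 16 | 16 | 16 | 16 | 43 |
| Throughput (# / s) (#) | 45 | 45 | 44 | 42 | 40 | 33 | 33 | 43 |
| Time (total CPU) (ms) * | 22 | 22 | 23 | 24 | 25 | 28 | 28 | 43 |
| Time (wall clock) (ms) * | 22 | 22 | 23 | 24 | 25 | 30 | 30 | 43 |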

TypedFamilyEntityOneComponent

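| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 287 | 287 | 287 | 287 | 287 | 287 | 287 | 47 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 47 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 47 |
| Throughput (# / s) (#) | 47 | 47 | 47 | 47 | 46 | 46 | 46 | 47 |
| Time (total CPU) (ms) * | 21 | 21 | 21 | 21 | 22 | 22 | 22 | 47 |
| Time (wall clock) (ms) * | 21 | 21 | 21 | 21 | 22 | 22 | 22 | 47 |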

TypedFamilyEntityThreeComponents

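| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 300 | 300 | 300 | 300 | 300 | 301 | 301 | 46 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 46 |
| Memory (resident peak) (M) | 14 | 16 | 16 | 16 | 16 | 16 | 16 | 46 |
| Throughput (# / s) (#) | 46 | 46 | 46 | 45 | 45 | 43 | 43 | 46 |
| Time (total CPU) (ms) * | 22 | 22 | 22 | 22 | 22 | 23 | 23 | 46 |
| Time (wall clock) (ms) * | 22 | 22 | 22 | 22 | 22 | 23 | 23 | 46 |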

TypedFamilyEntityTwoComponents

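| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 295 | 296 | 296 | 296 | 296 | 296 | 296 | 46 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 46 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 46 |
| Throughput (# / s) (#) | 47 | 47 | 47 | 46 | 44 | 40 | 40 | 46 |
| Time (total CPU) (ms) * | 21 | 21 | 21 | 22 | 22 | 24 | 24 | 46 |
| Time (wall clock) (ms) * | 21 | 21 | 21 | 22 | 22 | 25 | 25 | 46 |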

TypedFamilyFiveComponents

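| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 317 | 317 | 317 | 317 | 318 | 318 | 318 | 41 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 41 |
| Memory (resident peak) (M) | 15 | 16 | 16 | 16 | 16 | 16 | 16 | 41 |
| Throughput (# / s) (#) | 43 | 42 | 42 | 40 | 39 | 37 | 37 | 41 |
| Time (total CPU) (ms) * | 24 | 24 | 24 | 25 | 26 | 26 | 26 | 41 |
| Time (wall clock) (ms) * | 23 | 24 | 24 | 25 | 26 | 27 | 27 | 41 |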

TypedFamilyFourComponents

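| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 309 | 309 | 309 | 309 | 309 | 309 | 309 | 44 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 44 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 44 |
| Throughput (# / s) (#) | 45 | 44 | 44 | 44 | 42 | 41 | 41 | 44 |
| Time (total CPU) (ms) * | 22 | 22 | 23 | 23 | 24 | 25 | 25 | 44 |
| Time (wall clock) (ms) * | 22 | 22 | 23 | 23 | 24 | 25 | 25 | 44 |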

TypedFamilyOneComponent

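| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 285 | 285 | 285 | 285 | 285 | 285 | 285 | 47 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 47 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 47 |
| Throughput (# / s) (#) | 48 | 48 | 47 | 46 | 43 | 40 | 40 | 47 |
| Time (total CPU) (ms) * | 21 | 21 | 21 | 22 | 23 | 24 | 24 | 47 |
| Time (wall clock) (ms) * | 21 | 21 | 21 | 22 | 23 | 25 | 25 | 47 |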

TypedFamilyThreeComponents

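| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 301 | 301 | 301 | 301 | 301 | 302 | 302 | 44 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 44 |
| Memory (resident peak) (M) | 14 | 16 | 16 | 16 | 16 | 16 | 16 | 44 |
| Throughput (# / s) (#) | 45 | 45 | 45 | 43 | 41 | 37 | 37 | 44 |
| Time (total CPU) (ms) * | 22 | 22 | 22 | 23 | 24 | 27 | 27 | 44 |
| Time (wall clock) (ms) * | 22 | 22 | 22 | 23 | 24 | 27 | 27 | 44 |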

TypedFamilyTwoComponents

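| Metric | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| Instructions (M) * | 296 | 296 | 297 | 297 | 297 | 297 | 297 | 45 |
| Malloc (total) (K) * | 230 | 230 | 230 | 230 | 230 | 230 | 230 | 45 |
| Memory (resident peak) (M) | 13 | 16 | 16 | 16 | 16 | 16 | 16 | 45 |
| Throughput (# / s) (#) | 46 | 46 | 45 | 45 | 42 | 40 | 40 | 45 |
| Time (total CPU) (ms) * | 22 | 22 | 22 | 22 | 23 | 25 | 25 | 45 |
| Time (wall clock) (ms) * | 22 | 22 | 22 | 22 | 24 | 25 | 25 | 45 |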
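
One way to read these tables is to compare how far each metric spreads between its fastest and slowest sample. The sketch below does that for three rows of the TypedFamilyEntityFourComponents table. The `MetricSummary` type and its `relativeSpreadPercent` property are hypothetical helpers written for this illustration, and the inputs are the rounded values shown above, so the results are approximate.

```swift
// Percentile summary for one metric, copied from the
// TypedFamilyEntityFourComponents table (rounded display values).
struct MetricSummary {
    let name: String
    let p0: Double
    let p50: Double
    let p100: Double

    /// Range between the lowest and highest sample, as a percentage of the median.
    var relativeSpreadPercent: Double {
        (p100 - p0) / p50 * 100
    }
}

let summaries = [
    MetricSummary(name: "Instructions (M)", p0: 308, p50: 309, p100: 309),
    MetricSummary(name: "Malloc (total) (K)", p0: 230, p50: 230, p100: 230),
    MetricSummary(name: "Time (wall clock) (ms)", p0: 22, p50: 23, p100: 30),
]

for summary in summaries {
    print("\(summary.name): spread is \(summary.relativeSpreadPercent)% of the median")
}
```

On these numbers the instruction count varies by well under 1% of its median and the allocation count does not vary at all, while wall-clock time varies by roughly 35%, even on a single quiet laptop.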

Member

ctreffs commented Oct 9, 2024

Thanks for the contribution @heckj 👍 LGTM

How about we extend the CI pipeline with a benchmark step that runs pre-release, when triggered manually, or even on every push (at least on the free machines)?

Review comment on Benchmarks/README.md (outdated, resolved)
@heckj heckj force-pushed the benchmark-subproject branch 2 times, most recently from 4fb8c06 to 39820d7 Compare October 9, 2024 16:19
Contributor Author

heckj commented Oct 9, 2024

We absolutely can tweak the CI to do runs, make comparisons, etc.; package-benchmark is great for that. The Linux (and macOS, if desired) CI will need jemalloc installed to get the memory allocation information, as it's not available stock. More importantly, if you want to compare on CI, we'll want to use only the metrics that are agnostic to "what CPU did you give me", which is a more limited set (and, from what I've seen, regressions in them are not as easily understood). macOS (Darwin) has some interesting "instructions executed" markers that can be pulled, the allocations are a good choice, etc. Otherwise you'll get some notably wide variances, especially with the free GitHub resources, and even with the paid versions.
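
As a rough sketch of what limiting a benchmark file to machine-agnostic metrics could look like with package-benchmark, the example below reports only instruction and allocation counts. The benchmark name and the workload are hypothetical placeholders, not code from this PR, and whether each metric is actually available depends on the platform and CI environment.

```swift
import Benchmark

// Entry point that package-benchmark picks up in a benchmark target.
let benchmarks = {
    // Assumption: report only metrics that do not depend on how fast
    // the CI machine happens to be.
    Benchmark.defaultConfiguration = .init(
        metrics: [.instructions, .mallocCountTotal]
    )

    // Hypothetical benchmark name and workload, for illustration only.
    Benchmark("PlaceholderWorkload") { benchmark in
        for _ in benchmark.scaledIterations {
            // Stand-in for the ECS code under test; blackHole keeps the
            // optimizer from discarding the result.
            blackHole((0 ..< 1_000).reduce(0, +))
        }
    }
}
```

Timing metrics such as wall clock and throughput are left out on purpose, since those are the ones that vary with the hardware a CI runner is given.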

Member

ctreffs commented Oct 9, 2024

Ok, so let's skip extending CI for that for now.
I'd like to integrate #67 before merging this PR. Should not be a problem for it, right?

codecov bot commented Oct 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.91%. Comparing base (11f46f3) to head (186860c).

Additional details and impacted files
@@           Coverage Diff           @@
##           master      #64   +/-   ##
=======================================
  Coverage   96.91%   96.91%           
=======================================
  Files          24       24           
  Lines        1069     1069           
=======================================
  Hits         1036     1036           
  Misses         33       33           

Member

@ctreffs ctreffs left a comment

Please run make precommit to let the linter and formatter do their job.

Review comment on Benchmarks/Package.swift (outdated, resolved)
@ctreffs ctreffs enabled auto-merge (squash) October 10, 2024 09:11
Member

@ctreffs ctreffs left a comment

I think we should also exclude Benchmarks/ from the code coverage report.
Please just add that to .codecov.yml.

auto-merge was automatically disabled October 10, 2024 15:20

Head branch was pushed to by a user without write access

@ctreffs ctreffs enabled auto-merge (squash) October 10, 2024 15:59
@ctreffs ctreffs merged commit e0ee97b into fireblade-engine:master Oct 10, 2024
6 checks passed
@heckj heckj deleted the benchmark-subproject branch October 10, 2024 19:05