Fix bench-e2e single mode and keep results #1693

Open: wants to merge 16 commits into master

ch1bo
Collaborator

@ch1bo ch1bo commented Oct 8, 2024

This fixes two issues with the bench-e2e binary / benchmark:

  • Running in single mode was not working because of a FeeTooSmallUTxO error
  • The results.csv is written into a temporary directory and removed, which makes plotting impossible.

I was in the mood for some refactoring, so this also contains various other changes I made while tidying up the code I was working on.

The refactoring separated hydra node and payment keys further, which requires the datasets to be re-generated. I took the liberty of generating them with --scaling-factor 10, which results in 300 transactions per client. That should be long enough to identify regressions, with hopefully 10x shorter benchmark time in CI.
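
As a quick sanity check on the dataset size, here is a small sketch. It assumes one client per node and that the total simply scales with the cluster size; `TXS_PER_CLIENT` and `total_transactions` are illustrative names, not part of bench-e2e.

```python
# Illustrative arithmetic only; the names here are not part of bench-e2e.
TXS_PER_CLIENT = 300  # stated in the description for --scaling-factor 10


def total_transactions(cluster_size: int, txs_per_client: int = TXS_PER_CLIENT) -> int:
    """Total transactions submitted, assuming one client per node."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size * txs_per_client


for size in (1, 3):
    print(size, total_transactions(size))
```

For cluster sizes 1 and 3 this gives 300 and 900 transactions, which agrees with the "Number of txs" values in the benchmark results posted on this PR.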

Another benefit of this separation is that it naturally reduced the assumptions of the demo mode: it no longer seeds the hydra node cardano keys but re-uses seed-devnet.sh, which in turn loosens the coupling between the workload and the container setup in our network test workflow.

I'm not 100% happy that the bench now requires the --output-directory to be empty, which in turn means the whole state is captured as an artifact of our CI. It would be better to always make the state directory a /tmp path that is retained in case of errors (or configurable with --state-directory). But that can go into another PR, another time.
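
The alternative described here (a temporary state directory that is only kept when something goes wrong) could look roughly like the following sketch. It is not code from this PR; `temporary_state_directory` and the `bench-e2e-` prefix are hypothetical names chosen for illustration.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def temporary_state_directory() -> Iterator[Path]:
    """Yield a fresh temporary directory; remove it on success, keep it on error."""
    # Hypothetical prefix, used here only to make kept directories easy to find.
    tmp = Path(tempfile.mkdtemp(prefix="bench-e2e-"))
    try:
        yield tmp
    except Exception:
        # Keep the state around so the failed run can be inspected.
        print(f"benchmark failed, state kept in {tmp}")
        raise
    else:
        shutil.rmtree(tmp)
```

Used as `with temporary_state_directory() as state_dir: ...`, a successful run leaves nothing behind, while a failing run reports where its state can be found.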


  • CHANGELOG updated
  • Documentation updated (README)
  • Haddocks updated
  • No new TODOs introduced or explained hereafter
    • Two XXX notes of what to improve further

@ch1bo ch1bo force-pushed the fix-bench-standalone branch 3 times, most recently from 506062b to 9eb745d Compare October 8, 2024 18:09
@ch1bo ch1bo self-assigned this Oct 8, 2024
@ch1bo ch1bo requested a review from a team October 8, 2024 18:11
@ch1bo ch1bo added the red bin label Oct 8, 2024

github-actions bot commented Oct 8, 2024

Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using arbitrary values and results are not fully deterministic and comparable to previous runs.

| Metadata | |
| --- | --- |
| Generated at | 2024-10-10 09:37:18.910092424 UTC |
| Max. memory units | 14000000 |
| Max. CPU units | 10000000000 |
| Max. tx size (kB) | 16384 |

Script summary

| Name | Hash | Size (Bytes) |
| --- | --- | --- |
| νInitial | b512161ccb0652d7e9a0b540e4a3c808f73d6558a4bcabf374d85880 | 3969 |
| νCommit | ea444d37d226e71eef73ac78d149750da977feb588900135bf9e8221 | 692 |
| νHead | 2253ddd95837c7aacc8635a971caaea743434152dd8dd2849bdf4162 | 10797 |
| μHead | 4d648ca239040b0e87901835aa11423e7aa3bd947ce6befe7db1bae8* | 4508 |
| νDeposit | 1a011f23b139a6426767026bde10319546485d553219a5848cdac4e5 | 2993 |

* The minting policy hash is only usable for comparison. As the script is parameterized, the actual script is unique per head.

Init transaction costs

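| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 5097 | 5.81 | 2.30 | 0.44 |
| 2 | 5298 | 7.31 | 2.90 | 0.46 |
| 3 | 5502 | 8.46 | 3.34 | 0.48 |
| 5 | 5902 | 11.12 | 4.39 | 0.53 |
| 10 | 6904 | 18.21 | 7.20 | 0.65 |
| 57 | 16356 | 82.99 | 32.83 | 1.78 |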
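
The percentage columns in these tables relate a transaction's execution budget to the limits listed under Metadata. Below is a minimal sketch of that conversion, assuming the percentage is simply the used units divided by the maximum. The helper names are illustrative, and the recovered unit counts are only approximate because the tables round to two decimals.

```python
MAX_MEMORY_UNITS = 14_000_000    # "Max. memory units" from the Metadata section
MAX_CPU_UNITS = 10_000_000_000   # "Max. CPU units" from the Metadata section


def percent_of_max(used_units: int, max_units: int) -> float:
    """Share of the maximum budget, rounded to two decimals like the tables."""
    return round(100 * used_units / max_units, 2)


def approx_units(percent: float, max_units: int) -> int:
    """Approximate absolute units behind a rounded percentage."""
    return round(percent * max_units / 100)


# Init transaction with 1 party: 5.81 % max Mem and 2.30 % max CPU.
mem_units = approx_units(5.81, MAX_MEMORY_UNITS)
cpu_units = approx_units(2.30, MAX_CPU_UNITS)
print(mem_units, cpu_units)
print(percent_of_max(mem_units, MAX_MEMORY_UNITS))
```

Reading the tables this way makes it easier to see how much headroom remains, for example that the last row of each table sits close to 100 % of one of the limits.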

Commit transaction costs

This uses ada-only outputs for better comparability.

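| UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 569 | 10.84 | 4.26 | 0.29 |
| 2 | 756 | 14.31 | 5.80 | 0.34 |
| 3 | 947 | 17.92 | 7.39 | 0.39 |
| 5 | 1317 | 25.56 | 10.73 | 0.49 |
| 10 | 2244 | 47.11 | 19.97 | 0.77 |
| 19 | 3931 | 94.71 | 39.81 | 1.38 |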

CollectCom transaction costs

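| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 1 | 57 | 560 | 19.87 | 7.59 | 0.39 |
| 2 | 114 | 675 | 26.96 | 10.28 | 0.47 |
| 3 | 171 | 782 | 37.47 | 14.23 | 0.59 |
| 4 | 226 | 893 | 47.02 | 17.85 | 0.70 |
| 5 | 282 | 1004 | 50.82 | 19.36 | 0.74 |
| 6 | 339 | 1120 | 56.62 | 21.60 | 0.81 |
| 7 | 393 | 1227 | 71.22 | 27.05 | 0.97 |
| 8 | 448 | 1338 | 86.33 | 32.70 | 1.14 |
| 9 | 506 | 1449 | 84.03 | 32.00 | 1.12 |
| 10 | 561 | 1560 | 97.90 | 37.23 | 1.28 |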

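Decrement transaction costs

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 645 | 18.36 | 8.06 | 0.39 |
| 2 | 731 | 18.54 | 8.85 | 0.40 |
| 3 | 914 | 20.64 | 10.42 | 0.43 |
| 5 | 1166 | 23.52 | 13.10 | 0.49 |
| 10 | 2022 | 33.73 | 20.90 | 0.66 |
| 48 | 7752 | 98.66 | 75.63 | 1.83 |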

Close transaction costs

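| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 677 | 20.87 | 9.35 | 0.42 |
| 2 | 798 | 22.27 | 10.77 | 0.44 |
| 3 | 924 | 23.78 | 12.21 | 0.47 |
| 5 | 1177 | 26.68 | 15.09 | 0.53 |
| 10 | 2053 | 35.45 | 23.61 | 0.70 |
| 50 | 8080 | 99.10 | 86.05 | 1.93 |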

Contest transaction costs

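| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 694 | 26.81 | 11.49 | 0.48 |
| 2 | 863 | 28.94 | 13.36 | 0.52 |
| 3 | 1027 | 31.08 | 15.22 | 0.56 |
| 5 | 1162 | 33.45 | 17.35 | 0.60 |
| 10 | 2139 | 44.52 | 27.08 | 0.80 |
| 38 | 6462 | 98.52 | 74.61 | 1.77 |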

Abort transaction costs

There is some variation due to the random mixture of initial and already committed outputs.

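| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 4996 | 15.34 | 6.57 | 0.54 |
| 2 | 5099 | 21.81 | 9.38 | 0.62 |
| 3 | 5208 | 28.05 | 12.11 | 0.69 |
| 4 | 5308 | 32.16 | 13.78 | 0.74 |
| 5 | 5491 | 42.56 | 18.48 | 0.87 |
| 6 | 5692 | 48.86 | 21.35 | 0.95 |
| 7 | 5778 | 53.30 | 23.14 | 1.00 |
| 8 | 5882 | 59.79 | 25.97 | 1.08 |
| 9 | 5911 | 68.90 | 29.86 | 1.18 |
| 10 | 6340 | 77.11 | 33.84 | 1.30 |
| 11 | 6171 | 78.71 | 34.01 | 1.30 |
| 12 | 6554 | 94.06 | 41.03 | 1.49 |
| 13 | 6713 | 95.82 | 41.83 | 1.52 |
| 14 | 6666 | 96.77 | 42.03 | 1.53 |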

FanOut transaction costs

Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.

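| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | 0 | 0 | 5089 | 9.99 | 4.18 | 0.49 |
| 10 | 1 | 57 | 5123 | 11.64 | 5.10 | 0.51 |
| 10 | 5 | 285 | 5259 | 16.11 | 7.91 | 0.57 |
| 10 | 10 | 570 | 5429 | 21.67 | 11.41 | 0.65 |
| 10 | 20 | 1140 | 5769 | 33.94 | 18.89 | 0.81 |
| 10 | 30 | 1707 | 6109 | 44.65 | 25.71 | 0.97 |
| 10 | 40 | 2274 | 6445 | 57.33 | 33.37 | 1.14 |
| 10 | 50 | 2847 | 6788 | 68.44 | 40.36 | 1.29 |
| 10 | 76 | 4326 | 7669 | 98.68 | 59.14 | 1.71 |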

End-to-end benchmark results

This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.

Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.

Generated at 2024-10-10 09:39:06.434594779 UTC

Baseline Scenario

| Number of nodes | 1 |
| --- | --- |
| Number of txs | 300 |
| Avg. Confirmation Time (ms) | 5.392762173 |
| P99 | 10.46269543ms |
| P95 | 7.2618078000000015ms |
| P50 | 5.2415315ms |
| Number of Invalid txs | 0 |

Three local nodes

| Number of nodes | 3 |
| --- | --- |
| Number of txs | 900 |
| Avg. Confirmation Time (ms) | 24.295820565 |
| P99 | 42.294148189999994ms |
| P95 | 34.12213889999999ms |
| P50 | 22.8653795ms |
| Number of Invalid txs | 0 |
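
For readers unfamiliar with the reported statistics, the sketch below shows one way an average and the P50/P95/P99 values can be derived from a list of confirmation times. It uses the nearest-rank definition of a percentile, which is an assumption: the benchmark may well use a different (for example interpolating) method, so this will not necessarily reproduce the figures above. The sample data is made up.

```python
import math
from statistics import mean


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Made-up confirmation times in milliseconds, not taken from a real run.
times_ms = [4.0, 5.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 12.0, 30.0]

print("avg", mean(times_ms))
for p in (50, 95, 99):
    print(f"P{p}", percentile(times_ms, p))
```

Note how a single slow transaction dominates the high percentiles in a small sample, which is one reason to compare runs relative to each other rather than by absolute numbers.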


github-actions bot commented Oct 8, 2024

Test Results

544 tests  ±0   538 ✅ ±0   28m 8s ⏱️ + 1m 38s
162 suites ±0     6 💤 ±0 
  7 files   ±0     0 ❌ ±0 

Results for commit a48d056. ± Comparison against base commit dff6655.

♻️ This comment has been updated with latest results.

Contributor

@noonio noonio left a comment


Minor comments; happy to merge if all the tests pass!

@noonio
Contributor

noonio commented Oct 9, 2024

@noonio noonio force-pushed the fix-bench-standalone branch 2 times, most recently from 0b81249 to 1dbd899 Compare October 9, 2024 12:29
ch1bo and others added 15 commits October 10, 2024 11:33
This is not ideal, but a lot simpler than doing proper fee calculation.
It's unclear why fee calculation was removed before; it is needed when
running benchmark scenarios.
This is redundant and can be achieved by using the 'datasets'
subcommand.
Before it was written to a random temporary directory, which makes it
annoying to generate datasets with this mode.
The hydra-cluster benchmarks now only use a single directory to store
the whole state, which is temporary unless a specific output-directory
is requested.
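
A rough sketch of the directory selection described in this commit, combined with the requirement from the PR description that a requested output directory must be empty. This is not the Haskell code from the PR; `prepare_state_directory` and the `bench-e2e-` prefix are hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import Optional


def prepare_state_directory(output_directory: Optional[Path] = None) -> Path:
    """Return the single directory that holds the whole benchmark state."""
    if output_directory is None:
        # No directory requested: fall back to a fresh temporary one.
        return Path(tempfile.mkdtemp(prefix="bench-e2e-"))
    output_directory.mkdir(parents=True, exist_ok=True)
    # Refuse to mix the results of a new run with leftovers of an old one.
    if any(output_directory.iterdir()):
        raise FileExistsError(f"{output_directory} is not empty")
    return output_directory
```

Failing early on a non-empty directory keeps results such as results.csv unambiguous, at the cost of requiring a clean directory for every run.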
This reduces some code duplication without much loss of
expressiveness (which key we use does not matter).
Same transaction style (single re-spending txs), but a deliberately shorter
sequence of transactions (3000 -> 300) to have shorter benchmark
run-times, while the sequence should be long enough to identify regressions.

Generated with invocations:

cabal run bench-e2e -- single --cluster-size 1 --scaling-factor 10

and

cabal run bench-e2e -- single --cluster-size 3 --scaling-factor 10

Plus some manual amending of the JSON to contain a "title".
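
The manual step of adding a "title" could also be scripted. The sketch below assumes the dataset file is a single JSON object and that the title lives in a top-level "title" key; the actual dataset layout is not shown here, and `add_title` is a hypothetical helper.

```python
import json
from pathlib import Path


def add_title(dataset_path: Path, title: str) -> None:
    """Set a top-level "title" key in a JSON dataset file, keeping all other keys."""
    data = json.loads(dataset_path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Assumption: the dataset is a JSON object at the top level.
        raise ValueError("expected a JSON object at the top level")
    data["title"] = title
    dataset_path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
```

Note that this rewrites the file with its own indentation, so the resulting diff may be larger than a hand edit.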
As before, the bench-e2e does not assume the hydra node keys to be
seeded. This ties the way bench-e2e binary (which hard-codes Alice, Bob
and Carol) to the configurable list of --hydra-client to connect to.
This further decouples the bench-e2e binary, which just produces load and
provides statistics, from how the hydra-nodes are run.

Now the only assumption is that the
'hydra-cluster/config/credentials/faucet.sk' owns funds on the given
network.
@ch1bo ch1bo added this pull request to the merge queue Oct 10, 2024
@ch1bo ch1bo removed this pull request from the merge queue due to a manual request Oct 10, 2024
@ch1bo ch1bo enabled auto-merge October 10, 2024 10:25
2 participants