Fix bench-e2e single mode and keep results #1693

ch1bo · 2024-10-08T17:49:24Z

This fixes two issues with the bench-e2e binary / benchmark:

- Running in single mode was not working because of a FeeTooSmallUTxO error.
- The results.csv was written into a temporary directory and then removed, which made plotting impossible.

I was in the mood for some refactoring, so this also contains various other changes that I came across while working on the code and tidying up a bit.

The refactoring separated hydra node and payment keys further, which requires the datasets to be re-generated. I took the liberty of generating them with --scaling-factor 10, which results in 300 transactions per client. This should be long enough to identify regressions, with hopefully 10x shorter benchmark time in CI.

Another benefit of this separation is that it naturally reduced the assumptions of the demo mode: instead of seeding the hydra node cardano keys, it re-uses seed-devnet.sh, which also loosens the coupling between the workload and the container setup in our network test workflow.

I'm not 100% happy with how the bench now requires the --output-directory to be empty, which in turn means the whole state is captured as an artifact of our CI. It would be better to always make the state directory a /tmp path that is retained in case of errors (or configurable with --state-directory). But that can go into another PR, another time.
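As a rough illustration of the alternative described above (not what this PR implements), the state directory handling could look like the following Python sketch. The helper names `ensure_empty_directory` and `state_directory` are hypothetical; only the flags --output-directory and --state-directory come from the description, and bench-e2e itself is written in Haskell.

```python
import os
import shutil
import sys
import tempfile
from contextlib import contextmanager


def ensure_empty_directory(path):
    """Create `path` if it is missing and fail if it already has entries."""
    os.makedirs(path, exist_ok=True)
    if os.listdir(path):
        raise FileExistsError(f"directory {path!r} is not empty")
    return path


@contextmanager
def state_directory(path=None):
    """Yield a directory for benchmark state.

    With an explicit `path` (think --state-directory) the directory must be
    empty and is always kept. Without one, a temporary directory is created,
    removed after a successful run and retained if the run fails.
    """
    if path is not None:
        yield ensure_empty_directory(path)
        return
    tmp = tempfile.mkdtemp(prefix="bench-e2e-")
    try:
        yield tmp
    except BaseException:
        # Keep the state around so the failure can be inspected.
        print(f"run failed, state retained in {tmp}", file=sys.stderr)
        raise
    else:
        shutil.rmtree(tmp)
```

With this shape, a CI job would only need to upload the directory when the run fails, instead of capturing the whole state on every run.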

CHANGELOG updated
Documentation updated (README)
Haddocks updated
No new TODOs introduced or explained hereafter
- Two XXX notes of what to improve further

github-actions · 2024-10-08T18:15:06Z

Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using arbitrary values and results are not fully deterministic and comparable to previous runs.

Metadata
Generated at	2024-10-10 09:37:18.910092424 UTC
Max. memory units	14000000
Max. CPU units	10000000000
Max. tx size (bytes)	16384

Script summary

Name	Hash	Size (Bytes)
νInitial	b512161ccb0652d7e9a0b540e4a3c808f73d6558a4bcabf374d85880	3969
νCommit	ea444d37d226e71eef73ac78d149750da977feb588900135bf9e8221	692
νHead	2253ddd95837c7aacc8635a971caaea743434152dd8dd2849bdf4162	10797
μHead	4d648ca239040b0e87901835aa11423e7aa3bd947ce6befe7db1bae8*	4508
νDeposit	1a011f23b139a6426767026bde10319546485d553219a5848cdac4e5	2993

* The minting policy hash is only usable for comparison. As the script is parameterized, the actual script is unique per head.

`Init` transaction costs

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	5097	5.81	2.30	0.44
2	5298	7.31	2.90	0.46
3	5502	8.46	3.34	0.48
5	5902	11.12	4.39	0.53
10	6904	18.21	7.20	0.65
57	16356	82.99	32.83	1.78

`Commit` transaction costs

This uses ada-only outputs for better comparability.

UTxO	Tx size	% max Mem	% max CPU	Min fee ₳
1	569	10.84	4.26	0.29
2	756	14.31	5.80	0.34
3	947	17.92	7.39	0.39
5	1317	25.56	10.73	0.49
10	2244	47.11	19.97	0.77
19	3931	94.71	39.81	1.38

`CollectCom` transaction costs

Parties	UTxO (bytes)	Tx size	% max Mem	% max CPU	Min fee ₳
1	57	560	19.87	7.59	0.39
2	114	675	26.96	10.28	0.47
3	171	782	37.47	14.23	0.59
4	226	893	47.02	17.85	0.70
5	282	1004	50.82	19.36	0.74
6	339	1120	56.62	21.60	0.81
7	393	1227	71.22	27.05	0.97
8	448	1338	86.33	32.70	1.14
9	506	1449	84.03	32.00	1.12
10	561	1560	97.90	37.23	1.28

Cost of Decrement Transaction

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	645	18.36	8.06	0.39
2	731	18.54	8.85	0.40
3	914	20.64	10.42	0.43
5	1166	23.52	13.10	0.49
10	2022	33.73	20.90	0.66
48	7752	98.66	75.63	1.83

`Close` transaction costs

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	677	20.87	9.35	0.42
2	798	22.27	10.77	0.44
3	924	23.78	12.21	0.47
5	1177	26.68	15.09	0.53
10	2053	35.45	23.61	0.70
50	8080	99.10	86.05	1.93

`Contest` transaction costs

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	694	26.81	11.49	0.48
2	863	28.94	13.36	0.52
3	1027	31.08	15.22	0.56
5	1162	33.45	17.35	0.60
10	2139	44.52	27.08	0.80
38	6462	98.52	74.61	1.77

`Abort` transaction costs

There is some variation due to the random mixture of initial and already committed outputs.

Parties	Tx size	% max Mem	% max CPU	Min fee ₳
1	4996	15.34	6.57	0.54
2	5099	21.81	9.38	0.62
3	5208	28.05	12.11	0.69
4	5308	32.16	13.78	0.74
5	5491	42.56	18.48	0.87
6	5692	48.86	21.35	0.95
7	5778	53.30	23.14	1.00
8	5882	59.79	25.97	1.08
9	5911	68.90	29.86	1.18
10	6340	77.11	33.84	1.30
11	6171	78.71	34.01	1.30
12	6554	94.06	41.03	1.49
13	6713	95.82	41.83	1.52
14	6666	96.77	42.03	1.53

`FanOut` transaction costs

Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.

Parties	UTxO	UTxO (bytes)	Tx size	% max Mem	% max CPU	Min fee ₳
10	0	0	5089	9.99	4.18	0.49
10	1	57	5123	11.64	5.10	0.51
10	5	285	5259	16.11	7.91	0.57
10	10	570	5429	21.67	11.41	0.65
10	20	1140	5769	33.94	18.89	0.81
10	30	1707	6109	44.65	25.71	0.97
10	40	2274	6445	57.33	33.37	1.14
10	50	2847	6788	68.44	40.36	1.29
10	76	4326	7669	98.68	59.14	1.71

End-to-end benchmark results

This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.

Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.

Generated at 2024-10-10 09:39:06.434594779 UTC

Baseline Scenario

Number of nodes	1
Number of txs	300
Avg. Confirmation Time (ms)	5.392762173
P99	10.46269543ms
P95	7.2618078000000015ms
P50	5.2415315ms
Number of Invalid txs	0

Three local nodes

Number of nodes	3
Number of txs	900
Avg. Confirmation Time (ms)	24.295820565
P99	42.294148189999994ms
P95	34.12213889999999ms
P50	22.8653795ms
Number of Invalid txs	0
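
For reference, summary rows like the ones above can be derived from raw confirmation times roughly as follows. This is a Python sketch under assumptions: the `summarize` helper is hypothetical, the interpolation method used by the actual benchmark is not stated in this thread, and percentiles are simply left out when fewer than two samples exist.

```python
import statistics


def summarize(confirmation_times_ms):
    """Summarize confirmation times given in milliseconds.

    The average is reported for any non-empty input. Percentiles are only
    added when at least two samples are available, since they cannot be
    interpolated from a single data point.
    """
    times = sorted(confirmation_times_ms)
    summary = {"count": len(times)}
    if times:
        summary["avg"] = statistics.fmean(times)
    if len(times) >= 2:
        # 99 cut points splitting the data into 100 equal-sized groups;
        # the k-th percentile is the cut point at index k - 1.
        cuts = statistics.quantiles(times, n=100, method="inclusive")
        summary["p50"] = cuts[49]
        summary["p95"] = cuts[94]
        summary["p99"] = cuts[98]
    return summary
```

Calling `summarize` with a single confirmation time yields only `count` and `avg`, which mirrors the idea of showing quantiles only if they can be computed.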

github-actions · 2024-10-08T18:20:14Z

Test Results

544 tests ±0	538 ✅ ±0	28m 8s ⏱️ +1m 38s
162 suites ±0	6 💤 ±0
7 files ±0	0 ❌ ±0

Results for commit a48d056. ± Comparison against base commit dff6655.

♻️ This comment has been updated with latest results.

hydra-cluster/bench/Bench/EndToEnd.hs

hydra-cluster/hydra-cluster.cabal

noonio

Minor comments; happy to merge if all the tests pass!

noonio · 2024-10-09T09:53:11Z

In fact the network tests are failing - https://github.com/cardano-scaling/hydra/actions/runs/11241196905/job/31255255123?pr=1693

Same transaction style (single re-spending txs), but a deliberately shorter sequence of transactions (3000 -> 300) to have shorter benchmark run-times, while the sequence should still be long enough to identify regressions. Generated with the invocations `cabal run bench-e2e -- single --cluster-size 1 --scaling-factor 10` and `cabal run bench-e2e -- single --cluster-size 3 --scaling-factor 10`, plus some manual amending of the JSON to contain a "title".

This further decouples the bench-e2e binary, which just produces load and provides statistics, from how the hydra-nodes are run. Now the only assumption is that the 'hydra-cluster/config/credentials/faucet.sk' owns funds on the given network.

ch1bo force-pushed the fix-bench-standalone branch 3 times, most recently from 506062b to 9eb745d on October 8, 2024 18:09

ch1bo self-assigned this Oct 8, 2024

ch1bo requested a review from a team October 8, 2024 18:11

ch1bo added the red bin label Oct 8, 2024

noonio reviewed Oct 9, 2024

hydra-cluster/bench/Bench/EndToEnd.hs

noonio reviewed Oct 9, 2024

hydra-cluster/hydra-cluster.cabal

noonio approved these changes Oct 9, 2024

noonio force-pushed the fix-bench-standalone branch 2 times, most recently from 0b81249 to 1dbd899 on October 9, 2024 12:29

ch1bo force-pushed the fix-bench-standalone branch from 1dbd899 to 5b34aa0 on October 9, 2024 12:32

ch1bo and others added 15 commits October 10, 2024 11:33

Fix hydra-cluster bench single by setting a hard-coded fee

dbc0d84

This is not ideal, but a lot simpler than doing proper fee calculation. It's unclear why fee calculation was removed before; it is needed when running benchmark scenarios.
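
To illustrate the trade-off, here is a minimal Python sketch of a chain of self-transfers paying a fixed fee, checked against a linear minimum fee. Everything in it is an assumption made for illustration: the fee parameters are mainnet-like values that the benchmark's devnet may not share, the `FeeTooSmall` exception only mimics the FeeTooSmallUTxO ledger error, and the fee actually hard-coded in bench-e2e is not shown in this thread.

```python
# Assumed linear fee parameters in lovelace; the devnet may use other values.
MIN_FEE_PER_BYTE = 44
MIN_FEE_CONSTANT = 155_381


class FeeTooSmall(Exception):
    """Stand-in for the ledger's FeeTooSmallUTxO error."""


def min_fee(tx_size_bytes):
    """Minimum fee of a transaction without scripts: a * size + b."""
    return MIN_FEE_PER_BYTE * tx_size_bytes + MIN_FEE_CONSTANT


def self_transfer_chain(initial_lovelace, fee, count, tx_size_bytes):
    """Return the output values of `count` chained self-transfers.

    Each transaction spends the previous output back to the same key and
    pays the same fixed `fee`, so the value shrinks by `fee` every step.
    """
    required = min_fee(tx_size_bytes)
    if fee < required:
        raise FeeTooSmall(f"fee {fee} is below the minimum fee {required}")
    outputs = []
    value = initial_lovelace
    for _ in range(count):
        value -= fee
        if value < 0:
            raise ValueError("not enough funds to pay for the whole chain")
        outputs.append(value)
    return outputs
```

A fixed fee only works as long as it stays above the minimum for the largest transaction in the sequence, which is the price paid for skipping proper fee calculation.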

Remove redundant bench-e2e mode of single + workdir set

ea1d38f

This is redundant and can be achieved by using the 'datasets' subcommand.

Write generated dataset to outputDirectory if given

d905ddb

Previously, it was written to a random temporary directory, which made it annoying to generate datasets with this mode.

Write results.csv in --output-directory

8f878a0

The hydra-cluster benchmarks now only use a single directory to store the whole state, which is temporary unless a specific output-directory is requested.

Small module re-org in HydraNode.hs

2b1dfe5

Drop need of party in benchmark scenario

215bc71

Use a bit more non empty lists in hydra-cluster

46ee433

Switch to only use self-transfers in benchmarks

a6f74a6

This reduces some code duplication without much loss of expressiveness (which key we use does not matter).

Separate client keys and hydra node keys

f4733d4

Fail if output directory is not empty

2c5d063

Start with an empty output directory

9317e88

Only show quantiles if they can be computed

b0bade5

Use and seed hydraNodeKeys in demo mode

5a4718c

As before, bench-e2e does not assume the hydra node keys to be seeded. This ties the bench-e2e binary (which hard-codes Alice, Bob and Carol) to the configurable list of --hydra-client to connect to.

ch1bo force-pushed the fix-bench-standalone branch from 4b50543 to a48d056 on October 10, 2024 09:33

ch1bo added this pull request to the merge queue Oct 10, 2024

Add a comment to the benchmark scenario

ec21ac0

ch1bo removed this pull request from the merge queue due to a manual request Oct 10, 2024

ch1bo enabled auto-merge October 10, 2024 10:25
