Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide pre-compiled binary support for Broadwell (ADX, BMI2) binaries #3198

Closed
tersec opened this issue Dec 15, 2021 · 8 comments
Closed

Provide pre-compiled binary support for Broadwell (ADX, BMI2) binaries #3198

tersec opened this issue Dec 15, 2021 · 8 comments

Comments

@tersec
Copy link
Contributor

tersec commented Dec 15, 2021

This is phrased to encompass any of several approaches to this (e.g., wholly distinct monolithic binary images, runtime-selected shared libraries, et cetera).

All block_sim runs used 2000 validators and 160 slots. They were randomly interleaved to avoid one set being coincidentally aligned with something external potentially affecting performance.

With -march=haswell, the Intel x86 microarchitecture immediately preceding Broadwell, 15 sampled times for this block_sim configuration:

81.23s
79.44s
80.08s
79.53s
77.11s
78.15s
77.02s
78.33s
77.67s
76.66s
77.34s
77.44s
79.65s
80.05s
80.14s

With -march=broadwell, 15 sampled times for this block_sim configuration:

62.25s
61.81s
61.86s
61.21s
60.70s
61.16s
61.66s
62.06s
60.22s
61.70s
60.54s
60.79s
63.36s
62.83s
61.26s

https://github.com/status-im/nimbus-eth2/blob/stable/docs/cpu_features.md#bmi2--adx explains this effect and provides further microbenchmarks.

@tersec
Copy link
Contributor Author

tersec commented May 31, 2022

https://lighthouse-book.sigmaprime.io/installation-binaries.html documents Lighthouse's approach.

@arnetheduck
Copy link
Member

my preferred solution would involve dynamic runtime selection rather than separate binaries - one approach is to fork blst such that the different platform options each get a fully separate symbol namespace (at the C level), avoiding the need to do dynamic loading (which is messy for all kinds of reasons) - we'd probably have to do a bit of C surgery to get that done, but hopefully it can also be upstreamed - there's a lot of demand for this.

@arnetheduck
Copy link
Member

their objections so far have been that the optimized operations are too "low-level" to warrant dynamic runtime selection - instead, I'd aim to get a separate symbol namespace all the way to the "black box primitives", ie sign_adx, sign_generic etc and a sign that either uses ifunc or simple pointers based on cpuid

@tersec
Copy link
Contributor Author

tersec commented Apr 23, 2023

Given that macOS 13 (Ventura) doesn't support pre-2017 hardware, and in particular, doesn't support the pre-Broadwell CPUs blocking requiring BMI2/ADX, combined with its not being a particularly common deployment, it might not make sense to focus on this in the near future, but rather specifically x86 Linux, where there's no such natural deprecation happening by the usual distributions, which still run on what's effectively 20-year-old retrocomputing with their current amd64 releases.

Because macOS 12 (Monterey) still supports the late 2013 Mac Pro, which is Haswell/pre-Broadwell, might not be able to do this in only a few months, however, but closer to a year and a half. Still seems of debatable value to devote too much attention to something that will resolve itself naturally when macOS 12 support can be dropped.

@tersec
Copy link
Contributor Author

tersec commented May 23, 2023

supranational/blst#163 would/will make this substantially more feasible.

@tersec
Copy link
Contributor Author

tersec commented Jun 13, 2023

163 was replaced by supranational/blst#165 with similar effect, and BLST has now merged run-time detection. This issue should be resolvable by updating BLST and using required settings to enable runtime detection.

@tersec
Copy link
Contributor Author

tersec commented Aug 14, 2023

https://github.com/supranational/blst/releases/tag/v0.3.11 implements runtime switching required for this. So at this point, probably no special action should be taken to implement this issue. In the due course of eventually updating BLST, this will be resolved.

@tersec tersec closed this as completed Aug 14, 2023
@tersec
Copy link
Contributor Author

tersec commented Aug 16, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants