vpternlogd latencies on Zen4 #29

Open
amonakov opened this issue Jul 10, 2023 · 0 comments

On Zen 4, the summary of the vpternlogd latency experiments is given as

Latency operand 1 → 1: 1
Latency operand 2 → 1: 2
Latency operand 3 → 1: 1

https://uops.info/html-lat/ZEN4/VPTERNLOGD_ZMM_ZMM_ZMM_I8-Measurements.html

but I don't see a substantial difference between the 3 → 1 and 2 → 1 experiments, or any difference w.r.t. its vpternlogq sibling, where all latencies are listed as 1. Shouldn't both the dword and qword variants be listed with latency 2 for operands 2 and 3? What am I missing?

If I'm reading Agner's testing harness right, his latency experiment times

vpternlogd zmm0, zmm1, zmm2
vpternlogd zmm2, zmm1, zmm0

repeated 50 times. He lists the latency of ternlog on Zen 4 as 1 cycle in all cases (but if the latency from the second operand is indeed 2, his experiment wouldn't uncover that).
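
To illustrate why that pair cannot expose a slower second operand, here is a rough sketch of a toy dataflow model. It is not a measurement and not part of either harness: each instruction is assumed to complete once every source is ready plus a per-operand latency, and throughput limits and all other microarchitectural effects are ignored. The names `finish_time`, `steady_state_cpi`, `AGNER_PAIR` and `OP2_CHAIN` are made up for this example, and `OP2_CHAIN` is a hypothetical sequence that routes the dependency through operand 2.

```python
# Toy model: operand 1 is both source and destination, operands 2 and 3 are sources.
# lat[i] is the assumed latency from operand i to the destination.

def finish_time(pattern, lat, repeats):
    """Cycle at which the last result is ready after `repeats` passes over `pattern`."""
    ready = {}  # register name -> cycle at which its value becomes available
    for _ in range(repeats):
        for ops in pattern:
            # The destination (ops[0]) is ready once the slowest input path arrives.
            ready[ops[0]] = max(
                ready.get(reg, 0) + lat[i] for i, reg in enumerate(ops, start=1)
            )
    return max(ready.values())


def steady_state_cpi(pattern, lat, repeats=50):
    """Cycles per instruction, with the start-up transient subtracted out."""
    short = finish_time(pattern, lat, repeats)
    long = finish_time(pattern, lat, 2 * repeats)
    return (long - short) / (repeats * len(pattern))


# The pair quoted above: zmm1 (operand 2) is never written, so it is off the chain.
AGNER_PAIR = [("zmm0", "zmm1", "zmm2"), ("zmm2", "zmm1", "zmm0")]
# Hypothetical alternative: each result feeds the next instruction as operand 2.
OP2_CHAIN = [("zmm0", "zmm2", "zmm1"), ("zmm2", "zmm0", "zmm1")]

ALL_ONE = {1: 1, 2: 1, 3: 1}
OP2_SLOW = {1: 1, 2: 2, 3: 1}  # the latencies listed in the summary

if __name__ == "__main__":
    for name, pattern in (("AGNER_PAIR", AGNER_PAIR), ("OP2_CHAIN", OP2_CHAIN)):
        for label, lat in (("all 1", ALL_ONE), ("op2 = 2", OP2_SLOW)):
            print(name, label, steady_state_cpi(pattern, lat))
```

In this model the quoted pair reports 1 cycle per instruction for both latency tables, because the only register feeding operand 2 is never written, while the hypothetical `OP2_CHAIN` sequence reports 1 versus 2.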

(unfortunately I do not have access to a Zen 4 machine to run more experiments)
