
SMT pinning is broken/wrong #9

Open
gnif opened this issue May 21, 2022 · 8 comments


gnif commented May 21, 2022

Hi, I do not use your scripts, but we are seeing users in the Looking Glass Discord who are having latency-related issues due to how your script assigns CPUs to the VM.

The issue is that you are not replicating the host topology into the guest. If done properly, the guest can know that each extra vCPU is sharing a core, and even the L1/L2/L3 cache arrangement.

Here is how a guest sees a properly configured VM on an SMT host (using Coreinfo):
[screenshot: Coreinfo output from the guest]

With this in place, the guest scheduler can make wise decisions about where to run each thread. Obviously you need to pin each vCPU to the correct thread of each core to make this work well. If done correctly, your cache mapping will also align with the physical hardware; see below.

Here is my host topology (AMD EPYC 7343):
[screenshot: lstopo output for the host]
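For reference, on a Linux host each CPU's SMT sibling can be read from sysfs (shown here for CPU 8, whose sibling on this topology is CPU 24), and lscpu -e prints the full CPU/core/cache mapping:

cat /sys/devices/system/cpu/cpu8/topology/thread_siblings_list
8,24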

My guest is pinned to CPU cores 8-15, which means

vCPU  0 &  1 = CPU  8 & 24
vCPU  2 &  3 = CPU  9 & 25
vCPU  4 &  5 = CPU 10 & 26
vCPU  6 &  7 = CPU 11 & 27
vCPU  8 &  9 = CPU 12 & 28
vCPU 10 & 11 = CPU 13 & 29
vCPU 12 & 13 = CPU 14 & 30
vCPU 14 & 15 = CPU 15 & 31

When done correctly, you can see that my pinning aligns with the cache map and allows the guest to make proper use of SMT.
[screenshot: guest cache map aligned with the host]

Note: AMD processors require the QEMU CPU flag topoext for the guest to be able to use SMT.
Note 2: To get the cache to align, you also have to set the QEMU CPU flags l3-cache=on,host-cache-info=on.
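As a sketch, the relevant QEMU command-line fragment for the guest above would be roughly:

-cpu host,topoext=on,l3-cache=on,host-cache-info=on
-smp 16,sockets=1,cores=8,threads=2

On Proxmox, extra flags like these typically go on the args: line of /etc/pve/qemu-server/<vmid>.conf.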

@Onepamopa

How do you output the CPU-to-cache map?


gnif commented May 22, 2022

I used lstopo on Linux for this graphic, and on Windows, Coreinfo from Sysinternals:
https://docs.microsoft.com/en-us/sysinternals/downloads/coreinfo
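Usage sketch:

lstopo topology.png    (on the Linux host; lstopo is part of the hwloc package)
coreinfo -c -l         (in the Windows guest; dumps core and cache mappings)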

@Onepamopa

Btw, where do I set topoext & l3-cache=on,host-cache-info=on?


gnif commented May 22, 2022

Issues are to direct the author of this project to a problem with their software, not to provide you with support.


ayufan commented May 22, 2022

Thank you @gnif. This is known. However, as the docs say, you should only pass physical threads, not virtual ones: https://github.com/ayufan/pve-helpers#21-cpu_taskset. And depending on the CPU, the mapping is different.

Maybe the one thing missing is documenting how to deal with the L3, as when this was written there was no need to support a NUMA/many-complexes scenario.

Technically it is possible to replicate the full SMT topology, but at least I did not find it useful or necessary to do physical-to-virtual CPU pinning of everything. Doing that is theoretically possible, but only libvirt supports it well.


gnif commented May 22, 2022

@ayufan if I am understanding you correctly, you're saying to put two VMs on the same set of cores, but on separate threads? If so, this is a very, very bad idea: the VMs will stall each other and invalidate each other's caches.

According to your own documentation:

VM 1:
cpu_taskset 1-5

VM 2:
cpu_taskset 7-11

Based on that configuration, VM 1 would be on thread 1 of cores 1-5, and VM 2 would be on thread 2 of the same cores 1-5.

There is no such thing as a "virtual core" on the host system; both threads of a core are equal in every way. They are two identical pipelines running through shared hardware, which can cause them to stall each other. There is no "primary" thread, no "real" vs. "virtual" thread.

If the guest OS knows about the SMT model, the guest scheduler can ensure that high-priority threads, like those that service interrupts for GPUs, are put onto cores that can guarantee the best possible latency.

Note: I am not stating this because I think it's a problem; I am stating this because it is a problem. We have people coming into our Discord reporting issues with Looking Glass that are the result of very poor configurations produced by this script. Looking Glass relies on low-latency servicing of its threads and of the GPU's driver, as its goal is to be as low latency as possible.

but at least I did not find it useful

This is just it: you did not, due to your use case, but I am stating for a fact that it makes a huge difference under certain workloads. You need to fix your scripts for those using such workloads, or stop promoting them.


ayufan commented May 22, 2022

If so, this is a very, very bad idea: the VMs will stall each other and invalidate each other's caches.

You are fully correct; of course they will. I can imagine this being a problem in the case of Looking Glass, which effectively requires two systems to have low latency.

In my case, where I don't use Looking Glass and rather use a single VM at a time (but have all of them running), latency was not a problem, since the other VM is mostly idle.

How do you advise users to handle many VMs? Probably in this setup you expect VMs not to share physical cores, but rather to pass full SMT cores to them.

Anyway, I see this being a problem and am happy to document those caveats. Do you have a link that would be best for redirecting people?


gnif commented May 22, 2022

In my case, where I don't use Looking Glass and rather use a single VM at a time (but have all of them running), latency was not a problem, since the other VM is mostly idle.

In this case, I would suggest you halve the number of cores you give to your VMs and give them both threads of each core; you will see a general performance uplift due to better management of your hardware.
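As a sketch of what that looks like with these scripts, assuming a 12-core/24-thread host where thread siblings are N and N+12, and assuming cpu_taskset passes its argument through to taskset -c (which accepts comma-separated ranges):

VM 1:
cpu_taskset 1-4,13-16

VM 2:
cpu_taskset 5-8,17-20

Each VM then owns whole physical cores, so the two VMs can no longer stall each other or thrash each other's caches.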

Do you have a link where best to redirect people?

Not really, as we are just supporting people reporting issues with LG. Perhaps the VFIO Discord/subreddit?
