Team work on getting blktap in recent kernel #325
Comments
For 2/3: I think the goal is to use NBD instead of the block device, so we do not require a kernel patch anymore. |
I'm not thinking about 4.19 but about more recent kernels (cf. the title of the issue). Something "future proof". So far, it's really hard to merge the current blktap code into 5.x kernels. Regarding 2 and 3, if you use NBD, what about the VHD format? Is it mutually exclusive with it? So in short, you'll use NBD directly between the guest and dom0? |
No, we'll use NBD from qemu to tapdisk without going via /dev/tdX (symlinked to /dev/sm/backend/XX). Those devices will be deprecated and removed, because the kernel patch that provides them cannot be forward-ported to any 5.x or later kernel. Tapdisk will still write the data to VHD files/volumes. |
We know that the current blktap patch will not be accepted into the upstream kernel (it's been asked previously), so the correct way forward is to not use it at all. |
The NBD refit is almost complete, with only a few bugs to fix in the toolstack. Regarding the use of io_uring, I was hoping to extend the work done in ba79b73, but if you have a similar plugin architecture that's fine. We need some thought about how to run tapdisk fully under systemd. |
We'll be happy to help! Do you have any preliminary performance results from switching from blktap to NBD? (If I'm correct in my recap of what you say here.) We can also help on the testing side of things, as long as we know what branches to build together to generate RPM packages and start to do the bench work (we have some automated scripts to do nice About your systemd questions, feel free to share your questions/blockers, more brains could mean faster thinking sometimes! |
Hi everyone,
I'm not sure I understand. Is this diagram up to date? What is qemu used for here? If I understand correctly, the goal is to use a qemu instance in the host user space to read/write the VM disk using blkback instead of the blktap patch or another driver? The data is then forwarded to tapdisk using NBD, isn't it? So to use a VM disk, we must have a qemu process + tapdisk just to read/write using NBD. Why not use qemu directly without tapdisk, like qemu-dp in SMAPIv3 but with VHD support? Thank you in advance for your clarification! |
HVM guests have a qemu process which handles the "emulated" devices required at boot. This is the primary thing requiring the /dev/blktap/blktapX devices. Once the PV drivers have loaded in the guest, all disk IO is performed via the PV ring direct to the tapdisk user space process. The diagram above is incorrect and shows blkfront communicating with in-kernel blktap. This doesn't happen, and hasn't been the case since blktap3. The qemu "device model" process is also capable of taking its boot disk parameter as an NBD URL (as is used in the case of the Citrix GFS2 SR, which doesn't provide a Dom0 block device interface), and the userspace tapdisk process also has an NBD server in it. We can therefore change the consumers of the block device to use NBD and thus no longer require the unsupportable in-kernel device, at which point the kernel patch can be dropped entirely. This will need to happen before moving to a 5.x kernel. |
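To illustrate what "an NBD consumer" sees when it connects to a server socket, here is a minimal Python sketch that classifies the first 16 bytes of an NBD server greeting as oldstyle or newstyle negotiation. The magic numbers come from the NBD protocol specification; the function names and the idea of probing a Unix socket path are assumptions made for illustration, and nothing here is taken from the tapdisk or qemu sources (which negotiation style tapdisk's server uses is not stated in this thread).

```python
import socket
import struct

# Magic values from the NBD protocol specification.
NBD_INIT_MAGIC = 0x4E42444D41474943      # ASCII "NBDMAGIC", sent first by every server
NBD_OLDSTYLE_MAGIC = 0x0000420281861253  # second field in oldstyle negotiation
NBD_OPTS_MAGIC = 0x49484156454F5054      # ASCII "IHAVEOPT", second field in newstyle


def classify_greeting(data: bytes) -> str:
    """Return "oldstyle" or "newstyle" for the first 16 bytes of a server greeting."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes of greeting")
    # Both fields are big-endian unsigned 64-bit integers.
    init_magic, second_magic = struct.unpack(">QQ", data[:16])
    if init_magic != NBD_INIT_MAGIC:
        raise ValueError("not an NBD server greeting")
    if second_magic == NBD_OPTS_MAGIC:
        return "newstyle"
    if second_magic == NBD_OLDSTYLE_MAGIC:
        return "oldstyle"
    raise ValueError("unknown NBD negotiation magic")


def probe_unix_socket(socket_path: str) -> str:
    """Hypothetical helper: connect to an NBD server on a Unix socket and classify it."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        buf = b""
        while len(buf) < 16:  # recv may return fewer bytes than requested
            chunk = sock.recv(16 - len(buf))
            if not chunk:
                raise ConnectionError("server closed the connection during greeting")
            buf += chunk
    return classify_greeting(buf)
```

`probe_unix_socket` only reads the greeting and then disconnects, so it does not complete a negotiation; a real client such as qemu continues with option haggling (newstyle) or reads the export size and flags (oldstyle).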
Oh! I see, it's more clear now! I had trouble understanding the architecture, so you were talking about qemu-dm! 🙂 So NBD or the blktap patch is only used for HVM or PVM during the boot phase. Thank you so much! |
Boot or before PV drivers are installed in the case of Windows, yes. |
I ported blktap2 to 5.4: OpenXT uses tap-ctl to extract kernels and hash disk images during pre-boot. Having said that, I wish for it to go away. Will tapdisk work today without the blktap2 module? |
@olivierlambert do you have a pointer to your blktap2 patch? I had to fix some bugs, so I'm interested to compare. OpenXT is tightly bound to tap-ctl and blktap2, so it's not something I am looking to tackle any time soon, sorry. |
It won't currently work without as all the |
@Wescoeur do you remember where we stored your draft work on getting a recent kernel working? @MarkSymsCtx the best course will obviously be (for XCP-ng and likely Citrix Hypervisor) to get rid of this legacy stuff entirely. As we said earlier, we'll be happy to spend some time and effort on this 👍 |
A large chunk of it is sort of done, basically replacing use of /dev/tdX in Dom0 with NBD to the NBD socket provided by the server in Tapdisk and this completes end to end testing. But the next bits are to update what goes into XenStore as backend configuration keys, hint |
@olivierlambert @jandryuk We have it in an internal GitLab. It's a quick and dirty PoC but I can create a public repo. @MarkSymsCtx Thank you for the details! 👍 |
Hi @MarkSymsCtx, We are very interested in speeding up the removal of this patch and contributing to this repo. If I don't write the "physical-device" property I can start a VM using an nbd+unix URI with the socket path (I already patched my local smapi to do that), but I don’t understand where the change between emulation and PV driver occurs. What piece of code is in charge of this, please? Is there a piece of code to modify in tapdisk or elsewhere to finalize this switch? |
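For reference, the nbd+unix URI form mentioned above can be sketched in Python as follows. The layout `nbd+unix:///<export>?socket=<path>` follows the NBD URI specification; the helper names and the socket path in the usage note are hypothetical and are not taken from the SMAPI patch described in this comment.

```python
from urllib.parse import parse_qs, quote, unquote, urlsplit


def build_nbd_unix_uri(socket_path: str, export_name: str = "") -> str:
    """Build an nbd+unix URI pointing at a Unix domain socket."""
    # Keep "/" readable in the socket path; percent-encode everything unsafe.
    return "nbd+unix:///{}?socket={}".format(
        quote(export_name, safe=""), quote(socket_path, safe="/")
    )


def parse_nbd_unix_uri(uri: str) -> tuple[str, str]:
    """Return (socket_path, export_name) from an nbd+unix URI."""
    parts = urlsplit(uri)
    if parts.scheme != "nbd+unix":
        raise ValueError("expected an nbd+unix URI")
    sockets = parse_qs(parts.query).get("socket")
    if not sockets:
        raise ValueError("missing socket= parameter")
    # The export name is the path without its single leading slash.
    export_name = unquote(parts.path[1:]) if parts.path.startswith("/") else ""
    return sockets[0], export_name
```

Usage, with a made-up socket path: `build_nbd_unix_uri("/run/example/nbd.sock")` yields a URI with an empty export name, and `parse_nbd_unix_uri` recovers the socket path from it.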
@MarkSymsCtx Do you have more info please? 🙂 |
Not at this time, no I don't |
Hi there!
We started to think about various changes on blktap, especially to achieve these goals:
Obviously, doing this together and going in the right direction after some architecture decisions might be faster for everyone and deliver better performance for both Citrix Hypervisor and XCP-ng. What do you think? Where would you like to start?