It's nice to see support for Vulkan in QEMU actually getting somewhere. Being able to run modern accelerated workloads inside a VM (without dealing with SR-IOV) is pretty cool and definitely has some use cases.
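For anyone who wants to poke at it, this is roughly the shape of the command line, as far as I understand it. It needs a recent QEMU (9.2 or newer, if I remember right) built against a virglrenderer with Venus enabled; sizes, the disk image, and the display backend below are just placeholders:

    qemu-system-x86_64 \
        -machine q35,accel=kvm,memory-backend=mem1 \
        -cpu host -smp 4 -m 4G \
        -object memory-backend-memfd,id=mem1,size=4G \
        -device virtio-vga-gl,hostmem=4G,blob=true,venus=true \
        -display gtk,gl=on \
        -drive file=disk.qcow2,if=virtio,format=qcow2

The guest side still needs Mesa with the Venus Vulkan driver for anything to show up in vulkaninfo.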
Features like this are why I prefer using QEMU directly rather than an abstraction like libvirt on top of QEMU.
Graphical interfaces like virt-manager are nice at first, but I don't need an abstraction on top of multiple hypervisors to make them all look superficially the same, because they're not. Eventually the abstraction breaks down and gets in the way.
I need the ability to use the full capability of QEMU, so I'll write a shell script to manage the complexity of the arguments. At least I don't have to deal with XML, validation, and struggling to enable the options I want that only one specific emulator supports, which libvirt doesn't expose because they're not common to all of its backends.
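For what it's worth, the script doesn't need to be clever; mine is roughly this shape (names, paths, and the chosen devices are just illustrative):

    #!/bin/sh
    # launch-vm.sh: thin wrapper so all the interesting knobs live in one place
    NAME=devbox
    DISK=/var/lib/vms/$NAME.qcow2
    exec qemu-system-x86_64 \
        -name "$NAME" \
        -machine q35,accel=kvm \
        -cpu host -smp 4 -m 8G \
        -drive file="$DISK",if=virtio,format=qcow2 \
        -nic user,model=virtio-net-pci \
        -display none -serial mon:stdio \
        "$@"    # extra flags per run, e.g. -cdrom installer.iso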
I like that libvirt integrates with firewalld. libvirt via virt-manager also provides quick options for DNS.
My fear is that this would be a lot of wrangling with QEMU before I get there. I am not fond of virt-manager (the UI is clunky), but for setting up a machine it is really helpful.
Personally I'm very lazy, so I just make a virtual bridge and force QEMU to use it for everything, putting all my VMs on my local network.
I totally understand that not everyone can do this, which is why I asked the question. I'd be interested in exploring what you would prefer the network topology to look like.
Having a virtual network on a machine would mean running a DNS/DHCP server (I think dnsmasq can actually do both by itself) for ease of use, but I think I could give you a five-line bash script that does basically what you want, depending on what exactly that is (roughly the sketches at the end of this comment).
The normal "internal" network topology ends up giving you an outbound NAT to the local network (to, eventually, get onto the internet) which, I personally really dislike.
> I'd be interested in exploring what you would prefer the network topology to look like.
I tried to tightly restrict my virtual machine with just an allow list (done via firewalld), while at the same time allowing the VM to query the (physical) LAN for DNS-SD.
Tbh, I could not get the latter to work directly. I ended up letting my host function as a DNS-SD reflector.
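In case it helps anyone, the pieces I ended up with looked roughly like this; the zone name is libvirt's own firewalld zone, the exact allow list is whatever your setup needs, and the reflector bit is plain Avahi config:

    # drop everything from the VM network except the explicitly allowed services
    firewall-cmd --permanent --zone=libvirt --set-target=DROP
    firewall-cmd --permanent --zone=libvirt --add-service=dhcp
    firewall-cmd --permanent --zone=libvirt --add-service=dns
    firewall-cmd --permanent --zone=libvirt --add-service=mdns
    firewall-cmd --reload

    # on the host, let avahi reflect DNS-SD between the LAN and the VM network:
    # /etc/avahi/avahi-daemon.conf
    #   [reflector]
    #   enable-reflector=yes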
> virtual bridge
Does that work with WLAN? libvirt creates a bridge, but with or without NAT it could not let the VM participate like a normal LAN client. I thought that was a limitation of bridging over wireless LAN.
It's possible to create a custom network for libvirt, but you have to add a static route on the router for the other hosts in your LAN to see the VMs.
Using virsh, you can dump the default network (the default bridge libvirt creates) with net-dumpxml, modify it, and create another network from it. Add the modified file with net-create (non-persistent) or net-define (persistent).
This way the VMs can participate in the LAN and, at the same time, the LAN can see your VMs. Works with wifi and doesn't depend on having workarounds for bridging wifi and ethernet. Debian has a wiki entry on how to bridge with a wireless nic [0] but I don't think it's worth the trouble.
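Roughly the workflow, with made-up names and addresses (the XML is just the dumped default with the forward mode, bridge name, and subnet changed):

    # starting point: virsh net-dumpxml default > routed.xml, then edit it to roughly:
    cat > routed.xml <<'EOF'
    <network>
      <name>routed0</name>
      <forward mode='route' dev='wlan0'/>
      <bridge name='virbr1' stp='on' delay='0'/>
      <ip address='192.168.100.1' netmask='255.255.255.0'>
        <dhcp>
          <range start='192.168.100.10' end='192.168.100.200'/>
        </dhcp>
      </ip>
    </network>
    EOF
    virsh net-define routed.xml    # or net-create for a non-persistent network
    virsh net-start routed0
    virsh net-autostart routed0
    # and on the LAN router: add a static route for 192.168.100.0/24
    # pointing at the host's LAN address so other machines can reach the VMs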
This isn't SR-IOV, which is a hardware feature for virtualizing GPUs. The problem is the OEMs that gate that feature to enterprise products. Few people buy those, so the state of the software ecosystem for virtual GPUs is terrible.
Intel used to have GVT-g hardware virtualization on their integrated GPUs from Broadwell up. I haven't tried it myself, but I know people who used it and liked it back then. All good things come to an end though, and Intel scrapped it for Rocket Lake.
I would've gone and bought Intel ARC dGPUs for my Proxmox cluster if they supported hardware virtualization on their consumer line.
You don't even necessarily get it with enterprise products; last time I checked, Nvidia requires additional CAL-type licenses installed on a "certified" server from the "Nvidia Partner Network", while AMD and Intel limit it to very specific GPU product lines targeted at VDI (i.e. virtualizing your employees' "desktops" in a server room a la X/Citrix terminals).
At that point just run the code inside a chroot with a full /dev and call it good enough. No common GPU driver, firmware or hardware was designed to securely run really untrusted code from multiple tenants.
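i.e. something like this, where the rootfs path is made up and isolating /dev is deliberately given up on:

    ROOT=/srv/gpu-chroot                       # some throwaway rootfs
    sudo mount --bind /dev  "$ROOT/dev"        # full host /dev, GPU nodes included
    sudo mount -t proc  proc "$ROOT/proc"
    sudo mount -t sysfs sys  "$ROOT/sys"
    sudo chroot "$ROOT" vulkaninfo --summary   # or whatever the workload is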
The "Linux hosts Linux" case does seem the least interesting for that reason. I hope one day this results in actually usable acceleration of hosting a windows VM.
WebGL / WebGPU are a somewhat safe subset. Or at least safe enough that Google will keep funding multi-million pwn2own bounties for Chrome with WebGL / WebGPU enabled.
Ignorant question: how's this different from qemu-virgl? I've been using the latter (installed from Homebrew) for the last few years, passing --device virtio-vga.
Does this mean you can run CUDA applications inside a QEMU VM? The equivalent of --gpus=all for Docker, but now in an isolated VM? Does this permit sharing of the GPU inside a VM?
I think this would depend on Virtio-GPU Native Context, which, if I recall correctly from the qemu-devel mailing list, is the next natural progression from Virtio-GPU Vulkan.
Edit 2: For further clarity, Virtio-GPU Native Context would permit running the native GPU drivers (with some modifications; minimal is what I remember being claimed) inside a VM.
If a malicious program has access to the GPU, directly or via some buggy interface, the whole system is at risk. There is no "safe" GPU virtualization like there is with CPUs.
I have even used it in Windows to make a legacy proprietary OpenGL application work properly with recent Windows versions + a mobile (now unsupported) AMD GPU.
I use Zink for some games that rely on OpenGL, since it works better with MangoHud as a Vulkan layer. For example, all the games that need ScummVM or DOSBox.
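For reference, this is how I force it; the environment variable is Mesa's standard driver override, and the game command is just an example:

    # run an OpenGL app through Zink so MangoHud can hook it as a Vulkan layer
    MESA_LOADER_DRIVER_OVERRIDE=zink mangohud scummvm
    # as a Steam launch option: MESA_LOADER_DRIVER_OVERRIDE=zink mangohud %command%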
Someone wake me up when libvirt/virt-manager supports it, because I can't get the regular virtio-gpu acceleration working either... something something SPICE doesn't support it...
Then look back at 2.2.6: it supported up to 6.10. A far cry from only supporting up to 6.6, so I'm not seeing where they were going with their initial statement until they define what they mean by stable.
That's a fair point and I don't disagree. I guess my main point of contention was the implication that either a) ZFS wasn't stable on anything non-LTS or b) the Linux kernels themselves were unstable outside of an LTS.
What stable means in this case is subject to individual use cases. In my case, I don't find having to wait a bit for ZFS to catch up despite being on an EOL kernel to be catastrophic, but after having some time to think, I can see why someone would need an LTS kernel.
I think we are on the same page. To clarify: if your goal is to be on stable ZFS AND non-EOL Linux kernel, then LTS kernel is usually the only option. There may be windows where there are non-LTS-non-EOL kernels supported, but non-LTS kernels go EOL very quickly, so those windows are fleeting.
This impacts distributions like NixOS in particular, which have a strict policy of removing EOL kernels.
Woah woah woah don't let me dissuade you from NixOS. I am still a happy NixOS+ZFS user, and my fingers are crossed that I'll soon get to upgrade to kernel 6.12 :)
No worries on that front. I expect that fun fact to be just a minor setback, but I'm still pretty dead set on making my personal infrastructure declarative, reproducible, and anti-hysteresis.
Honestly I wouldn't even try running ZFS on anything but a distro that ships it, like Ubuntu or its variants, or a distro with long-term support, like AlmaLinux 9.