Hacker News
Vulkan Video Decoding (wickedengine.net)
137 points by ibobev on May 7, 2023 | 37 comments



I haven't really done a whole lot with Vulkan yet, but this really makes me curious whether Vulkan could ever find itself as an alternative to VAAPI on Linux in some cases. Vulkan for compute purposes already seems like a pretty compelling idea as it is, but in both cases I wonder (and was unable to ascertain an answer with a cursory glance) whether it is possible today, or ever will be, to expose devices to Vulkan that are not GPUs (like compute accelerators). Perhaps it's not even really a good idea :) The case is probably stronger for compute devices a la OpenCL, since at least for video, if you're targeting Linux anyway, it seems like you could use dma-bufs to pass buffers between Vulkan, VA-API and other stuff. Still, it does inspire curiosity.

A cursory search did seem to suggest encoding was supported as well as decoding in Vulkan, which is very cool.


On Linux there is already an alternative to VAAPI: The Video4Linux Memory-2-memory (V4L2-M2M) API, which provides a standardised API for hardware encode and decode accelerators (and supports using DMA-bufs for copy-free decoding or using decoded frames as textures). I don't know how much support there is for AMD/nvidia/intel but it works well on the Raspberry Pi at least.

It would feel weird for Vulkan to become the standard API for encode/decode given that (as you say) video accelerators aren't necessarily tied to the GPU (although not necessarily weirder than doing it through an extension of the webcam API).


Truth be told, the Raspberry Pi is the only device I've ever used v4l2 to do video encoding with. It does seem to work well enough, but since I was using it via GStreamer, I have no idea what the API looks like. The advantage of Vulkan is that they've already got a functional and productive committee with all of the GPU vendor stakeholders to work out the design, whereas VA-API and VDPAU seem to have somewhat different designs and each seems to be mostly championed by a single GPU vendor. I have a bit more confidence that the Vulkan designs for hardware accelerated video will be good for both hardware vendors and application developers (even if it is still low-level).

In a neighboring comment thread, hrydgard states that it is very much possible to have a Vulkan driver that has only video encoding/decoding queues, which, if true, resolves the potential awkwardness in my opinion. That's especially relieving since Vulkan looks to be a prime contender to take up the mantle from OpenCL as well; especially believable now, seeing what has been done with rusticl.

Perhaps it is still a little awkward, but Vulkan as a general hardware acceleration abstraction layer seems like it could be a very good thing. As it is, as an application developer, you're probably better off on Linux just using GStreamer or Pipewire rather than v4l2 directly, since some devices (e.g. Blackmagic Decklink) provide GStreamer integration but not v4l2 drivers. In that case, it hardly matters what API abstracts the hardware as long as it does a good job of it.


Yes, the general idea is for it to replace VAAPI for all relevant use cases.

I'd expect it also to move faster than VAAPI moves now once all the pieces are in place (like adding support for new hardware, etc.).

It might also be possible to implement actual VAAPI over Vulkan video.


Hmmm, interesting: maybe the last bit is the key to working around the issue that Vulkan drivers can really only exist for GPUs (to my knowledge, anyway). You could still have VA-API drivers for devices other than GPUs, alongside a Vulkan implementation.


You can have a Vulkan driver that only exposes a video decoder queue and no raster/compute/copy queues, not a problem.


ahh, that makes sense. Very nice. Thanks for clarifying.


> if Vulkan could ever find itself as an alternative to VAAPI

The bar to beat VAAPI is really low. For all practical purposes, VAAPI doesn't exist. So any alternate plan, even if it is essentially the same thing underneath, is worth a shot.


Except for all the practical usage that a lot of Linux users have. By that logic, Linux on the desktop doesn't exist, and yet it does. Yes, far fewer use it than Windows, but those who do still use it.


My one desktop experience is that no matter what I do with VAAPI on my netbook (the only device I have left with GNU/Linux), I only get software rendering in Chrome/FF.


I meant that vaapi is broken beyond repair and a lost cause.


This is great. It brings it in line with DirectX ( https://learn.microsoft.com/en-us/windows/win32/medfound/dir... )

It’s quite handy to be able to render a video on to spatial objects


The API you have linked is too low level for my taste.

When I wanted to play videos and render them inside D3D12 scenes, I used a higher-level one, IMFMediaEngine: https://learn.microsoft.com/en-us/windows/win32/api/mfmediae...

That OS-supplied high-level video player object can demux containers, decode, play audio, and copy uncompressed frames to D3D11 textures. Then, it’s relatively easy to use DXGI surface sharing API to share the resulting RGBA8 frames from D3D11 into D3D12.


Yeah. I was trying to do something like render a scene and encode it on-GPU before sending it to disk. Unfortunately, there’s no way to give FFMPEG video textures afaict so even if it uses hw decode, you’re copying the full pixel buffer data twice instead of the drastically smaller encoded frame once (or at least there’s no mechanism exposed that the JS ecosystem can make use of which relies on command-line ffmpeg)


There are OpenGL extensions which can import a provided GPU buffer as a texture, using those you can achieve zero-copy.

For instance, with VAAPI->OpenGL you would use vaExportSurfaceHandle in conjunction with glEGLImageTargetTexture2DOES.

Check out the "hwdec" mechanism in MPV:

https://github.com/mpv-player/mpv/blob/master/video/out/hwde...

https://github.com/mpv-player/mpv/blob/master/video/out/hwde...
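Roughly, the VAAPI-to-OpenGL zero-copy path described above looks like this (an untested pseudocode outline; vaExportSurfaceHandle, eglCreateImageKHR, and glEGLImageTargetTexture2DOES are the real entry points, but all setup, attribute lists, and error handling are omitted):

```
// 1. Export the decoded VA surface as DRM-PRIME dma-buf fds.
VADRMPRIMESurfaceDescriptor desc;
vaExportSurfaceHandle(va_display, va_surface,
                      VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2,
                      VA_EXPORT_SURFACE_READ_ONLY |
                      VA_EXPORT_SURFACE_SEPARATE_LAYERS,
                      &desc);

// 2. Wrap a dma-buf fd in an EGLImage; the attrib list carries the
//    fd, offset, pitch and DRM fourcc from the descriptor.
EGLImageKHR image = eglCreateImageKHR(egl_display, EGL_NO_CONTEXT,
                                      EGL_LINUX_DMA_BUF_EXT, NULL, attribs);

// 3. Bind the EGLImage to a GL texture -- no pixel copy happens.
glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, image);
```

The decoded frame stays in GPU memory the whole time; only descriptors move between the APIs.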


Sure. I think the part that’s missing is that FFMPEG runs out of process and doesn’t deal in GPU buffers/textures.


If you use ffmpeg’s libavcodec interface then you can get it to give you the decoded framebuffers as exported DRM-prime descriptors which you can turn into textures. This is how Firefox does video decode with VAAPI, using libavcodec as a wrapper.

Edit: missed the part about JS ecosystem. You can move DRM prime descriptors between processes, but I assume you can’t do this from the ffmpeg CLI and would need to write your own little C wrapper around libavcodec
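For reference, the libavcodec path sketched above looks roughly like this (an untested outline; the usual send/receive decode loop, setup and error handling are omitted, and `codec_ctx`/`frame` are assumed to come from that loop):

```
// Create a VAAPI device and let libavcodec decode into hardware frames.
AVBufferRef *hw = NULL;
av_hwdevice_ctx_create(&hw, AV_HWDEVICE_TYPE_VAAPI, NULL, NULL, 0);
codec_ctx->hw_device_ctx = av_buffer_ref(hw);

// ... avcodec_send_packet() / avcodec_receive_frame() as usual ...

// Map the decoded AV_PIX_FMT_VAAPI frame to a DRM-PRIME frame;
// drm->data[0] then points at an AVDRMFrameDescriptor holding the
// dma-buf fds you can turn into textures.
AVFrame *drm = av_frame_alloc();
drm->format = AV_PIX_FMT_DRM_PRIME;
av_hwframe_map(drm, frame, AV_HWFRAME_MAP_READ);
AVDRMFrameDescriptor *desc = (AVDRMFrameDescriptor *)drm->data[0];
```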


As someone else said, you can achieve this with OpenGL; I also think GStreamer can do this.


I had a professional project I wanted to do in Linux but ended up going with Windows because of the 3D / video performance.


> Take a look at FFMPEG, the most popular and feature complete video library out there: it’s huge, consisting of several DLL files and lots of dependencies.

What? You can compile it with just the stuff you need and only activate components without any dependencies. Static linking is also possible if too many 'DLL' files are an issue.
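For example, a minimal build might disable everything and re-enable only what's needed (a config fragment; these flag names are from ffmpeg's `./configure --help`, but the exact set depends on what you're building):

```shell
./configure \
    --disable-everything --disable-doc --disable-programs \
    --disable-network --disable-autodetect \
    --enable-decoder=h264 --enable-parser=h264 \
    --enable-demuxer=mov --enable-protocol=file \
    --enable-static --disable-shared
```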

But vulkan video decoding is still cool.


> What? You can compile it with just the stuff you need and only activate stuff without any dependencies.

Sure, but running configure alone takes like 10-15 minutes on my PC with an Intel i7-9700K at 4.9GHz. This is while using all my cores at 100% CPU, too; it's ridiculous. And not only that, you'll likely have to run this massive script several times to tweak the config until you're happy with what you're using. Then actually compiling the library statically takes another 15-20 minutes. So I think the remark that it's huge and consists of several DLL files is spot on.

> Static linking is also possible if too many 'DLL' files is a issue.

FFmpeg doesn't recommend static linking.

> The following is a checklist for LGPL compliance when linking against the FFmpeg libraries. It is not the only way to comply with the license, but we think it is the easiest...Use dynamic linking (on windows, this means linking to dlls) for linking with FFmpeg libraries.[0]

[0]: https://www.ffmpeg.org/legal.html


It's kind of a strange comparison... FFmpeg is a library for everything and anything in video, audio and image transcoding.

The API this post writes about is basically a single decoder... without even (de)muxing ability.


Correct me if I'm wrong: when we decode a video, we either use software decoding on the CPU, or we let a specialized hardware module (usually inside a GPU) handle the decoding efficiently. Using Vulkan to decode video basically means writing a decoder in a special way so as to accelerate it on the GPU instead of the CPU, which makes it more efficient and faster than CPU decoding and allows older GPUs to decode newer formats. Of course it will always be slower and less efficient than the dedicated hardware module, but it is more flexible and can actually be updated.


Vulkan is primarily a GPU API, but it relatively recently gained a hardware-accelerated video API called Vulkan Video, which is the topic of this post. That it's part of Vulkan is perhaps because video decoders are often bundled with GPUs, though that's not always the case and I believe Vulkan Video can be implemented also where it's separate from the GPU (common on mobile).


No, it's only another way to access the same hardware decoder.


"built-in fixed function video unit", "hardware decoder" -- that just means the decoder of each supported codec is physically present as an ASIC, right?


Right.


>Correct me if I'm wrong

You're wrong. It's literally the first sentence of the article.


> You're wrong. It's literally the first sentence of the article.

To me, this just comes across as pointlessly condescending. There are faster, more direct ways to say the same thing, e.g.

> Yeah the first sentence says different


it's "literally" the first sentence if you actually know what it means


Looking forward to it replacing VAAPI, including for AV1.


why?


Pace of development, not depending on Intel to kick things off first, etc.


probably because vulkan is portable


Modern hardware wants dma buffers and command ring buffers to keep the hardware programming interface simple.

I saw AMD AV1 video encoding/decoding support going into Mesa's radv driver (probably due to the new AMD AV1 accelerators).

The interface must stay excruciatingly simple (plain and simple C), have no generators, etc.


I wonder if this will be a path towards crashless video decoding in web browsers*. Probably not though :(

* modern web browsers have incredibly complex behavior to try and detect whether your system's video codecs have crashed, since they can and will, and then if they crash too many times the system decoder is disabled and either replaced with a software decoder or nothing at all


Hardware decoders tend to be less resilient than software decoders, since they’re tuned for the most common case but fail at more esoteric scenarios.

So I doubt this would change anything to that effect.



