> On my older Intel NUC I can play 1080p H.264 (using mpv) hw-accelerated with 15% CPU load, or software decoded with 75% CPU load
These numbers are meaningless without measuring watt-hours used for the task.
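If you actually want that number, on Linux with an Intel CPU you can read the RAPL package energy counter through the powercap sysfs interface before and after a fixed playback run. A minimal sketch, assuming that interface exists and is readable on your machine, and using a placeholder clip name:

```python
#!/usr/bin/env python3
"""Rough energy measurement for a playback task via Intel RAPL.

Assumes Linux with the powercap sysfs interface (intel-rapl) readable;
the video file is a placeholder.
"""
import subprocess
import time

# Package 0 energy counter, in microjoules.
RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"

def read_energy_uj() -> int:
    with open(RAPL) as f:
        return int(f.read().strip())

def measure(cmd: list[str]) -> None:
    e0, t0 = read_energy_uj(), time.time()
    subprocess.run(cmd, check=True)
    e1, t1 = read_energy_uj(), time.time()
    # The counter wraps eventually; ignored here for short runs.
    joules = (e1 - e0) / 1e6
    print(f"{' '.join(cmd)}: {joules:.1f} J over {t1 - t0:.1f} s "
          f"({joules / 3600:.4f} Wh)")

if __name__ == "__main__":
    # Hardware vs. software decode of the same (placeholder) clip.
    measure(["mpv", "--no-audio", "--hwdec=auto", "clip-1080p.mp4"])
    measure(["mpv", "--no-audio", "--hwdec=no", "clip-1080p.mp4"])
```

The package counter covers the whole SoC (including the iGPU doing the decode), so run it on an otherwise idle machine and compare the two runs against each other rather than reading the absolute joules too literally.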
I was able to play 1080p H.264 video with hardware acceleration on an 8800 GS with an Athlon X2 5000 with about the same CPU utilization, back in 2008-2009. There was a special library (shareware) that enabled HW acceleration way before it was commonplace on integrated GPUs. I forget what it was called, but it was Nvidia/CUDA only.
That was 12+ years ago.
Obviously GPUs have become more efficient since then, but so have the CPUs. It also matters how efficiently the video stream was encoded. It's entirely possible that under certain encoding options, hardware decoding's advantages are almost entirely negated.
Also, there are "levels" of hardware acceleration - using CUDA (or any other shader-level acceleration) will always be less efficient than a dedicated hardware block.
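You can at least see whether you're getting a hardware path or a silent software fallback by forcing each decode path on the same file and comparing CPU time; whether a given backend lands on shaders or on a fixed-function block is a driver detail the tools won't spell out for you. A rough sketch, assuming an ffmpeg build with the VAAPI and CUDA hwaccels compiled in, and a placeholder file name:

```python
#!/usr/bin/env python3
"""Compare CPU time for decoding the same file with different decode paths.

Assumes Linux, ffmpeg on PATH with vaapi/cuda hwaccels compiled in;
'clip-1080p.mp4' is a placeholder.
"""
import resource
import subprocess

CLIP = "clip-1080p.mp4"

DECODE_PATHS = {
    "software": [],
    "vaapi": ["-hwaccel", "vaapi"],
    "cuda (nvdec)": ["-hwaccel", "cuda"],
}

def decode_cpu_seconds(extra: list[str]) -> float:
    """Decode CLIP to a null sink and return CPU seconds spent in the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(
        ["ffmpeg", "-v", "error", *extra, "-i", CLIP, "-f", "null", "-"],
        check=True,
    )
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return ((after.ru_utime - before.ru_utime)
            + (after.ru_stime - before.ru_stime))

if __name__ == "__main__":
    for name, extra in DECODE_PATHS.items():
        try:
            print(f"{name}: {decode_cpu_seconds(extra):.1f} CPU-seconds")
        except subprocess.CalledProcessError:
            print(f"{name}: not usable on this machine")
```

Note that ffmpeg will usually warn and fall back to software decoding when the requested hwaccel can't handle a stream, which is exactly the "skip the HW block" behaviour described next.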
And there are multiple steps in decoding a video. Some steps in some codecs fit different acceleration schemes better, so at one point it may not be worth the hardware cost to put the full decode pipeline in silicon; later, transistors get cheaper or new HW decode techniques are discovered, and more steps move into dedicated hardware blocks. Those hardware blocks may also have hard limits: if one can only cope with, say, 1080p60 at a certain profile level for a codec, asking for anything beyond that will likely skip the HW block entirely, because it's hard to do any kind of "hybrid" decode unless the hardware handles a whole pipeline step.
"HW Video Decode Acceleration" isn't a simple boolean.