The missing phrase in the post is "hardware acceleration", which is completely broken for VP8/VP9 (in Firefox & Chrome; IE & Safari don't support either codec).
I'm using an extension[0] to force youtube to serve me with h264 in order to be able to watch 4K videos & maintain reasonable battery life.
Yea, but the flip side of that is that hardware acceleration is a chicken-and-egg problem: Google's work here makes it more likely that we'll get hardware acceleration working in the future because there will now be a good reason to do it.
(They do mention "More than 20 device partners across the industry are launching products in 2015 and beyond using VP9.", so perhaps some of those will feature working hardware acceleration.)
Both Intel (integrated GPUs) and nVidia (discrete GPUs) have announced VP9 hw acceleration for their GPUs. However, I believe these are probably the only two "household name" ODMs that are supporting it. The rest of the partners are cheap SoC manufacturers like Rockchip and Mediatek.
FWIW, everyone and their dog (nVidia, AMD, ARM, Qualcomm, etc) announced support for VP8 when WebM launched in 2010 too (http://blog.webmproject.org/2010/05/introducing-webm-open-we...). Yet hardware-accelerated decode is still basically non-existent, and accelerated encoding doesn't exist at all, even though lots of companies did demos of hardware acceleration.
Part of the problem is that within 3 years Google had already basically dropped VP8 and moved onto VP9. That kind of churn doesn't go over well in the hardware space typically. For a variety of practical reasons, codecs aren't the type of thing you usually replace every few years. It takes more than a few years for really good quality optimized encoders (e.g. x264, LAME, etc) to be built.
This is why everyone should be skeptical about Google's foray into mobile video codecs. They have no "skin in the game". This isn't their core business like it is for companies such as Sony or Adobe. And we all know what happens to Google's non-core business products.
Just like they had no skin in the game with SPDY? The vast majority of their revenue stream comes from the web platform, and video is becoming more important to the success of any development platform as bandwidth gets cheaper. Unlike the other companies, they don't care about getting paid for the video technology itself, which would only slow down video adoption on the web. Unlike Apple and Microsoft (both perennial laggards on html5test.com), which have alternate platforms to push, Google does not want to slow down adoption of video on the web.
It is a fact that the longer a page takes to render, the fewer users stay on that page. That would be pretty damn important for a company that makes most of its revenue from web advertising. Which is why Google will continue to push ANY technology, e.g. SPDY, that makes web pages faster.
VP9 does not fit in with their goal of getting more eyeballs on ads, which is what makes it odd for them to push. There is no evidence whatsoever that H.264 has resulted, or will result, in a lack of adoption of video on the web. All evidence is to the contrary. In fact it is the strength of YouTube that has resulted in a lack of diversity in video sites.
If VP9 is just 5% better, the bandwidth savings alone will be substantial for YouTube. Also, ads that do not buffer are more acceptable than video ads that do.
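Back of the envelope, with every number below being a made-up placeholder purely for illustration (none of these are real YouTube figures):

    # every number here is a hypothetical placeholder, not real YouTube data
    hours_watched_per_year = 25e9      # hypothetical annual viewing hours
    avg_bitrate_mbps = 1.5             # hypothetical average video bitrate
    savings_fraction = 0.05            # "just 5% better"

    total_bits = hours_watched_per_year * 3600 * avg_bitrate_mbps * 1e6
    saved_pb = total_bits * savings_fraction / 8 / 1e15
    print(f"~{saved_pb:,.0f} PB of egress saved per year")

Even with toy inputs like these, the savings land on the order of hundreds of petabytes per year.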
> If VP9 is just 5% better, bandwidth saving itself for YouTube will be substantial
You should look at this from a business perspective: how substantial would those savings need to be to justify a billion-dollar scale investment?
VP9 codec development alone has been expensive and that's not even including the significant hardware engineering costs needed to make it competitive with the MPEG group's standards on all but the highest-end desktop computers.
YouTube makes money from people watching video, not from sales of any particular video codec. They're already using H.264 for everything and will need to add H.265 support for the same reason.
So if we look at this from the perspective of a business manager:
1. Make a significant investment making VP9 appealing enough to produce the widespread adoption needed to see significant bandwidth savings.
2. Use the same MPEG codecs for all visitors – dropping support for interoperability with Apple, Microsoft, etc. isn't likely so they're going to need an H.265 path either way. This has no significant upfront costs because they can use the same tools as everyone else and get the same bandwidth savings.
From that perspective, the question is really whether the HEVC license fees are going to be higher than the cost of funding VP9. There's some intangible value in having an alternative if the MPEG group gets greedy and it might help them negotiate more favorable rates, too, but I'm pretty sure none of that is enough to confidently say that Google's senior management won't consider cutting it the next time they need the right news for Wall Street.
On the contrary, Google spends a lot of resources to make the web faster; research has shown that faster web sites create better user engagement and satisfaction, and hence better revenue.
The question you need to ask is how much a video codec factors into Android's success. What would happen if they dropped VP8/VP9 entirely? Every single video on YouTube, Netflix, Amazon, etc. would still play. Nobody would encounter problems with video chat, sharing, etc., because those apps all need H.264 anyway, unless they want to serve only Android users with high-end CPUs.
At the end of the day, VP8/VP9 is interesting as a possible bandwidth optimization, or if you share certain patent/free-software positions. It's just hard to believe that this would decide even a single phone purchase, much less enough to significantly affect Android development priorities.
Do they actually have a good way to monetize android - are they selling phones? Last I checked they had a great search engine. How does the revenue model on mobile compare? Android is different on every oem so though it might be huge - is it really even a google product or more of a Google open source project?
> Do they actually have a good way to monetize android
As far as I understand, the initial motivation is to avoid being locked out. Suppose that iOS held 80% of the market, they could change the default engine, replace Google Maps with Apple Maps (like they did), etc. Given that smartphones and tablets are now commonly used to access the web, Google could have lost a serious chunk of traffic.
In other words, Google does not have to monetize Android directly, it exists to keep people in Google's ecosystem.
That makes no sense. Android is no different to Windows or OSX. It is irrelevant which codec it uses. It doesn't benefit the platform either way to use VP9, MPEG2, H.265 or any other codec.
Sure there are royalty costs for using H.265. But one could argue that (a) Android already costs money due to Microsoft licensing and (b) there are likely patent issues with VP9 just like there was with VP8.
Sure. But you also have a lot of factors working against you.
Basically no one but Google is particularly interested in VP9. Most of the key players in the IT industry are part of the H.26x camp. And when you have Apple who dominates mobile browsing not interested then content creators won't be interested. This flows onto component suppliers.
Nonsense. You can't really believe this. Apple only needs one video chip set provider. All the other suppliers need to get a piece of the Android pie, which requires them to support VP9.
This is amazing! Finally I'm able to watch YouTube without my Macbook initiating a take off sequence when something is available in resolutions over 1080p.
On OSX there is a noticeable difference in performance of video playback in Safari and Chrome, and I'm pretty sure it's all down to hardware accelerated decoding.
They've got hardware accelerated CSS3 working right in the latest versions, but VP8/VP9 won't work. The explanation is that Safari doesn't support VP8/VP9, so a fallback to h264 happens and the hardware accelerates that properly.
I've noticed hardware accelerated CSS3, much to my despair, as it triggers the discrete gpu. It makes everything freeze for a split second, draw the page, then switch back to the integrated gpu, again with a slight freeze. And if I'm really lucky, the whole process of switching gpus will just crash resulting in a forced reboot.
But I can't really blame Chrome for something that is out of their control. I recommend checking out gfx.io, that way you know what gpu is active. You can also lock it to internal only if you don't use external displays or anything that really requires the discrete gpu.
Yeah. In general, this is a big problem, not only for VP8/9, but also for h265 (also known as HVEC). My computer struggles to render 1080p HVEC video lag-free. Early adopters should recognize that there are advantages to using h264 that have nothing to do with the quality of the codec itself.
Minor correction: H.265 is HEVC - High Efficiency Video Coding.
H.265 and VP9 will (most likely) differ a lot when it comes to hardware encoders/decoders. I guess H.265 hardware decoders will be present on all platforms in the next couple of years. There are already mobile processors with a hardware H.265 decoder. Hardware H.265 encoders seem to be a bit further off; I am not sure whether anyone has a production-ready hardware encoder. On the other hand, VP9 is not a priority for most of the companies.
YouTube is only going to offer UHD over VP9, meaning that 4K smart TVs and set-top-boxes will have to support VP9. Google can also use their leverage over Android to influence mobile device makers. At $0.40 a pop, it makes sense for most SoCs to support it.
Netflix has said they are going to use H.265, but they could adopt the same strategy as Google. They could even force their desktop customers to install the VP9 codec, just as they did for Silverlight.
The primary problem is Apple, they simply won't support it. Thankfully, AppleTV hasn't caught fire, relegating their control of the market to the iPhone.
Daala will be more amenable to acceleration via generic GPUs. It probably won't match dedicated hardware but if a mobile device can decode it and the bandwidth savings are significant, the lack of licensing fees will make it a very attractive option.
Hopefully Daala will be significantly better than H.265 and win over Apple and others based on the merit of their codec alone.
I'm not sure the TV industry is about to invest in supporting a technology that will probably be long deprecated by the time most of their customers will even be able to use it. Fool me once etc
By "TV industry" I assume you mean the manufacturers of TVs. Since most of them support Android TV (which requires VP9), or want 4k Youtube as a ready source of content for their 4k TVs, and are often brands that already make Android phones, or re-use chips intended for those that do, I'd guess they'd have to do more work to avoid VP9 then to use it.
Daala is way more exciting technically as well. Of course, being more exciting is not equivalent to being better, but with such a novel approach (lapped transform), Daala may just shake up the state of the art, while the rest merely refine it.
>Why would anyone want more than 1080p on a phone?
Because 1080p on a 5 - 6 inch device creates visible pixels to the naked eye, and can be improved upon with a higher quality screen.
I have a 1440p 5.5" smartphone and the difference next to 720p is staggering and the difference next to 1080p is still noticeable to the untrained eye. The tests I use to demonstrate to people include well formed text display, comic-book display, and Unreal4 demo. People pick out the 1440p screen as best without much issue in every test.
I get that > 1080p makes sense for text and vector graphics. But really, what are you realistically going to watch on your phone that's been filmed with a 4K camera and optics that match that resolution? The fact that phones are shipping with 4k video capability does not mean the quality is better than the same camera shooting 1080p, especially when you take into account the limit on bandwidth in the encoder chip, so 1080p can be recorded at a higher bitrate.
I remember with previous size jumps, it gets to a certain point when you want to be able to decode 4k video, even if your display (or eyeballs) can't handle it, just because that's easier than transcoding the original file.
I'm not entirely convinced that the minor benefits from increasing resolution so much offset the cost in terms of battery life, especially on devices where screens are already the most power-hungry parts.
>But really, what are you realistically going to watch on your phone that's been filmed with a 4K camera and optics that match that resolution?
You seem to be avoiding the fact that the primary use case of smartphones includes images and text, not video.
You're right that video of sufficiently high quality to notice isn't readily available -- but who cares?
1440p makes the text under an app icon easier to read.
It makes webpages easier to read.
It makes "online magazines" crisper. It takes better advantage of a plethora of high resolution iconography and imagery designed to take advantage of "retina" this and "4k" that.
Sure, it may be a decade before we're streaming >1440p video on our devices, but higher-resolution screens making better text was a need ten years ago, not just today.
>>Why would anyone want more than 1080p on a phone?
>Did you miss the fact that we are discussing a video codec?
I apologize that you cannot follow basic thread context. I have provided the question that I answered for you so you can understand that the context of this thread wasn't artificially limited as you suggest -- (the question wasn't "with regards to video content only, why would anyone want >1080p"...)
Furthermore, I broadened the context explicitly by listing my 3 different tests (including video) that I based my answer off of. If you didn't want to use this context, you should not have replied to me, because I found these tests relevant to the larger question of why >1080p is useful and will become standard.
I've only held a galaxy note (2560x1440) once but it was pretty nice. Resolution is one specs race that I've always been fond of. When somebody finally finds a sasquatch you'll be glad for your 4k display
For the same reason that we want more than 640k RAM [1]? More seriously: Economics of cellphone screens are driving prices and specs of heads-up VR displays. Higher resolution and faster rendering help both -- plus benefits of mass-production.
> Why would anyone want more than 1080p on a phone?
The same reason that flagship phones now tend to have screens with resolution greater than 1920x1080. 1920x1080 isn't the highest useful resolution at the size of many of today's phones.
This has nothing to do with merits and everything to do with big-business politics. That said, in this case the best-quality codec, i.e. H.265, is likely to win out pretty comfortably.
I think many thought this a year ago, but it's not the case any more. All of the major SoC vendors are shipping VP9 in their newest system-on-chips. In addition, a new licensing pool for HEVC just formed, and has yet to announce their licensing model or cost [1]. This is all the more reason for companies to look for alternatives.
Anime groups are playing with it. Anime fansub groups, in general, are hungry for the absolute best in codec technology. Hardware and media player compatibility be damned. An example: their use of 10-bit color.
There are some groups in the anime scene whose sole purpose is to take releases from other groups and convert them into standard h.264 video that can play on basically any device.
I'm a colorist. I spend all day looking at color intensely with very expensive monitors. It makes me really excited to see people who actually care about color reproduction over things like resolution.
With that said, I have to ask why these groups are interested in 10-bit when I'm essentially certain they cannot view in 10-bit. Only workstation GPUs (Quadro and FirePro) output 10-bit (consumer GPUs are intentionally crippled to 8-bit) and I can't really think of any monitors that have 10-bit panels under about $1000 (though there are many with 10-bit processing, which is nice but doesn't get you to 10-bit monitoring). There are some output boxes intended for video production that can get around the GPU problem, but by the time you've got a full 10-bit environment, you're at $1500 bare minimum, which seems excessive for most consumer consumption.
So I guess what I'm asking, are these groups interested in having 10-bit because it's better and more desirable (and a placebo quality effect) or are they actively watching these in full 10-bit environments?
It's worse than that; the input itself is only 8-bit and they scale up, and then back down again to 8-bit on the output side.
But it works, for the same reason that audio engineers use 64-bit signal pipelines inside their DAW even though nearly all output equipment (and much input equipment) is 16-bit which is already at the limit of human perception.
If you have a 16-bit signal path, then every device on the signal path gets 16 bits of input and 16-bits of output. So every device in the path rounds to the nearest 16-bit value, which has an error of +/- 0.5 per device.
However if you do a lot of those back to back they accumulate. If you have 32 steps in your signal path and each is +/- 0.5 then the total is +/- 16. Ideally some of them will cancel out, but in the worst case it's actually off by 16. "off by 16" is equivalent to "off by lg2(16)=4 bits". So now you don't have 16-bit audio, you have 12-bit audio, because 4 of the bits are junk. And 12-bit audio is no longer outside the limit of human perception.
Instead if you do all the math at 64-bit you still have 4 bits of error but they're way over in bits 60-64 where nobody can ever hear them. Then you chop down to 16-bit at the very end and the quality is better. You can have a suuuuuuuper long signal path that accumulates 16 or 32 or 48 bits of error and nobody notices because you still have 16 good bits.
tl;dr rounding errors accumulate inside the encoder
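A toy numpy sketch of the effect, under some hand-wavy assumptions (32 gain stages whose product is unity, requantizing to 16-bit steps after every stage versus carrying full precision and quantizing only once at the end):

    import numpy as np

    x = np.sin(np.linspace(0, 200 * np.pi, 100_000))  # clean source signal in [-1, 1]
    stages = [1.25, 0.8] * 16                          # 32 gain stages whose net gain is 1.0

    def q16(v):                                        # quantize to 16-bit steps
        return np.round(v * 32767) / 32767

    a = q16(x)                                         # path A: requantize after every stage
    b = x.copy()                                       # path B: keep full precision throughout
    for g in stages:
        a = q16(a * g)
        b = b * g
    b = q16(b)                                         # quantize once, at the very end

    lsb = 1 / 32767
    print("max error vs ideal, 16-bit chain:", np.abs(a - x).max() / lsb, "LSBs")
    print("max error vs ideal, float chain :", np.abs(b - x).max() / lsb, "LSBs")

With these particular stages the requantized path drifts a few LSBs away from the ideal output, while the full-precision path stays within the final half-LSB of quantization.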
> 16-bit which is already at the limit of human perception
Nitpick: 16-bit fixed point is not at the limit of human perception. It's close, but I think 18-bit is required for fixed point. Floating point is a different issue.
> If you have 32 steps in your signal path and each is +/- 0.5 then the total is +/- 16.
Uncorrelated error doesn't accumulate like that. It accumulates as RSS (root of sum of squares). So, sqrt(32 * (.5 * .5)) which is about 2.82 (about 1-2 bits).
> You can have a suuuuuuuper long signal path that accumulates 16 or 32 or 48 bits of error and nobody notices because you still have 16 good bits.
Generally the thing which causes audible errors are effects like reverb, delay, phasors, compressors, etc. These are non-linear effects and consequently error can multiply and wind up in the audible range. Because error accumulates as RSS, it's really hard to get error to additively appear in the audible range.
tl;dr recording engineers like to play with non-linear effects which can eat up all your bits
> > 16-bit which is already at the limit of human perception
> Nitpick: 16-bit fixed point is not at the limit of human perception. It's close, but I think 18-bit is required for fixed point. Floating point is a different issue.
16-bit with shaped dither should be good enough to cover human perception.
This is the same reason we try to keep post-production workflows at 10-bit or better (my programs are all 32-bit floating point). A lot of cameras are capable of 16-bit for internal processing but are limited to 8- or 10-bit for encoding (outside some raw solutions). An ideal workflow is that raw codec (though it's often a 10-bit file instead of raw) going straight to color (me working at 32), and then I deliver at 10-bit, from which 8-bit final delivery files (outside of theatrical releases, which work off 16-bit files and incidentally use 24-bit for the audio) are generated. So all that makes sense to me.
I was mostly curious why people were converting what I assume are 8-bit files into 10-bit. The responses below about the bandwidth savings and/or quality increase in that final compressed version seem to be what I was missing!
> With that said, I have to ask why these groups are interested in 10-bit when I'm essentially certain they cannot view in 10-bit.
The H.264 prediction loop includes some inherent rounding biases and noise injection. Using 10-bit instead of 8-bit reduces the injected noise by 12 dB, which makes a noticeable difference in quality-per-(encoded)-bit, even when the source and display are both 8-bit. I spent some time talking with Google engineers early on in the VP9 design process, and they commented that they had done some experiments, but did not see similar gains by using higher bit depths with VP9. I don't know if that's still true with the final VP9 design, though.
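(For reference, the 12 dB figure is just the usual ~6 dB of quantization SNR per bit of precision, so two extra bits:)

    import math
    # quantization noise drops by ~20*log10(2) ≈ 6.02 dB per extra bit of precision
    print(2 * 20 * math.log10(2))  # ≈ 12.04 dB for two extra bits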
I am far from an expert on the matter, but I recall the case was made that acceptable color reproduction could be achieved with lower bitrates by using 10-bit encoding. It was about file size; I don't think anyone in the fansubbing community expected people to have professional monitors.
>With that said, I have to ask why these groups are interested in 10-bit
I can explain that - I wrote an extensive post on this matter a year back, so I'll just reuse that with a bit of tweaking. So with that clear, let's talk about the medium we're working with a bit first.
Banding is the most common issue with anime. Smooth color surfaces are aplenty, and consumer products (DVDs/BDs) made by "professionals" have a long history of terrible mastering (and then there's companies like QTEC that take terrible mastering to eleven with their ridiculous overfiltering). As such, the fansubbing scene has a long history with video processing in an effort to increase the perceived quality by fixing the various source issues.
This naturally includes debanding. However, due to the large smooth color surfaces, you pretty much always need to use dithering in order to have truly smooth-looking gradients in 8-bit. And since dithering is essentially noise to the encoder, preserving fine dither and not having the H.264 encoder introduce additional banding at the encoding stage meant that you'd have to throw a lot of extra bitrate at it. But we're talking about digital download end products here, with bitrates usually varying between 1-4 Mbps for TV 720p stuff and 2-12 Mbps for BD 720p/1080p stuff, not encodes for Blu-ray discs where the video bitrate is around 30-40 Mbps.
Because of the whole "digital download end products" thing, banding was still the most common issue with anime encodes back when everyone was doing 8-bit video, and people did a whole bunch of tricks to try to minimize it, like overlaying masked static grain on top of the video (a trick I used to use myself, and incidentally is something I've later seen used in professional BDs as well - though they seem to have forgotten to properly deband it first). These tricks worked to a degree, but usually came with a cost in picture quality (not everyone liked the look of the overlaid static grain, for example). Alternatively, the videos just had banding, and that was it.
Over the years, our video processing tools got increasingly sophisticated. Nowadays the most used debanding solutions all work in 16-bit, and you can do a whole bunch of other filtering in 16-bit too. Which is nice and all, but ultimately, you had to dither it down to 8-bit and encode it, at which point you ran into the issue of gradient preservation once again.
Enter 10-bit encoding: With the extra two bits per channel, encoding smooth gradients suddenly got a lot easier. You could pass the 16-bit debanded video to the encoder and get nice and smooth gradients at much lower bitrates than what you'd need to have smooth dithered gradients with 8-bit. With the increased precision, truncation errors are also reduced and compression efficiency is increased (despite the extra two bits), so ultimately, if you're encoding at the same bitrate and settings using 8-bit and 10-bit, the latter will give you smoother gradients and more detail, and you don't really need to do any kind of processing tricks to preserve gradients anymore. Which is pretty great!
Now, obviously most people don't have 10-bit screens or so, so dithering the video down to 8-bit is still required at some point. However, with 10-bit, this job is moved from the encoder to the end-user, which is a much nicer scenario, since you don't need to throw a ton of bitrate at preserving the dithering in the actual encode anymore. The end result is that the video looks like such an encode on an 8-bit (or lower) screen, but without the whole "ton of bitrate" actually being required.
So the bottom line is that even with 8-bit sources and 8-bit (or lower) consumer displays, 10-bit encoding provides notable benefits, especially for anime. And since anime encoders generally don't give a toss about hardware decoder compatibility (because hardware players are generally terrible with the advanced subtitles that fansubbers have used for a long time), there really was no reason not to switch.
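A toy numpy illustration of the gradient point (quantizing a very shallow gradient three ways and measuring how wide the resulting bands get; the numbers are arbitrary, it's just to show the shape of the problem):

    import numpy as np

    rng = np.random.default_rng(0)
    grad = np.linspace(0.40, 0.44, 1920)             # a very shallow gradient (think dark sky)

    def quantize(v, bits, dither=False):
        levels = 2 ** bits - 1
        v = v * levels
        if dither:
            v = v + rng.uniform(-0.5, 0.5, v.shape)  # crude dither noise before rounding
        return np.round(v) / levels

    def widest_band(v):
        # width (in pixels) of the longest run of identical adjacent values
        edges = np.flatnonzero(np.diff(v) != 0)
        runs = np.diff(np.concatenate(([-1], edges, [v.size - 1])))
        return int(runs.max())

    print("8-bit, no dither :", widest_band(quantize(grad, 8)), "px bands")
    print("8-bit, dithered  :", widest_band(quantize(grad, 8, dither=True)), "px bands")
    print("10-bit, no dither:", widest_band(quantize(grad, 10)), "px bands")

With these toy numbers the un-dithered 8-bit gradient shows bands nearly 200 pixels wide, dithering breaks them up into fine noise, and 10-bit shrinks them to roughly a quarter of the width before any dithering at all.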
Most consumer sources are also limited to 8-bit precision. 10-bit encodes from these are made using a fancy temporal debanding filter, which only works in certain cases.
Keep in mind that there is a range expansion when converting to monitor sRGB. Also several video players now support dithering. So the debanding effect can often be quite visible.
Also, x264 benefits from less noise in its references, which means 10-bit gets a small compression performance bump.
Didn't know anime enthusiasts have an interest in 10-bit video, but I've always thought it's a great shame most consumer camera codecs are limited to 8 bits.
I have been waiting for h265 to surface in consumer cameras, and hopefully at 10 and 12 bit depths as options. Even if current monitors don't do 10 bits, there is so much information you can pull out of the extra data.
Many camera sensor chips can output 10 or 12 bits, it's a shame it doesn't get recorded on most cameras.
Hopefully Rec2020 on TVs and new Blu-ray formats will also push cameras.
Well, this was not the first time they adopted something early and left most of the world behind: widespread adoption of the ogm and then the mkv containers, followed by the complete swap to h264 over XviD, and I believe abandoning the "must fit either onto a CD or DVD exactly" ripping and encoding rule was also a first.
> I'm essentially certain they cannot view in 10-bit. Only workstation GPU's (Quadro and FirePro) output 10-bit (consumer GPU's are intentionally crippled to 8-bit)
That's not what NVidia says:
"NVIDIA Geforce graphics cards have offered 10-bit per color out to a full screen Direct X surface since the Geforce 200 series GPUs"
I can't find any details on the Crossover, but the Achieva is an 8bit panel with FRC to attain effective 10-bit. It's not true 10 and I wouldn't recommend using it in a professional setting (benchmark being a true 10bit panel with usually 12 bit processing such as Flanders Scientific http://www.flandersscientific.com/index/cm171.php).
And while NVIDIA does output 10bit on GeForce, it's limited to
> full screen Direct X surface
which means no OpenGL (a must with pro apps). I suppose it might be possible to use a DirectX decoder in your video player to output 10bit video but I haven't heard anyone doing it or tried myself.
All that said, I was originally talking about consumer facing 10bit, so the monitors are probably a valid point. As someone who cares about color reproduction, I hope to see more of that - especially as we move towards rec2020.
> This will only disable webm VP8 + VP9, forcing FF to fallback to H264.
I'm glad you edited your comment but I'm still not sure what you disagree about. SoapSeller was talking about hardware acceleration and in that context disabling WebM globally will provide better performance/battery-life.
The only thing I was contributing to the discussion was the note that on Firefox this does not require an extension.
Correct. If your goal is the improved performance and battery-life, that's probably acceptable but it's definitely something to think about if you frequent sites which don't support H.264.
It's a tradeoff between bandwidth and CPU. That's why the post emphasizes things like upgrading from 240 to 360p video where CPU is not the bottleneck.
I'm having this problem, sometimes also outside the browser, and I would like to know: how do you figure out whether playback is using hardware acceleration at all? It's not an easy thing to debug.
First, for YouTube: right-click -> Stats for nerds. Watch the "Dropped frames" counter; with hw acceleration it usually stays close to zero ("-").
For YouTube & everything else, you've got two major tools:
1. CPU usage: just watch it; if you don't see spikes while playing video, you're good (on a quad-core 4th-gen Intel Core I get around a 7% increase in load for HW-accelerated YouTube 1080p video).
2. Even better: tools like GPU-Z[0] have "Video Engine Load" meters for some GPUs (e.g. Nvidia GTX), where you can see if the GPU's dedicated accelerator is working (and how much). For other GPUs (e.g. Intel HD) you can monitor the general "GPU Load"; you will see an increase there instead.
P.S. In addition to the above, in some browsers (in Chrome, via chrome://flags) and media players you can disable hardware acceleration support and observe the results using the tools I've mentioned.
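If you'd rather script option 1 than eyeball a task manager, a rough sketch along these lines works (assumes the psutil package; sample once with acceleration on, once with it off, and compare):

    import psutil  # pip install psutil (assumed available)

    # Sample overall CPU load once a second while the video is playing;
    # repeat with hardware acceleration toggled off and compare the averages.
    samples = [psutil.cpu_percent(interval=1) for _ in range(30)]
    print(f"avg CPU over 30s: {sum(samples) / len(samples):.1f}%")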
Where are the hardware encoders and acquisition devices besides Red? Even Android phones are not recording to VP9. Truthfully, I do not see the entertainment/broadcast industry moving to VP9. Most places are archiving in a high-bitrate intraframe codec and using H.264 as a proxy. Satellite is moving towards HEVC to save bandwidth. I don't see production houses flipping the content again to VP9. The only way to get traction is to get more hardware vendors on board to acquire content natively in VP9. Once there is content, more hardware decoding and software editing platform support will follow. Hopefully they will have some big announcements next week at NAB.
Oh, oh. Is that why YouTube videos have been spinning up my CPU fan lately? I'll have to try h264ify and see if that helps things. (I'm running Firefox on Arch Linux x86_64.)
Which hardware do you have? I'm on a Sandybridge Intel and I haven't managed to get satisfactory hardware acceleration on Arch + Firefox. Chromium manages to run any kind of video, including 60FPS 1080p without any lag, while on Firefox I get either audio or video lag. No idea why this happens.
I just tried h264ify and I don't notice any improvement on Firefox. Chrome CPU usage seems to be a little lower.
Chromium: I get around 140% CPU usage, spread on 3 processes, video smooth without dropped frames.
Firefox with MSE+HTML5 video: Goes from 30% to 170% CPU, on single process, but the video gets stuck for 3 seconds every second of play time. Totally unwatchable, unless I drop down to 30 fps.
is it possible to mimic hardware decoding by passing the stream through an intermediate software layer that uses OpenCL or CUDA for the hardware assist?
Your link is just to a software layer that takes advantage of any hardware present on the system i.e. on an intel system with vp8 hardware decode it'll use that, if not then it'll use software.
edit: on a second read I see your link intends to create a GPU accelerated VP8 decoder backend for those libraries, not just create the interface layer.
Is it a firefox/chrome problem, or a hardware/driver issue?
To my knowledge, WebM hardware acceleration designs have only been available to hardware manufacturers since around 2011. I also could not find an open Mozilla bug ticket regarding hardware acceleration.
Hardware/DSP/driver issue. Nokia was on the complaining side when others tried to get VP8 assigned as the video tag default codec, because they had multiple products in circulation that only supported H264 decoding in hardware.
> missing phrase in the post is "hardware acceleration"
Well, yeah, but they focus first on the significance of this for the 240p to 480p crowd for which that is less relevant.
Touting impressive-sounding and badly-needed improvements to an activity many people engage in, and evangelizing, Google is taking advantage of what may be a position to influence hardware, to enable, for example, the 80% of Indian YouTube users who can only watch 240p clips to watch 480p now, per this Google math.
And those who can only watch 480p can instead enjoy silky-smooth 720p, while those who just cannot get enough pixels and want 4K but don't have Google Fiber just yet can now enjoy VP9 WebM 4K video on their Nexus 6 without excessive battery drain. Others with 4K displays could do the same if they had some HW acceleration, a point which, while not written in the article, is visible to someone like you, to a hardware engineer at Apple, and to the codec whizzes on the Surface and Windows Phone teams at Microsoft.
For all you enthusiasts, Google says? Here's your ffmpeg how-to link, right in this blog post. No need to worry about patents either, y'all! And this VP9 vs H264 side-by-side is something to write home about, ain't it.
Note that VP9 users are not quite edge cases, as they claim 25 billion hours of youtube was watched in the trailing twelve months. Not sure how many unique visitor sessions that is but that's a big number.
So with the blog post proclaiming this magical, newsworthy math (most relevant to that list of countries I'm glad I don't live in), plus Google running the YouTube show, plus Google's web properties, plus Chrome and so forth, they can add some pressure on the hardware guys, the gods of official standards, and of course the foot-draggers that make other browsers, to help Google help everyone else make the web faster.
> Where can I use VP9?
> Thanks to our device partners, VP9 decoding support is available today in the Chrome web browser, in Android devices like the Samsung Galaxy S6, and in TVs and game consoles from Sony, LG, Sharp, and more. More than 20 device partners across the industry are launching products in 2015 and beyond using VP9.
The Nexus 6 doesn't support hw acceleration of VP9? So you likely can't watch 4K VP9 without stuttering, and if you can, it probably isn't without excessive battery drain.
For people with good vision, a 4K 6" screen held 18" from the eyes will look better than a 1080p screen, all other factors being equivalent. Held further than about 18", 4K won't make a difference for most people.
For a 60" TV that sits 10 feet away, you're not going to see much difference. Move that TV to 4 feet away, and you'll appreciate the extra pixels.
For virtual reality, and other applications that put the screen much closer to your eyes, the extra pixels will make a huge difference.
I completely agree if you talk about virtual reality, but here we're talking about VP9 or HEVC compressed video, which will also need to be resampled as the display isn't 4k but something in between 1080p and 4k.
At 20/20 vision, we can resolve ~300 pixels/inch at 10-12 inches from the eye. [1]
For a 6" screen, that works out to ~1800 pixels. So maybe not 4k, but the next step down from 4k generally is 1080p, which is less than the figure I gave above.
So yes, there is a point.
(And that's not getting into other factors either, such as panning and stroboscopic effects.)
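The rough arithmetic behind those numbers, using the usual one-arcminute rule of thumb for 20/20 acuity (an approximation, not exact optics):

    import math

    def ppi_limit(viewing_distance_inches, arcminutes=1.0):
        # pixel density at which one pixel subtends the given visual angle
        return 1.0 / (viewing_distance_inches * math.tan(math.radians(arcminutes / 60.0)))

    for d in (10, 12, 18):
        print(f'{d}" away: ~{ppi_limit(d):.0f} ppi before individual pixels blur together')

That lands around the ~300 ppi figure at 10-12 inches and drops to roughly 190 ppi at 18 inches.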
Though one thing I will still mention is panning. While you're panning, at sufficiently high framerates you still effectively have 1 pixel for every 2. (Look at what happens when you draw a one-pixel-width horizontal line. In general, with a non-integer position, it'll take up 2 pixels of height, in some proportion.)
Still not an argument for quite as high as 4k though.
What's the point of the Nexus 6 having 1440 vertical pixels in landscape mode? Nexus 6's 1440 pixels is a third of the way between 1080p and 4k. So, presumably, scaled down 4k video would look better on it than scaled up 1080p video when the device is held closely.
A very high pixel density makes a real difference when you're looking at printed text and, in general, images with very sharp lines.
That's not what VP9 or HEVC are good for anyway, so I would bet that scaling down 4k video or scaling up 1080p (for the same bitrate) won't make any noticeable difference.
But of course that's just a matter of opinion unless we do a blind test with real use cases.
The Qualcomm Snapdragon 805 in the Nexus 6 fully supports VP9 hardware acceleration. Though I'm not sure about the benefits of 4K video on a 6 inch screen.
I swear last I checked the 805 only supported hw accel. VP9 up to 1080 not 4K. The Qualcomm spec sheet says it supports up to 4K but reports out in the wild suggest that is "best case" and it doesn't really work beyond 1080.
The VP8 patent pool threat never materialized, but Google secured a license anyway to get other companies to stop complaining about MPEG-LA. A deal is in place for both VP8 and VP9, not necessarily because it's needed, but because MPEG-LA wouldn't stop libeling VPx.
I'm not sure how YouTube estimates the proper video quality for your internet connection, but they're definitely doing it wrong. I have a 150 Mbps connection and I can watch Full HD videos without buffering. This was already the case on my old 16k connection but in both cases YouTube defaults to 480p. This is really annoying and I always end up setting it to the highest quality.
Another problem seems to be YouTube's player, which sometimes just stops playing even though the video is completely loaded, or you try to skip or go back a few seconds and it just gets stuck completely.
Luckily I found this Firefox extension that just uses the native player and allows me to use high quality by default:
I find this to be a worse user experience. I would rather wait the 5-10 seconds for the high quality stream to buffer than have every video look really crappy for the first minute, then slowly get better. I guess I'd prefer a consistent experience overall. I stopped watching Netflix on my TiVo and switched to only watching it on my AppleTV because of issues like this.
The results of this study can be explained by 2 plausible hypotheses:
* Users have been trained to expect that if a random site spins the beach ball for a few seconds it's unlikely to get better. Akamai customers are a mixed set of smaller web sites, and these often have an "eternal progress meter" as failure mode of video players.
* Viewers in the Akamai mix are impatient because content includes a lot of emailed links, and ancillary (eg news article related) content
Neither of these would apply to Netflix and Youtube
So would I, but my guess is that, like me, you were stuck with dialup long ago. It drives me crazy when I can't pause a video to let it buffer anymore, but knowing at least one other person like me might help. :)
Apple TV does the same thing, except for some reason it starts high, and then drops down if necessary. During heavily congested times of day on Comcast, I would find myself starting at 1080p and then two seconds in freeze and drop down significantly.
And thanks to dash, even when you seek back to the beginning, it won’t reload it properly. So either the video just breaks, or the first 10 seconds are forced in 360p. Or, sometimes, it just breaks, but clicking ten times on the 0-mark somehow lets it play again.
Too bad they can't get rid of DASH, because DASH is the real problem behind buffering, jerky playback and other nonsense. I really strongly dislike how bad the video playback experience is and has been for quite some time on YouTube.
It completely breaks fast forward and rewind. Why do they even leave the playback indicator interactive if it's going to cause the video to re-buffer every time you even look at it funny?
I watch a lot of youtube these days and when I'm sub'ed to a channel, I shouldn't have to deal with buffering just to skip excessively long intros that I've seen for the 50th time.
I agree it's annoying. In its effort to save seconds of bandwidth from each video, Google has made everyone's Youtube experience worse.
Fine, don't load the whole video ahead of time, but can you just load the first say 30 seconds? Or maybe even - and I know this is going to sound crazy - a whole minute ahead of time?
I remember a time when it seemingly didn't load even 5 seconds ahead of time. That's just ridiculous. Fire the bean counters who are in charge of that short-sighted idea.
I remember a time when the flash player loaded the entire video to a temp file and you could just go through /proc/$firefox-pid/fd, find it and dump the video that way.
The behavior is different for different resolution videos I think. Instead of seconds, I think they download x kilobytes ahead of time. Just try 360p vs 1080p (for same video).
It's a limit on how much it buffers while playing. It does not improve the playback experience in any way, and causes significant degradation when you seek the video.
DASH is purely an XML format for describing server-side content. The issues you are describing have nothing to do with DASH and are due to implementation decisions in both the JS player and the underlying MSE object.
I am not sure why anything uses DASH, but the problem isn't just limited to it. HTTP video streaming is quite hard to do well; a lot of clients struggle[0]
If it interests anyone here, I have written a Firefox extension that forces YouTube videos to play in an external video player (mpv in my case). You can get it here (only for Linux):
HA, I did a similar thing for Opera 12.xx on Windows with mplayer :-)
I see you are using youtube-dl. I did too at first, but quickly swapped it for the YouTubeCenter plugin - it extracts the mp4 link directly in the JavaScript; my additional JavaScript grabs that and replaces the YT player window with a big PLAY button linking to the custom protocol "magnet1:https://r2---sn-2apm-f5fee.googlevideo.com/videoplayback?sou....
Copying the way browsers open magnet links was the quickest and easiest way I found of passing links directly to a third party program. I simply made a new one, called it magnet1, and directed it towards \AppData\Roaming\smplayer\mplayer.bat %1
That .bat cuts the first 9 characters (magnet1:) and runs mplayer; that, again, was the only way I found of starting a new program without keeping a command-line window open. No delay between clicking and starting the video, no unnecessary command prompt window with youtube-dl running. I also have a second custom protocol that starts yet another bat file sending the link directly to my TV - one click on a page and the YT clip starts playing on the TV in the next room :o)
Feels like duct tape and glue, but works flawlessly :)
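(For the curious, the handler behind such a custom protocol really is just a few lines; a hypothetical Python equivalent of that .bat, with the player binary and protocol name being whatever you registered:)

    import subprocess
    import sys

    # Invoked by the OS as the handler for the custom "magnet1:" protocol:
    # strip the prefix and hand the real URL straight to the player.
    url = sys.argv[1]
    if url.startswith("magnet1:"):
        url = url[len("magnet1:"):]
    subprocess.Popen(["mpv", url])  # or mplayer/smplayer; player assumed to be on PATH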
I was using youtube-dl, but no longer. Now I just pass the URL directly into mpv, which has its own youtube-dl hook.
Side note, oh boy do you need to get some Linux in your life. That's the hackiest thing I've ever seen on Windows for something that'd take minutes to implement on Linux.
That still takes time to refetch the YT page and parse it for the mp4 link. Smplayer can do it too, but it takes ~1 second, and that's ~1 second too long for me :P. If I wasn't so lazy I would find the YouTubeCenter code responsible for this parsing, but I am, so I simply
Got plenty of Linux in my life (mostly embedded, for example running my DIY TV built from a recycled 32" LCD panel, a universal Chinese LCD controller and a rasPee. Yes, I like repurposing junk). Linux is how I learned to cobble together the ugliest hacks imaginable.
Watching YouTube using Chrome on Windows is the double battery killer. Chrome doesn't let Windows go into low power states and YouTube is serving up video that can't be hardware decoded.
Firefox supports VP9 but currently does not enable VP9 with MediaSourceExtensions (MSE) by default. You can enable the current implementation via a flag: set media.mediasource.webm.enabled to true in about:config.
YouTube only uses VP9 with MSE, but even then not all content is available as VP9; some is only available as H.264/MSE.
Thanks. I did all of that, and I can see that all the formats are checked in my Firefox now. The only question now I have is, how do I know which of these formats is my YouTube video currently playing in? (It does show the HTML5 player now).
FWIW after enabling MSE, playing a 4k video, and opening 'stats for nerds' and switching away and back to firefox (Iceweasel 31.6.0) on Debian jessie the browser locks up at 100% cpu usage.
Thanks for the info. After visiting the Youtube HTML5 support page and clicking "Request the HTML 5 player", and enabling the setting in FF, the VP9 playback is working in FF.
This still doesn't do anything about YouTube's horrible bitrate settings and "resolution is everything" attitude. 4K videos play better on my 1080p screen than 1080p videos, which should absolutely not be the case with reasonable, non-skimpy bitrate settings.
Since so much of youtube consumption is audio-only, why not have a codec optimized to show nothing, or a still picture? Here at work we used to remote into a machine just to play youtube songs over the speakers. No video needed.
Video codecs are good at compressing stills, so there's very little overhead in playing such a video when only the audio is desired.
That being said, YouTube does also encode audio-only versions of all their videos; they're not available from the web player though. You can use external software such as youtube-dl[1] to take advantage of it.
[1]: https://github.com/rg3/youtube-dl/ (using an invocation such as "youtube-dl -g -f bestaudio [url]" to retrieve the URL of an audio-only encode)
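If you'd rather not shell out, youtube-dl also works as a Python library; a rough sketch (the youtube_dl module and a placeholder video URL are assumed):

    import youtube_dl  # pip install youtube-dl (assumed available)

    opts = {"format": "bestaudio", "quiet": True}
    with youtube_dl.YoutubeDL(opts) as ydl:
        # download=False just resolves the chosen format and returns its metadata,
        # including the direct URL of the audio-only stream
        info = ydl.extract_info("https://www.youtube.com/watch?v=VIDEO_ID", download=False)
        print(info["url"])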
I don’t think that audio-only version comes from Youtube. I think youtube-dl downloads the whole video, then uses ffmpeg to extract and possibly re-encode its audio channel.
No, YouTube does have audio-only streams. You can use "youtube-dl -F [url]" to view the available formats and their codes.
They're meant to be paired with the video-only streams used for some resolutions (most notably 1080p). Even browser extensions such as YouTube Center can download the streams, and they certainly don't use ffmpeg.
(In fact, I do use ffmpeg to combine audio-only and video-only streams via youtube-dl, e.g. "youtube-dl -f 137+141 [url]" which takes 1080p video and 256k aac audio.)
I think they have done this for background playing on android mobile/google play. Perhaps a port is in order, or I do not fully understand the technology.
It would be nice if they had an actual video comparison; I can make either of two codecs that are even close in performance look way better with single frame-grabs.
I am skeptical of the quality claims without evidence. Previous third-party studies on VP8 and VP9 have said that both H.264 and HEVC (formerly H.265) outperform VP9 on video quality and encoding speed[1].
Table VI. Encoding Run Times for Equal PSNR_{YUV}, HEVC vs. VP9 (in %): 735.2
I.e., HM is over 7x slower than VP9's slowest settings. VP9 speeds have improved dramatically since then, and it is now being used real-time (Google is working on adding it to WebRTC), while still demonstrating significant quality improvements over prior generations.
In my experience h265 and vp9 are fairly close in encoding speeds (both run at less than 0.1x real time).
I would strongly argue against the idea that h264 can ever outperform vp9 in video quality (given the same number of bits); however, hardware/browser support will decide whether vp9 is adopted or not.
It's still H.265. Like past standards, HEVC/H.265 is a joint effort between ISO's MPEG and the ITU's VCEG. They're different identifiers for different organizations.
It's early days though (e.g. only PSNR for now), but the basic idea of repeatable, open-source codec testing - so that if someone complains about any aspect they can fork, show the difference, and hopefully get it rolled back into future versions - is very cool.
I haven't been following codecs for about 5 years now, but objective metrics were mediocre at best for comparing codecs the last time I was into it. The issues with PSNR are myriad and well documented.
I've always thought PSNR gets a bad rap. It's not that PSNR isn't a very blunt tool; it's just that every other tool in its class is pretty blunt too.
I mean, it compares video codecs by treating video as a series of independent still images. That right there is crazy, but the same basic methodology applies to most of the alternatives, even the crazy obscure ones that no one really uses because they're too new or too processor-intensive.
The Mozilla/Daala team have written a lot about these topics in regards to their work on a) Daala, b) netvc (new IETF codec project just getting started), c) evaluating improvements to JPEG for MozJPEG, d) evaluating WebP
In the end, like unit tests, performance benchmarks, static analysis or various other software development tools, they're useful if you use them wisely and know their limitations, and dangerous if you abuse them or treat them as if they are magical.
But crappy tools that can be easily automated fill an important part of the toolbox, and I feel PSNR has its place. I'm deeply suspicious of anyone who looks down their nose at PSNR because they've just discovered SSIM, for example, which seems to be a common sentiment. They're both just crappy tools that can be used for good or ill if you know what you're doing; running them both (and others too) might help catch more bugs than either alone, and if you're automating, then why the heck not?
Okay, that sums up how things were about 5 years ago; PSNR and friends can catch some problems, but we still don't have great objective metrics for video quality.
[edit]
Some well-known, but hard to measure objectively psychovisual effects are that some kinds of noise are well tolerated in areas of high detail, and very large loss of detail can be tolerated in high-motion areas.
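(For anyone who hasn't looked at it, per-frame PSNR really is as blunt as it sounds; a toy version, assuming two 8-bit frames as numpy arrays:)

    import numpy as np

    def psnr(reference, distorted, peak=255.0):
        # classic per-frame PSNR: treats the frame as a bag of independent samples,
        # which is exactly why it misses most psychovisual effects
        ref = reference.astype(np.float64)
        dis = distorted.astype(np.float64)
        mse = np.mean((ref - dis) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)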
Both of those are much older encoders, of course (nearly 2 years old in the case of the PCS paper), and some of the settings they used were questionable, e.g., what HEVC calls "constant QP" actually varies the quantizer based on the frame type, while VP9 really uses the same quantizer, which can make a big difference on metrics. Talking to the authors at PCS, they re-ran results later with more relaxed QP settings, and VP9 got somewhat better, but still didn't catch up with HEVC (on metrics).
Keep in mind all of these results are from people who have spent significant time working in MPEG/ITU/JVT/JCTVC, and may have some inherent biases. Google's own results looked much better: http://ieeexplore.ieee.org/iel7/6729645/6737661/06737765.pdf... (sorry for the paywall, I'm not aware of a free version available online, tl/dr: 30.38% better rate than H.264, 2.49% worse than HEVC), but obviously come with their own set of biases.
I don't know how to explain the discrepancies in the two sets of results, but they at least demonstrate the magnitude of the differences you can obtain by varying how you do the testing.
That iphome paper used a random git checkout from the day the bitstream was frozen, and claimed it was an official release, which always struck me as somewhere on the stupid/devious spectrum.
The first actual VP9 release was about 6 months later, and even then I don't think they'd done much tuning for speed. Some parts of Google get criticised for building open source products privately, but others that develop in the open have their openness abused.
I ran some experiments to investigate the discrepancy.
The largest differences were caused by:
1. A variable quantiser being used for HEVC, but not for VP9 (as you described)
2. Keyframes being forced every 30 frames for VP9 in the first paper
HEVC also had I frames added every 30 frames, but these were not IDR frames, meaning that B frames were allowed to use the information from the I frame in HEVC.
However, in VP9, true keyframes were forced every 30 frames. The way VP9 works this meant that every 30 frames it encoded both a new Intra frame, and a new golden frame.
Making both codecs use a true fixed quantizer and removing the forced intra frames made the results more like Google's own paper.
I guess the moral is to not force frequent keyframes when encoding with VP9.
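In ffmpeg terms that roughly translates to leaving the keyframe interval up to the encoder instead of forcing one every 30 frames; a sketch (these are standard libvpx-vp9 options, but file names and values are placeholders, tune to taste):

    import subprocess

    # Let the encoder place keyframes itself (-g sets the maximum keyframe
    # interval); forcing one every 30 frames is what hurt VP9 in that comparison.
    subprocess.run([
        "ffmpeg", "-i", "input.y4m",
        "-c:v", "libvpx-vp9", "-crf", "31", "-b:v", "0",
        "-g", "240",  # generous maximum keyframe interval
        "output.webm",
    ], check=True)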
At some point comparing highly efficient codecs with metrics like PSNR becomes meaningless, I'm not sure why so much emphasis is still put on it. Verification against human preference is the way forward.
I believe methods like A/B preference testing have been done for audio codecs for a few decades; I don't know why video didn't catch up.
In this respect I look forward to the work done by the Daala guys; coming from the highly successful Opus audio codec, they seem to be very mindful of perceptual optimization.
>At some point comparing highly efficient codecs with metrics like PSNR becomes meaningless, I'm not sure why so much emphasis is still put on it.
Because (some) codec developers have put a lot of effort into optimising for it, and simple numbers are easier to market to people who haven't actually compared the codecs themselves. You can see how successful it is from the many posts in this thread that assume VP9 gives you higher quality at the same file size than H.264.
On2 codecs were known for relatively heavy blurring that improved PSNR stats but looked subjectively worse. Most comparisons I've seen indicate this is still the case with VP9. Of course it does depend a lot on the source material - some things make it more noticeable than others.
I am curious: is YouTube going to re-encode all existing content with VP9? If so, I am wondering if they do anything special to preserve as much quality as possible. Are there any techniques to do that? (For example, certain JPEG transforms such as rotation can be applied with zero quality loss.)
Yep, there's mention of it on an old blog post [1], but you can verify for yourself on the Enhancements page when editing any video - there's a "Revert to original" button.
The VP9 spec was finalized in 2013, so it's hardly unfinished. There may still be optimization potential for the encoder, but that would only speak for VP9, because it would only get better.
I think making a comparison without providing information on how they obtained the results is much worse than the particular choice of codecs pitted against each other.
How much CPU time did they burn on each encode? Which settings were used? Did they pick a frame at random or was cherry-picking involved? Or for that matter, which encoders did they use?
Thanks for the additional points, you're correct. In addition to the non-scientific comparison (I'd expect better from a data-driven company like Google, but this is a marketing post), it also completely ignores HEVC, which is making headway in the wild.
Don't get me wrong, I think the "patent-free" VP9 is really nice, but I'm afraid it won't win if it's not better technically, especially if HEVC patent pool prices are reasonable. Or maybe the only goal of Google is keeping patent licenses affordable by maintaining competition?
> Or maybe the only goal of Google is keeping patent licenses affordable by maintaining competition ?
That certainly is plausible, considering that webm/vp8 probably played a role in the MPEG LA extending its royalty-free exemptions to internet broadcasting.[1]
It also provides google with some flexibility. They can (and do) use VPx now, might switch to HEVC in the future if it proves favorable and could switch back to VPx if some negotiations were to break down. Having options usually is a good thing in itself.
Gotta love Google, yet another unaccelerated codec shoved down my Core 2 laptop's throat to stutter at 10fps while mplayer plays the 1080p h264 version just fine.
I'm using an extension[0] to force youtube to serve me with h264 in order to be able to watch 4K videos & maintain reasonable battery life.
[0] https://github.com/erkserkserks/h264ify