Bypassing YouTube video download throttling (0x7d0.dev)
587 points by 0x7d0 on Aug 14, 2023 | 227 comments



> To bypass this limitation, we can break the download into several smaller parts using the HTTP Range header. This header allows you to specify which part of the file you want to download with each request (e.g. Range: bytes=2000-3000). The following code implements this logic.
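(The article's code isn't reproduced in this thread; purely as a hedged illustration, and not the article's actual implementation, a minimal Python sketch of the Range-header idea could look like this, using the third-party requests library and an arbitrary chunk size:)

    # Sketch only: download a file in fixed-size chunks via the Range header.
    import requests

    def download_in_chunks(url: str, path: str, chunk: int = 10 * 1024 * 1024) -> None:
        # Assumes the server reports Content-Length and honours Range requests.
        total = int(requests.head(url, allow_redirects=True).headers["Content-Length"])
        with open(path, "wb") as f:
            for start in range(0, total, chunk):
                end = min(start + chunk - 1, total - 1)
                resp = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
                resp.raise_for_status()
                f.write(resp.content)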

The last time I read a discussion about this in the yt-dlp repo [1], you can actually bypass it by just adding a range=xxx query parameter (not a header), and it will return to full speed even if your range is just the whole thing.

And IIRC YouTube has already lifted this restriction.

Edit: found the ref [1] https://github.com/yt-dlp/yt-dlp/issues/6400
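For reference, a minimal sketch of that query-parameter variant (the parameter name and behaviour come from the linked issue, not from any documented API, so treat it as an assumption):

    from urllib.parse import urlencode

    def with_range_param(stream_url: str, start: int, end: int) -> str:
        # Append range=<start>-<end> as a query parameter instead of sending
        # a Range header; per the linked yt-dlp issue this avoided throttling.
        sep = "&" if "?" in stream_url else "?"
        return stream_url + sep + urlencode({"range": f"{start}-{end}"})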


I've never tried YouTube, but I have downloaded videos from sketchier streaming websites using the web developer tools.

Almost all of them have the same protection: some code that triggers only when you open the tools, stops the video by creating a debugger statement you cannot skip, and triggers some CPU-heavy code (probably an infinite loop, although I wouldn't rule out cryptominers). More importantly, this code also clears the network request information, making it more difficult to analyze the traffic sent so far. Note to Firefox devs: enabling "persist logs" should persist the logs. Don't clear them!

None of this is perfect and I never found a video I couldn't eventually download (timing attacks ftw), but I do wish I could find a deeper explanation on how this all works.


>Almost all of them have the same protection: some code that triggers only when you open the tools and stops the video by creating a debugger statement you cannot skip

If you missed it, not so long ago there was a submission that evaded exactly this. Their solution is so simple yet effective: Recompiling the browser with the debugger keyword renamed. Made me smile.

https://news.ycombinator.com/item?id=36961445


Honestly recompilation shouldn't even be necessary, browsers should just let us disable the debugger statement when "clever" sites start taking advantage of it. This usage of debuggers to circumvent our tools is abuse and should be literally impossible unless we consent to it.

Our computers are our realms. God giveth and god taketh away.


Just wait until you won't even be able to open devtools on a WEI protected website. For your security, of course.


Can't wait for websites to start voluntarily cutting themselves off from non-WEI traffic. That means we can take it and serve it, and we will.


Chrome Developer Tools allows this. There's a button in the Sources tab to deactivate breakpoints from debugger statements. The button looks like an arrow cut in half.


This doesn't work for debugger statements.


You're wrong, it very much does work for debugger


I believe the reason you and the person you're replying to disagree, is that it works for existing debugger statements, but seemingly doesn't work for debugger statements injected after you toggle the flag.

The "disable breakpoints" button is seemingly performing a one-time imperative action on toggle (maybe "find all the breakpoints + debugger statements in existing loaded code, and patch/unpatch them") rather than being a declarative state-change in the system (i.e. "disable this tab's access to the debugger/breakpoint logic itself").


I love this exchange. It’s so symbolic of how discussion and disagreement in general works


If I recall, you can work around this by putting a conditional breakpoint on the debugger statement and then setting its return value to just `false`.


You just need to disable breakpoints in the 'Sources' tab of dev tools and refresh. Then you can get the .mp4 or .m3u8 URL that can be used with yt-dlp.


> Their solution is so simple yet effective: Recompiling the browser with the debugger keyword renamed

Not exactly what I'd call simple...


Conceptually simple at least :-)

Definitely a clever hack


Try Anti-Anti-Debug [0]. It's a simple extension to bypass those kinds of anti-debugging techniques. Made it in a few hours a while ago for similar reasons.

0: https://chrome.google.com/webstore/detail/anti-anti-debug/mn...


Can you make one for Firefox too? ;-)


It's a simple script; it should be portable to Firefox with minor modifications. The source code is available on GitHub [0].

0: https://github.com/Andrews54757/Anti-Anti-Debug


I don't think this will work out-of-the-box in Firefox. Firefox handles the console object differently than Chrome, especially when it comes to the console.clear() method.

In Firefox, when a console is cleared, it's not just the display that's cleared. The entire console object is reset. This means that any modifications made to the console object, such as redefining console.clear(), are lost when console.clear() is called.

You can try this on this demo [1] website that uses devtools-detector library [2]

[1] https://blog.aepkill.com/demos/devtools-detector/

[2] https://github.com/AEPKILL/devtools-detector


This is great. I was wondering why it disables console.table, but then I went to one of my favorite shady sites I screw around with from time to time (I intercept and patch setTimeout/setInterval to disable a particular function doing constant debug() injection), and lo and behold: tables, dates, and big arrays (not yet covered by your tool) spamming the console :o slowing the devtools window to a crawl.

For console.clear I have this

    window.console.clear = function clear() {console.log(clear.caller, "Asshole script called console.clear()")}
hoping I get a glimpse of what exactly is calling it.


Thank you for this valuable tool; looking forward to using it!


> some code that triggers only when you open the tools

I've seen this technique too, but I feel that this is a major flaw on the browser's side. It should be impossible to tell whether the dev tools are open or not. Surely this can be done, right?


A LOT of things should be impossible for the website to tell: focus, DRM, mobile/desktop.

All three browser engines are anti-consumer spyware. There is no way to convince me otherwise. Why Google and Apple are not fixing this is obvious, but Mozilla just keeps disappointing.

I'm desperately waiting for a FOSS alternative.


There is no reliable way to tell if they are, so developers have to either listen for sudden page height changes (which usually happen when the dev tools are opened, although undocking them avoids that) or use one of the few dirty hacks involving JavaScript objects, but those get patched every year.


One trick that sometimes works for me on Firefox is shift+right-click the video -> This Frame -> Open Frame in New Tab, and the dev tools work there.


Sometimes, especially if you're on the mobile version of the site, the video will be an actual <video> tag you can open in another tab.

Also, the "Open With"[0] browser extension can detect a video/frame's URL when you right click on it, if you want to use that to quickly open it in yt-dlp/mpv.

0: https://addons.mozilla.org/en-US/firefox/addon/open-with/


Have you tried using wireshark to analyze the traffic? That's the first idea I had since you said these pages are trying to detect your browser's developer tools.


You can use something like mitmproxy or HTTP Toolkit to examine things exactly, as those can't be interacted with or detected like the dev tools can.


You can also use mitmproxy to remove or add debugger statements.
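As a rough illustration (a sketch assuming mitmproxy's standard Python addon API, not tested against any particular site), an addon that crudely neutralizes debugger statements in JavaScript responses could look like:

    # strip_debugger.py - run with: mitmproxy -s strip_debugger.py
    from mitmproxy import http

    def response(flow: http.HTTPFlow) -> None:
        ctype = flow.response.headers.get("content-type", "")
        if "javascript" in ctype:
            body = flow.response.get_text()
            # Crude: comments out every occurrence of the keyword so
            # anti-devtools traps don't fire; good enough for poking around.
            flow.response.set_text(body.replace("debugger", "/*debugger*/"))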


How does that work with certificate pinning?


It doesn't, you'd have to unpin certificates in whatever application you're using. Browsers generally don't do certificate pinning though, they'll (correctly) respect your installed certificates, which would include the mitmproxy one.


I see, thanks.


I just use the logs of my filtering proxy.


What proxy do you use?


In the endgame you will have to pay in WorldCoin to keep watching your screen, and you can only earn that untradable WorldCoin by viewing the ads, monitored through the mini Orb embedded in the screen.

You didn't truly believe this was about UBI did you? We solved that one ages ago with bank accounts and KYC.


Nothing with the eye?


Anyone else having throttling problems with yt-dlp lately? I always watch YouTube through mpv, which uses yt-dlp in the background, but this last week it's been terrible. It starts quickly (I've capped it at 500 kB/s, so that's the starting speed), but then after a while I'm getting a second of stream for three seconds of download, so I have to queue up the video a long time before playing. I'm using the git version of yt-dlp and haven't noticed anything related in the git issues.


mpv only uses yt-dlp to get a video URL, then passes that URL to ffmpeg. ffmpeg doesn't implement the workarounds with range headers, so you get throttled. It's possible to have yt-dlp perform the download, pipe it to mpv and make mpv play from stdin, but it breaks seeking to parts that haven't been downloaded yet. There are many issues about this in the mpv issue tracker.
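A minimal sketch of that pipe approach, roughly equivalent to `yt-dlp -o - URL | mpv -` in a shell (assumes both binaries are on PATH; the URL is a placeholder, and as noted above, seeking past the downloaded portion breaks):

    import subprocess

    url = "https://www.youtube.com/watch?v=VIDEO_ID"  # placeholder
    # Let yt-dlp do the (range-aware) download and write it to stdout...
    ytdlp = subprocess.Popen(["yt-dlp", "-o", "-", url], stdout=subprocess.PIPE)
    try:
        # ...and have mpv play from stdin.
        subprocess.run(["mpv", "-"], stdin=ytdlp.stdout, check=True)
    finally:
        ytdlp.stdout.close()
        ytdlp.wait()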


YouTube changes small things in this process all the time. I used to work on an internal editing tool for YouTube videos that needed the MP4 files. Every month or so the editor would break because of a YouTube change and I needed to dive into the debugger to see what they changed and adjust to that.


On a sidenote am I imagining things or do videos actually look a tiny bit better in YouTube?

This really has puzzled me. I downloaded a few favourites and watch them on VLC or Infuse on my AppleTV. In the YouTube app I can use the "nerd stats" to confirm I am viewing the exact same video/audio streams...

... it could be my imagination, but it seems like YouTube does a really subtle kind of filter that makes the "blocky" compression artifacts smoother. It doesn't enhance edges or anything - my guess is it looks for areas WITHOUT edges where there are subtle shifts of colour, and it makes the blocky artifacts less prominent.

... it's really subtle and I still can't tell if it's just my imagination, like my OCD thinking that my downloaded video doesn't look as good, and yet I noticed on YouTube the video feels more vibrant and solid. When I watch my downloaded vid there are these really subtle but noticeable artifacts, often in the background, in the shadows, and these constant tiny little jitters, even on a 1440p video, that make the final picture look not as good.

Am I making this up?

Audio-wise there is definitely a change as well. YouTube audio is always more or less level for me, while with a downloaded video I always need to crank up the volume, which is annoying.

I wish players like VLC or Infuse did whatever YouTube does to make videos just more pleasant to look at. I don't think YouTube changes the colours or does any kind of vibrancy filter, though I may be wrong, but it does things to "level" audio so that you have a more consistent experience going from one video/channel to another.


I can answer about the audio.

YouTube (and most other streaming sites like Spotify etc) use something called ReplayGain. It's essentially a tag that specifies the calculated average loudness of the video/song/whatever (this number is calculated at upload time).

Upon playback, the official YT client knows to use that tag and adjust its volume level accordingly, but I'd imagine either the tag isn't getting downloaded, or perhaps MKV doesn't support ReplayGain tags natively.


The loudness information is derived from the file in the first place, so a smart enough player can re-derive it and normalize audio on the fly. My player (mpv) does this by calling out to libav. It can handle both live audio and prerecorded audio, using different algorithms for each case. This functionality can be enabled with config flag `af="acompressor=ratio=4,loudnorm"`. I will admit that I copied these options from examples without really knowing what they do or how they do it, but they make things much more pleasant.

It's disgraceful that even major movie studios often do such a bad job with audio mixing that I indiscriminately run everything through a filter. This is not a problem with the files I'm using; I find the same thing in the theater. C'est la vie.


Thanks! I've always compared playing music on YT and playing lossless music in Rhythmbox on Linux and wondered why YT sounds better. Now discovered the ReplayGain toggle in Rhythmbox thanks to your comment.


Interesting. Indeed there is the line in the nerd stats overlay:

    Volume / Normalized 100% / 100% (content loudness -0.2dB)
You got me thinking now. I see there is some kind of ReplayGain postprocessor plugin for yt-dlp; however, it's tied to music downloading. I wish yt-dlp had a built-in option of some sort to process that tag.
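Purely as a hedged sketch (how exactly YouTube interprets that "content loudness" number is inferred from the overlay, not documented), you could convert such a dB offset into a linear gain for your own post-processing:

    def gain_from_content_loudness(content_loudness_db: float) -> float:
        # Assumption: YouTube only turns videos *down*, so a positive content
        # loudness means roughly that many dB of attenuation on playback,
        # while negative values are left untouched.
        attenuation_db = max(content_loudness_db, 0.0)
        return 10 ** (-attenuation_db / 20.0)

    print(gain_from_content_loudness(-0.2))  # ~1.0: no attenuation applied
    print(gain_from_content_loudness(6.0))   # ~0.5: amplitude roughly halved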


That explains why YouTube on my TV is consistently 10 decibels louder than all the streaming services.


Interesting! TIL!


What filter the player uses for scaling (and chroma scaling) affects sharpness. Playing a native resolution stream fullscreen should eliminate differences here for monochrome edges. Depending on OS you might be able to toggle rapidly between two fullscreen apps to test for differences (cf ISO 29170-2, which recommends 5 Hz).

Colour shifts (as in, input != output, not gradients) can come from bad handling of video colour space or monitor profile. Also, shenanigans here can have screenshots looking different from the actual application.

'jitters' might be dropped frames, but then you mention resolution. Since you also mention edges, if you're noticing pixellation in the edges of coloured objects, that would be nearest-neighbour chroma upscaling, which I do remember some player using at some point.


I'm "eyeballing" it, switching between Infuse and YouTube on an Apple TV 4K 2nd Gen (2021). The Sony TV is 1080p, but I've been choosing 1440p now in YouTube as it definitely looks better overall (more details).

The downloaded video is also 1440p, same audio/video streams as far as I can tell. So both Infuse and YouTube will do some scaling to the viewport, 1440p > 1080p.

This is an example video:

https://www.youtube.com/watch?v=-VE-tgVOZN8&t=3m30s

So the left side is really dark and that's the area where you'd typically see more compression artifacts right? Due to the algorithm thinking there's no detail there. So in YouTube it feels solid. But in Infuse I notice tiny little jitters there and it just distracts and I'm guessing it's those really subtle "grainy" things that take away from the picture feeling really clean and smooth.

Now when I switch back to YouTube and really look for it, at the same timestamp I can notice some artifacts, but it's just not as noticeable... so I'm still wondering what is going on. Does YouTube also do some kind of brightness/contrast filter, perhaps similar to the audio? Due to playing back on a TV, maybe?

Without being able to take screenshots it's really hard to tell, since in the time it takes to switch between the apps you get flashes of brightness/darkness and the eyes are affected by them. All I can tell is that in YouTube the picture just feels smoother and cleaner overall.

edit:

Another example gives some hints, perhaps:

https://www.youtube.com/watch?v=-VE-tgVOZN8&t=6m32s

So now I am checking out the video on my Linux desktop with a 1440p monitor, in YouTube and in VLC (Ubuntu, AMD GPU).

So interestingly in the background behind Ally to the left on VLC you can clearly see the banding of blue colours, there are lots of squares which are jittering, like the fuzzy grain on a night cam. It's really distracting.

In YouTube (same desktop, via Google Chrome), there is colour banding in the blue background to the left of Ally, but it is not as noticeable because it's like the squares have been averaged and the edge of the bands is smoother. While you can see some tiny shaking there in the colour bands if you look for it, it is not distracting from the overall image.

Hmm.

UPDATE /SOLVED?

OK, after redownloading the 1440p 271 stream (VP9), I can confirm the color banding is the "smooth" one I saw in YouTube.

Something's fishy with YouTube: I did download the VP9 stream ~3 weeks ago, and since then they have changed the streams and the bitrates are lower. There are these new "6xx" streams, while the old 1xx/2xx streams appear to have a lower filesize.

But oddly enough, the 400 MB file I just downloaded has the smoother, nicer picture, whereas the 500 MB file I downloaded weeks ago has the squares/fuzzy/grainy effect. VLC tells me both are VP9, so hmm.


Ah that makes sense - the first video was only uploaded 3 weeks ago. It's known that, on youtube, formats become available in increasing resolution after upload as they finish encoding. Your experience now shows that the higher-res streams are encoded in a rush at first and later replaced with better compressed versions.

.

I've never seen such a long format list in youtube-dl before. Are 6xx new? Apparently,[1] they were introduced together with that 'premium 1080p' this April.

Comparing older 4k videos to your video: this one[2] now has 6xx, and 4xx are gone, and curiously the reported bitrates of all streams have since changed (reencoded?). 137 stayed about the same this time, but 18 dropped from 730k to 493k. For this video[3] 4xx are still available.

.

616 is not the actual premium 1080p as the posters at [4] think, is it? Currently youtube.com chooses 248 when playing [2], but yt-dlp can list and download 614 and 616 without any account cookies.

Rather, 6xx seem to comprise VP9 spanning a medium, high, and sometimes low bitrate in all resolutions. Is YT considering replacing the older formats with these?

I just hope the original 18 and 22 remain for older videos, where any difference in quality also matters the most. When still available, in most cases the H.264 streams with creation_time prior to ~2013 are dramatically clearer than any more recent formats.

[1] https://github.com/yt-dlp/yt-dlp/issues?q=605+604+603+sort%3...

[2] https://github.com/yt-dlp/yt-dlp/issues/1863#issue-106877303...

[3] https://github.com/yt-dlp/yt-dlp/issues/389#issuecomment-103...

[4] https://github.com/yt-dlp/yt-dlp/issues/6770

Let's not discuss how often the H.264 streams for new videos are higher quality than the VP9 ones.


Have you used `mediainfo` to check the codecs, bitrate and encoding settings? Curious what you'd find there...


Hey thanks I didn't know about this tool.

Interestingly, since I have the older download from ~3 weeks ago, I just ran mediainfo on it and on the new download from today. Then I just switch tabs in the terminal so I can easily see what changed.

Old / New

    Codec                 vp09       / vp09
    Filesize              488 MB     / 391 MB
    Overall bitrate       3274 kb/s  / 2620 kb/s
    Bits/(Pixel*Frame)    0.036      / 0.028
I don't see anything else significant.

It seems to me in recent weeks YouTube has changed the streams, added new "high bitrate Premium" streams (edit: WHILE lowering the bitrate on the older existing streams like 137!), and perhaps the one I downloaded earlier, despite being a larger filesize, was not encoded correctly?

The new file, despite being 20% smaller (~400 MB instead of 500), has the smoother color banding and doesn't show the ugly jittery/grainy artifacts.

I used to think a larger filesize is better, but I guess I'm going to get the VP9 from now on...


I have noticed that Youtube often re-encodes videos, replacing the data for a certain stream type/number with a new encode.


Yes that is probably what it was!

I'll remember that even two+ days after the initial upload there may still be re-encoding. I thought this only happened for a few hours after the creator uploads.

So a smaller filesize is not necessarily a sign of lower quality. In this case it's because the video gets a quick first encoding, like zip with fast compression, and then it gets re-encoded, presumably by something much more CPU-intensive.

That said, going from 500 MB to 400 MB is a bit dodgy, but what do I know, maybe VP9 is that good.


Interesting! Thanks for running mediainfo on them to compare.

When encoding videos there's a "speed" parameter that basically tells the encoder how much CPU time to spend compressing the frames. Spending more time results in a smaller output size while maintaining the same level of quality, but takes much longer to encode. I'm sure that's part of what YT is doing here: an initial quick encode to get the video live, then another pass to reduce filesize for long-term storage. Good find!
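For illustration only (a sketch assuming ffmpeg with libvpx-vp9 is installed; the file names are placeholders, and this is not a claim about YouTube's actual pipeline), the speed/size trade-off looks roughly like this:

    import subprocess

    def encode_vp9(src: str, dst: str, cpu_used: int) -> None:
        # Constant-quality VP9 encode; lower -cpu-used values spend more CPU
        # time searching and typically yield smaller files at the same quality.
        subprocess.run([
            "ffmpeg", "-y", "-i", src,
            "-c:v", "libvpx-vp9", "-crf", "32", "-b:v", "0",
            "-cpu-used", str(cpu_used), "-an", dst,
        ], check=True)

    encode_vp9("source.mp4", "quick.webm", cpu_used=5)    # fast, larger output
    encode_vp9("source.mp4", "careful.webm", cpu_used=1)  # slow, smaller output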

I'm guessing that the perceived higher quality in the smaller file is because the encoder was able to find a more accurate way to represent those frames with the same or fewer number of bits since it has more time to search for the optimal encoding.


A single youtube video has a couple of available video, audio and combined streams. It might be your browser picks a different stream to what youtube-dl (or its fork) takes by default.

I'm not sure what the defaults currently are for youtube-dl and its forks, but for a long time it defaulted to the best combined stream. However the best distinct audio and video streams are higher quality.


Players like VLC do much better and let you make your own arbitrary adjustments via GLSL :)

In particular, I like the anime4k shader pack, which is ML-based but runs in real time in mpv (and I think VLC as well). While it is tuned for anime (as is obvious from the name), it has decent denoise and deblur which often make YT content more watchable and a restore step that does a really good job with compression artifacts but is a bit too tuned for anime so may not always work, or even make things worse. See https://github.com/bloc97/Anime4K/releases


Maybe the video pipeline in the browser is different compared to the one VLC uses? Can you do a simple HTML document with a <video> element referencing your local file to compare?


If anyone's still reading this or anyone cares, here is what I found out:

- the issue I was experiencing was with a lower-quality VP9 encoding, from a video that was recently uploaded (explained below)

- though I didn't trust the smaller-filesize VP9 streams initially, they in fact look noticeably better - the picture is smoother and cleaner, and the compression artifacts are less visible. Where the AVC (H.264) streams can have jittery/glittery distracting dots moving in the background in areas with subtle gradients (e.g. a plain wall), VP9 has none of these; those areas look smoother and cleaner, and it gives an overall nicer-looking picture without compromising the detail as far as I can tell

- the Opus audio stream appears to have less of the ReplayGain issue, I'm not sure - but since I downloaded Opus instead of the 140 m4a stream, I notice I don't need to adjust the volume compared to viewing the same video in YouTube - and since the codec is newer anyway, the filesize is about the same or a tad smaller, and it is 48 kHz instead of 44.1 kHz, I am going to download Opus from here on

- a very confusing thing is that, for a recent upload, you can apparently have an initial VP9 stream of, say, 500 MB which is in fact no better than the AVC one and has the grainy artifacts - and that VP9 stream gets replaced weeks later by one significantly smaller (like 400 MB vs 500 MB!) that looks way better, which suggests there was a first pass with low-quality encoding, replaced by a higher-quality encoding later - therefore my assumption that a larger filesize is better was wrong


Doesn't youtube ultimately use the browser's video player now that flash is no more?

In addition to what the other poster said about checking if you're downloading the exact same codec etc as you're watching, you could try playing the downloaded video with the browser and see what it looks like.


Most websites use JS to tell the player what to do, so the JS decides what streams to play based on asking the browser what it supports.


macOS has an AI video upscaling algorithm that was introduced in version 12


Any more info on this? I can't find anything via Google or on the Monterey features page.


I don't understand your reply.

Why don't you just take a screenshot at the exact same timestamp? It's that easy and would have taken you less time than writing this up.


I'm watching on an Apple TV, comparing Infuse with YouTube, so I can only "eyeball" it. Wish Apple TV let me take screenshots; it would be fun!


I don't know what Infuse is but I could screen record Apple TV from the web browser just fine back when I had that subscription. Otherwise, try a VM


I think he means he's watching YouTube on an appletv device.


> On a sidenote am I imagining things or do videos actually look a tiny bit better in YouTube?

I'm not [ADVERTISEMENT] sure, because my YouTube viewing [ADVERTISEMENT] experience nowadays is so [ADVERTISEMENT] [ADVERTISEMENT] frequently interrupted with ads that it [ADVERTISEMENT] breaks my focus. #pleaselikeandsubscribeandclickonthenotificationbell


Get the SponsorBlock and ad-block extensions, or a client with them built in, and never think about this again.


I block on PC, but on phone and TV I'm stuck with their damn ads.


Or you know, pay for a subscription.


I can't, even if I want to. Premium is not sold in my country. So, more creative solutions are called for.


YouTube without Premium is unbearable for me. I'd rather pay a few € than subject my family to the advertising terror, and even the creators get more revenue from a Premium view (6x? any YT creator here who can chime in?).

Imagine we could have Internet Premium and never see ads again.


I am glad to pay 7 € for YouTube Premium Lite.

For a while I didn't consider the "full" Premium because I was always shown the 17 € version, which is for families.

Now suddenly I'm being shown there is a 12 € option, which is for a single person.

Still, for someone like me who doesn't care about music or movies, I'll stick with the Lite version.


I'm constantly surprised when YT deploys another half measure against downloaders when Google also owns Widevine. I wonder what their reasons are for not using it.


The software version of widevine would be so thoroughly broken in a very short time that it would be bypassed by any one-click downloader addon. Nothing would change for users and YouTube would have the overhead for widevine.

Using the hardware based version could cause a lot of problems with unsupported devices.


As another commenter pointed out, Widevine is already in use on many streaming sites. Where are the one-click downloader addons for those? If they exist, I have not been able to find them.

(My use case is not so much downloading a video; I just want to be able to watch my Amazon Prime videos on my Linux computer at a better resolution than 720p.)


Not exactly one click but there are Python scripts where you basically put in your credentials and the TV show ID: https://github.com/search?q=widevine&type=repositories&s=upd...

I used results from this Github search page (especially pywidevine) a while back to make a Hulu downloader in Go which I eventually had to remove: https://github.com/chris124567/hulu

I wasn't really a part of this scene, but there appears to be some sort of weird competition among people involved in writing this kind of software, so occasionally device keys would leak when they tried to get at each other. That was great for me because it meant I didn't have to extract keys from a phone or an NVIDIA Shield or anything annoying like that.


You are probably already using the software version of Widevine (L3) to watch 720p content in the first place.

There are relatively easy ways to download L3 content, but they are not as common because higher resolutions are available on illegal streaming sites and torrents. These sites break hardware Widevine or use other attacks to get the material.


While that sounds reasonable, is there any evidence that this is the actual reason? Widevine is also used by other ubiquitous services such as Netflix, Prime, Hulu, etc.. and yet WV still remains an insurmountable barrier for most people, even in its weakest form (L3).


Good points. The difference is that on streaming sites, L3 widevine will only allow low resolution playback. They are also tied to a paid account, making it much more difficult/risky to download content. Even if the encryption is broken, there are often ways to detect such users for a limited time.

For professional content, there is simply not much demand to develop a user-friendly way to break L3 widevine, as the market is already served by reasonably convenient illegal streaming sites and torrents that allow for higher quality video.


I don't think YouTube really cares if you pirate their content or use a third-party client.

What they care about is you wasting their bandwidth. For an ad-supported video streaming site, bandwidth is normally more expensive than revenue - Google only manages to make it just about work because they have probably the world's cheapest bandwidth, due to being able to bully ISPs into peering with them for free (they don't let you peer with Google for just Google Search but not YouTube).

All these throttling measures are simply trying to reserve most of the bandwidth for real users, not people scraping all the content.


Sooner or later, YouTube will do the same thing as Twitch, which is to dynamically splice ads into the video stream, making it impossible to block with current mechanisms.

They don't do it yet, probably because they don't see the need quite yet. But I have no doubt that it will happen sooner or later.

Adblocking will have to evolve to a new level to block such things.


Very surprised they haven't started this long ago. One might suspect the problem is ad play and click accounting.

Anyway, when they start delivering ads in-band, the next step for blockers is to identify that first keyframe in the player by using a pool of shared signatures, right? So then player clients will need adblock plugins which will have a sizeable signature distribution infra and grief for clients.

Then the anti-blocker might begin adding, per-play instead of per-video, a pixel or something to throw off the signatures, massively increasing THEIR video distribution infra. Ad infinitum?


> Ad infinitum?

AI controlled adblocker is the end game!


Sponsorblock is almost this... But it uses real human labour to replace the AI, and works really well.


Sponsorblock is absurdly wonderful, but needs an option to exclude certain channels. I want to watch Internet Historian's ads.


The feature already exists [1].

[1]: https://github.com/ajayyy/SponsorBlock/issues/547


>real human labour to replace the AI

ah yes, AAI (Artificial-AI) AKA I (Intelligence), or "Crowdsourcing" if you're looking to use an older buzzword. I do think there's a few models trained on sponsorblock already, but they're not great.


...which will be countered by AI-enabled midroll ad generation. A neural network that splices two video clips together must already be a thing, right? The advertisers would probably want this even without adblockers, since everyone already has an adblocker in their mind called 'inattentional blindness'. Using AI to subtly segue into the ad rather than cutting would stop some portion of users from tabbing out, checking their phones or just going AFK during ad breaks.


I think a lot of people watch YouTube on their TV or phone or on a browser without an ad-blocker. I rarely watch YouTube on my laptop.

I think Twitch users tend to watch on their computer mostly? And I think Twitch viewers are more techy so they would be more likely to have ad blockers.

I have no data on any of this. I'm just throwing shit at the wall.


Then how do they report back to the advertisers that the ads weren't skipped?

Edit: Or make them unskippable?


which is to dynamically splice ads into the video stream - making it impossible to block with current mechanisms.

Many VCRs could do that, and stop/start recording to skip ads, as that was the only way to do it.


splicing video is pretty easy - there are certain points - key frames - where video streams can be spliced with nearly zero computational overhead, no loss of quality, no loading delay, etc.


I don't think what's stopping Google is a technical-difficulty problem, but a scale problem (as well as a lack of real need atm).

I suspect that Google doesn't actually lose too many views to blockers, as mobile accounts for a large fraction of YouTube's traffic (and so far, not that many people actually use a hacked YouTube client to view videos).

It's probably cheaper and faster to have a pre-encoded video, cached at the edge.


Adblocking is very much on Google's radar. But they realise it is a cat-and-mouse game - and whenever you start playing that game, you run the risk of ending up in a position worse than the one you started with. They currently get 80-90% of the ad impressions they try to display, which is pretty good compared to a hypothetical future where someone like Microsoft makes an adblocking-by-default browser and courts force Google not to block them.


Indeed, we've done exactly this with production-quality adverts, where we'd add real-time information (e.g. betting odds) into the ad at selected points.


> I wonder what is their reasons for not using it

- Introduces a decryption step, which is slow

- Forces software video decoding, which is slow

- Web browsers only support the weakest form of Widevine which is ineffective

It would effectively push a significant portion of their user base off the platform while not being very effective in its goals.


> Introduces a decryption step, which is slow

It's not slow; after you decrypt the AES key, you are using the hardware AES instruction set supported by most current CPUs.

https://en.wikipedia.org/wiki/AES_instruction_set


Relatively speaking. You might not notice it when playing back a single video, but I promise you that you're not going to have a good time if you try to play back multiple high bitrate videos on slightly older hardware (think HTPC).


> Relatively speaking. You might not notice it when playing back a single video

So like 99.99% of use cases?

> but I promise you that you're not going to have a good time if you try to play back multiple high bitrate videos on slightly older hardware (think HTPC).

That could be the case, not disagreeing here, if you use old hardware and play multiple high-bitrate videos and multiple streams at once, but that is a specialized niche example.


> - Forces software video decoding, which is slow

No it doesn't. Heck, forcing software decoding is actually one of the ways to force Widevine down to lower protection levels on general purpose hardware.

> - Web browsers only support the weakest form of Widevine which is ineffective

It's not foolproof, but it would certainly make tools to bypass it clearly illegal in most of the world.

There are efficacy reasons for not doing it at the backend level, but Google has required anything that wants to ship YouTube to support Widevine for a very, very long time now.



Read up on the meaning of the word "most" when you feel the need to snipe contrarianism without adding to the debate.


Things such as libdvdcss or libaacs are maintained by VideoLan in France. It’s not a random country in this context.

https://www.videolan.org/developers/libdvdcss.html

https://www.videolan.org/developers/libaacs.html


That's not contrarianism, most of the EU is in this situation. Basically everywhere where you have taxes on private copies, you must still be able to make the copies somehow, otherwise the tax would be repealed.


> No it doesn't.

If you're using a PC with a Nvidia GPU, run `nvidia-smi dmon -s u` and start playing a random Youtube video in Chrome. You'll notice how dec% moves from 0% to at least 2%. Pause, and start playing Widevine protected video and notice how dec% stays at 0% because decoding is happening on the CPU.

> It's not fullproof but it would certainly make tools to bypass it clearly illegal in most of the world.

Good luck, copyright infringement is already illegal and yet that hasn't stopped it from being widespread. Tools and techniques to bypass Widevine L3* are widely known and available (yes, even on GitHub).

I was being generous in my previous comment. In reality, deployment of Widevine L3* should be shunned at least as much as Proof-of-work cryptocurrencies. It's completely ineffective in protecting content, it burns unnecessary CPU cycles multiplied by (potentially) billions of users, and significantly degrades user experience.

Even Widevine L1* is ineffective in practice. Techniques to bypass it aren't available to the average Joe, but of course there are groups that will download, decrypt, and re-upload the newest 4K streaming releases to torrent trackers within an hour of them appearing on streaming services.

*edit: Mixed up L3 and L1


> If you're using a PC with a Nvidia GPU, run `nvidia-smi dmon -s u` and start playing a random Youtube video in Chrome. You'll notice how dec% moves from 0% to at least 2%. Pause, and start playing Widevine protected video and notice how dec% stays at 0% because decoding is happening on the CPU.

It's because Widevine has an embedded decoder in its lib, and it's using CPU instructions, but from the user's perspective it's not a huge change on modern CPUs, as most have specialized instructions to help with decoding H.264 etc.

> Widevine L1* is ineffective in practice. Techniques to bypass it aren't available to the average Joe, but of course there are groups that will download, decrypt, and re-upload the newest 4K streaming releases to torrent trackers within an hour of them appearing on streaming services.

There are no "techniques to bypass it"; the only way currently to get L1 streams is to use legit hardware keys from devices on which you can exploit the secure enclave / extract the HW keys.


> but from user perspective it's not a huge change on modern CPUs as most have specialized instructions to handle decoding of H264

There are no "instructions to decode H264"; there is dedicated hardware acceleration like Intel QSV and AMD VCN, but these get bypassed just like Nvidia's decoding acceleration from my previous example. All of this is trivially observable: playing back DRM-protected video wastes an obscene amount of resources, relatively speaking.

From the user's perspective you'll notice stuttering, unusually high CPU usage, dropped frames, and more, especially once you try to play multiple videos at once.

> There are no "Techniques to bypass it", the only way currently to get L1 streams is to use legit hardware keys from some devices

That's exactly what I meant. Being pedantic over my choice of words isn't very productive.


> There are no "instructions to decode H264", there is dedicated hardware acceleration like Intel QSV and AMD VCN, but these gets bypassed just like Nvidia's decoding acceleration from my previous example. All of this is trivially observable, playing back DRM-protected video wastes an obscene amount of resources, relatively speaking.

For L3 you are just using SIMD/vector instructions compiled for the specific platform, so they are specialized CPU instructions (not general-use) that help with decoding. And L3 is now mostly 720p and low-bitrate 1080p on the majority of streaming services that people use; you would need VERY old hardware not to be able to handle it. I was watching 720p/1080p H.264 videos 15 years ago with only CPU decoding without ANY issues; most of the world did. So that's just not an issue. If we are talking about L1, then you have hardware acceleration, so your point is invalid in that case.

> From user perspective you'll notice stuttering, unusually high CPU usage, dropped frames and more, especially once you try to play multiple videos at once.

Yeah, because 99% of people are playing multiple Widevine videos at once on their 20-year-old hardware... come on.

> That's exactly what I meant. Being pedantic over my choice of words isn't very productive.

I'm not being pedantic. You are not bypassing a lock in a door when you use a key, are you? "Hey honey, let's bypass our neighbour's door lock using his key so we can enter his house" - no one says things like that. If you meant what I meant, then you just used the wrong words to describe it. Your choice of words has a different meaning, which isn't very productive.


The main reason is that it hits their CDN cache efficiency a bit, which costs money, and it's another set of key management systems to look after and operate and at YouTube's scale you want to minimise them as much as possible.

It also paints a very big target on Widevine Level 3's back.

But ultimately it's just a financial equation. What Google are losing from ad blocking isn't quite worth pulling the WV lever yet, but given that it has clearly become enough to prompt softer measures, and the pressure they are likely under from music labels, I expect a wider rollout will happen in the next few years.

If I were to predict, it will probably initially be "any video containing label music or studio clips picked up by Content ID", and maybe an opt-in tag for other creators at first. Those are the ones that are much more useful for monetisation anyway, and you don't lose all your CDN benefits at once.


Google probably wants to discourage third-party clients but allow people to archive. Quietly taking this half measure is the perfect and only solution to achieving this goal.


YouTube is compatible with a lot of platforms: old browsers, ancient smart TVs, ancient Android, etc. I would guess Widevine isn't.


Widevine has been a mandatory requirement for any OEM pre-installing YouTube for something like seven years now. There is not much out there that Google would care about EOL access for.


Widevine is used in some youtube videos. Not all, though; not even a high percentage -- I've only seen it in certain music videos. I'm guessing it's on a paid license basis…


Maybe they care about not cutting off devices that don’t support it. TVs, ARM Linux, etc. While making downloading videos just annoying enough that people don’t bother.


I always wondered how YouTube distributes videos; it's the smoothest video platform. Even when I had crappy internet it worked fine, and not every platform works well in South America. The closest to YouTube is Netflix, but it lags behind a lot.


They have cache servers deployed into the ISP networks close to your home. The ISPs allow them to do this because they also benefit from it: Their bottleneck is the connection from their own network into the wider internet backbone, and having cache servers for the big CDNs takes a huge chunk of load away from that bottleneck. Here's documentation about a similar setup for Netflix: https://openconnect.netflix.com/


This is not a universal answer and I think deserves some correction.

1. Majority of ISPs do not host any cache for Google content

2. Credible ISPs do not have bottlenecks at the transit or peering level

3. Netflix makes use of much more local caching but their model works very differently to Youtubes

4. The concept of "internet backbone" does not really translate to reality. Peering is significantly more mesh-like than that, and transit more diverse.

Source: I have owned multiple ISPs, and still do.


Have you ever tried to download videos from YouTube? I mean manually without relying on software like youtube-dl, yt-dlp or one of “these” websites. It’s much more complicated than you might think.


A very long time ago, I worked on a Perl script to do just that (https://www.perlmonks.org/?node_id=636777). Of course, the problem with this sort of script is that it keeps chasing the changes YouTube makes precisely to prevent video downloading.


Yes, by injecting my own userscript using my (judging by WEI, not for long) USER AGENT. I don't even screw around reimplementing their signature/n decoding/throttling functions; I grep for player.js with match(/(?:player\/([a-zA-Z0-9_-]+)\/)?(?:html5player|(?:www|player(?:_ias)?))[-\.]([^/]+?)(?:(?:\/html5player(?:-new)?)?|(?:\/[a-z]{2,3}_[A-Z]{2})?\/base)\.js/), then grep in that for the relevant functions and call those directly. You could say that from YT's perspective everything is 100% kosher; it's their own DRM functions unlocking the .mp4 link for me :)


Like, press the "Download" button and pay for Premium?


YouTube Premium doesn’t let you download the video file.

Even if you do it from a browser, it is only accessible from the YouTube website run offline as a SPA.

Even then, downloaded videos can only be played for 29 days before having to reconnect most of the time, with some regions restricting it to 48 hours.


I would totally buy Premium subscription if they let me download the videos, and I told Google that. The way it is now - no deal.


Meh, Google know that nobody would pay for that as a selling point when it only applied to a third of videos, and they will never get copyright clearance from third parties for more than that.


I think you misunderstood what I said. I would gladly pay premium/subscription for videos that are OK with it. I don't care about those that don't, they just won't get my subscription.


I think the parent means the YouTube app. On iOS (and Android/ChromeOS?) you can actually download videos and watch them offline if you have a Premium membership. But then the videos are under the control of the app.


Yes, and I replied to that. It’s not really an alternative to an actual video download.

It’s not the raw video file, you can only access it from the app or website, and there’s restrictions on how long you can be offline before the app/website will stop you from watching it.


You can't get .mp4 though. It's there somewhere inside YT app's persistent storage but there is no "share" button for it.


Premium is not available worldwide.

I tried to trick Google by creating an account using a VPN in Europe. I even managed to subscribe to Premium. Even with an active Premium subscription, the YouTube app won't let me use its features such as downloading, background play, and picture-in-picture (you know, basically everything).

Because "you live in the wrong part of the world". Creating a problem and selling the solution is what they do.

Like, I don't even care about downloading, but Google intentionally cripples their mobile website experience by suppressing PiP and background play. It would cost zero dollars not to do this, but they did it anyway.


It's baffling that they enforce those fake restrictions in parts of the world where the product that would lift them doesn't even exist.

The fact that Apple allows Google to resell its multitasking and PiP features as part of Google's own subscription is pretty un-Apple.


[flagged]


Not sure what part of not using your suggested command line tools but wanting to just use the official version like everyone else warrants calling someone openly stupid


Most of my grievances are related to the mobile experience, so I'm not sure why you are talking about mpv.

I barely have any problems on desktop, because issues like no PiP and no background play just don't exist there.


Technically, all interesting.

Ethically, if you don't only think “fuck Google”, I feel like it's reasonable to stop after the first optimization (“pass the real browser test to get regular browser speeds”). There you're not “wasting” any more of YouTube's resources than a browser user with ad-block.

Getting full Gb/s without paying anything feels to me like you're pushing all the ad-blocked users' luck.

But then again, fuck Google I guess?


Technically, everyone pays for connecting to the network. I wouldn't say it's unethical to slam YouTube with full speed downloads but at the same time I wouldn't judge YouTube if they take measures against it.

I'm using a browser extension to show the full size of images on social media on hover, and Instagram went crazy about it, warning about unusual activity and threatening to lock my account. I looked into it to see if the extension does any scraping behind the scenes, but no, it all seems fine. My assumption is that Meta detects requests happening in the wrong order, which might indicate data scraping, due to the non-standard client behaviour.

So, measures and countermeasures. I guess it's up to YouTube to implement their countermeasures and up to the scrapers to implement theirs, but overall, at some point they should be able to limit downloading to twice the playback speed, because legitimate consumption wouldn't happen any faster on a legitimate client.

But honestly, downloading shouldn't be restricted. The content is protected by law anyway, and most people have legitimate reasons for downloading videos. It could be for archiving, because the video has some value for you; it could be for analysis; or it could be about creating content based on the video (like downloading a movie trailer to extract parts for your movie-review video). The content on YouTube is not created in a vacuum: once published, videos create the environment in which new videos will be made, and this creative freedom is important unless we want the current videos to be the last videos ever made.


> I'm using a browser extension to show full size of images on social media when hover and Instagram went crazy about it, warning about unusual activities and threatening to lock my account. I looked into it to see if the extension does any scrapping behind but no, it all seems fine.

I've used such an extension and found that dragging the cursor over your screen might fire like 25 requests. You might simply be rate-limited.


Maybe. I'm just speculating about the reason, of course, but I haven't noticed anything strange like that. That said, the pattern of requests from the client should be obvious to the backend; preventive measures can be designed around it, and the rate is one of the obvious signals.

Removed the extension anyway


I wonder how abusive this is though. In the end you are downloading the same amount of data, but in a shorter time. You are utilizing more bandwidth but you go away earlier.

I think the original browser use case is tuned for the common occurrence of not watching the whole video. But if you intend to watch (and archive) the whole video to begin with then I don't think this eats away Google's bandwidth more. OTOH it probably has more overhead due to the amount of connections.


> I wonder how abusive this is though. In the end you are downloading the same amount of data, but in a shorter time.

In the end you're downloading a lot more data in total, as what ends up happening is you download all sorts of stuff that you never end up getting around to actually watching (data hoarder syndrome). Whereas if you could only watch that stuff during live playback, you'd be downloading much less data in total, as there are no bytes wasted amassing a library that never gets viewed.


> In the end you're downloading a lot more data in total, as what ends up happening is you're downloading all sorts of stuff that you never end up getting around to actually watching (data hoarder syndrome).

Well that's an assumption on the intent of downloading. I don't download videos that I don't eventually watch. But typically it's for archival, I just hate my favorite videos going away.


> I don't download videos that I don't eventually watch.

I doubt it.


Well, you go away unless you're another organization that proposes to mirror all of YouTube and eat its lunch.

Google can't only think about humans, it's got to also think about competing organizations. That complicates things.


That's their problem to deal with. Why is everyone here going all 'think of the poor corporations'?


Well, when they "deal with it", we'll all be complaining even harder. The fact that they've gone relatively light on obfuscation and security against things like yt-dlp for such a long time is basically tacit approval at this point. I wouldn't want to bite that hand.


By the same logic, fare evasion is "metro company's problem to deal with".

It technically is, but if someone makes a programmer's salary and still jumps over turnstiles, I consider it a big red flag over their personality.


I think you're reading it as "as a user, you must not do this because...", when the GP meant it as "as Google, the reason I'm doing this is because..."


That organization would probably parallelize the download at the level of multiple videos, not necessarily at the subrange level within a single video.


What is complicated about throttling downloads to, let's say, X times the bitrate of the video? I am sure Google's engineers can implement something even more complex.


Is it technically or financially feasible to mirror YouTube?


Yeah, I also don't think that the increased rate is a problem per se, but I also doubt that a majority of YouTube views cover the entire length of the video, hence downloaders probably do use more data.


Any youtuber, myself included, can attest that a majority of youtube views demonstrably don't. A key factor in finding success as a youtuber is getting better at retention, hence all the ridiculousness and MrBeastness: some people specialize in retention, and they do better.

It's weird from the standpoint of someone who sets out to watch an entire thing, but almost nobody sticks around while watching videos. It seems like the mass of youtube viewerdom are bouncing around like mad, all the time.


not unlike "zapping" tv channels


It's FOMO on a second-by-second basis. No matter what you're watching now you could be watching something better. The UI even somewhat encourages this with the way it displays a list of additional videos below what's playing now.


By default, it's even right next to what's playing now. It's only below when you're in theater mode.


>Ethically, if you don't only think “fuck Google"

Companies do not have ethics, only interests. It's only natural I behave similarly.


> Companies do not have ethics, only interests.

While that's certainly true (although a simplification - they are just managed by people with low ethics), we can do better. If I behaved like the managers of Google, Meta or Microsoft, I'd be ashamed of myself.


I despise this philosophy. Megacorps are shit, so we should be better than them? It's a popular theme because it makes one feel like they have control over the world around them: they just have to be better than it.

No. I shouldn't have to work harder myself to somehow cancel out the evil in the world. Because if we were less cowards and calling them for what they are - evil - they wouldn't hold so much power over us.

Reminds me of that scene from Mr. Inbetween about bullies. Bullies exist because we're told to ignore them and get punished if we retaliate, so they get away with it.

--

But without getting sidetracked. Companies have no emotion, no particular sense of ethics, and least of all, they don't need people to defend them, unless they're called a lawyer.


> No. I shouldn't have to work harder myself to somehow cancel out the evil in the world.

Not only should you, but you should also support countermeasures that make it harder for evil, so that eventually you won't have to work harder.

If there is one reason, do it for your kids, to show them that a better world is possible.


"Because if we were less cowards and calling them for what they are - evil - they wouldn't hold so much power over us."

Exactly, if we held public demonstrations outside the offices of book publishers, Sony, RIAA, and similar greedy bastards, and it became the norm to snub our noses at their employees then things would soon change.

For instance, we ought to be demonstrating in the streets over how these bastards are hounding the Internet Archive, but we're not.

If we were, then these companies would quickly change their tune and think twice before launching such lawsuits.

Trouble is we're not out there demonstrating. And it's only a tiny minority of the population who actually care about such things—those of us posting here on HN etc.—who do. We're such a small force we couldn't escape from a wet paper bag on the deck of a sinking ship let alone take on the might of these greedy corporations.

Cory Doctorow has said this many times although he's not been as blunt about it as I am. I've followed this for decades and I reckon it's essentially a lost cause.

Even if we could get politicians to agree to change laws they could only do so around the edges as they've signed international treaties, Berne, WIPO, etc. which prohibit signatories from exiting. Any country that left the treaties would have sanctions placed against it.

These corporations have not only won but they've implemented a system that's irreversible, like a ratchet, every one of their cog-like actions squeezes us consumers further and there's fuck-all we can do about it.


Agree very much with this. To mix metaphors, megacorps see you as cattle to be fleeced. Any public good they might do is a public relations stunt or to take advantage of something, e.g. open source development. Yes, there are people in charge, but their duty is to the corporation. Corporations do not have ethics. They cannot have ethics.

A person should be ethical to another person, because that person can reciprocate. A person should only interact with a corporation in terms of legal frameworks, which are also amoral, because that is the only "moral" framework within which a corporation can act. The fictional personhood of a corporation is just that, a fiction.

(There is somewhat of a sliding scale on this, in that a small corporation formed to protect a fruit stand or something is certainly not Microsoft and shouldn't be treated as a Microsoft.)


>Megacorps are shit, so we should be better than them?

To do otherwise plainly makes you shit as well.


> Because if we were less cowardly and called them out for what they are - evil - they wouldn't hold so much power over us.

three years of covid nonsense -> crickets

video download speed throttle -> rage

Something is wrong here.


That's because there was a very good reason for that Covid "nonsense" as you put it: reducing infection rates and keeping healthcare systems from collapsing. If you disagree with that, then I'm sorry you aren't living in reality and believe in idiotic conspiracy theories and microchips in vaccines and the like.

(That said, there really was some real nonsense, such as certain dumb countries that penalized people for going on bicycle rides in rural areas by themselves.)

There's no good-for-society reason behind throttling video download speeds.


> There's no good-for-society reason behind throttling video download speeds.

Google certainly would disagree. Their argument (I assume) would go something like: if people don't pay for content up front and also don't watch the ads then these services can't exist, therefore if these services existing is better for society than them not existing, then it follows that <DRM, etc., fill in the blank> is good for society.

You might not like it but a great deal of our economy is built on that premise, so a lot of people and companies have a stake in holding onto such arguments.

Such an argument is a serious and honest one, and should be responded to with some care. Outright dismissal is not interesting.

> That's because there was a very good reason for that Covid "nonsense" as you put it: reducing infection rates and keeping healthcare systems from collapsing.

That definitely turns out to have been false. The models were vastly wrong. We definitely have differential handling of the pandemic, from Sweden, Africa, and some states in the U.S. for example, and those that went all out did not do better than those that didn't.

Many people warned that this was overblown, but also many people greatly enjoyed exercising authority, and others greatly enjoyed a sense of moral virtue ("saving grandma") that was unjustified.


If I behaved like the managers of Google, Meta or Microsoft, not only would I be ashamed of myself; I would not leave my lunch or coffee out of my sight.


Allow me to wire 500k+ in your account every year, and you'll see that your morals are easily bought.


No, fuck that. There are many people I know who have explicitly rejected a payoff to do the right thing.

If your morals go away when they actually matter, you didn't actually have morals, you just had an excuse why you weren't already rich. A person is judged by what they do, explicitly and especially when it actually matters.


> It's only natural I behave similarly.

Do you mean towards everyone? Or do you mean your behaviour back towards those companies?


I hope he means back toward those companies, in which case he's exactly right.

You wouldn't treat a deranged serial killer with respect and courtesy. Why should poorly-behaved amoral corporations be treated as you would treat normal humans?

Corporations should be treated in accordance with their own behavior. The mom-n-pop shop down the street that uses an LLC for legal purposes and treats you like a valued customer? You should treat them with respect and kindness. The evil megacorp that tries to lobby for shitty laws to screw you over? You should screw them over too. (And alternatively, the big corporation that makes good products for decent prices and doesn't seem to be actively trying to blatantly harm society and twist things to their advantage with legal tricks or lobbying? You should treat them respectfully too.)


> YouTube's resources

> Getting full Gb/s

Technically, most of the time the video is served from within the same network (ISP), since YouTube has caches with almost all ISPs and IXes in the world. It might be the largest CDN built to date.

Case in point: the author's YT URLs point to the Canadian ISP Videotron.

https://bgp.he.net/dns/rr1---sn-8qu-t0aee.googlevideo.com
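
You can check this yourself by resolving the edge hostname and seeing who announces the address; a rough sketch (output will vary by region and resolver):

    # resolve the videoplayback edge node from the article's URLs
    dig +short rr1---sn-8qu-t0aee.googlevideo.com
    # then look up which network announces that address (ISP cache vs. Google AS)
    whois $(dig +short rr1---sn-8qu-t0aee.googlevideo.com | head -1)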


Technically you should be able to download a video, and you should not be prevented from doing so, at least to have proof that somebody was spreading hate speech or insulting you or your relatives. Of course you can camcord the screen, but that sounds like a stone-age solution.

Do you know, for example, that you cannot take screenshots of Netflix in your browser on a Mac, and therefore cannot make a meme? That's a fair use that has been taken from you.


What does hate speech have to do with anything? The content has no bearing on the legality or ethics of downloading or bypassing throttling controls.


It's the cost of doing business?


A good read on HN after a very long time, for me.


I have to agree, it's an interesting topic with a bit of "hacking" masala and just very well written. Can't remember the last time I read a full article here.


> The most popular one is yt-dlp (a fork of youtube-dl) programmed in Python, but it includes its own custom JavaScript interpreter to transform the n parameter.

ah i remember this one: https://news.ycombinator.com/item?id=32793061

I confess I still don't really understand why they had to make this, but I'd love to hear the story behind it.


Some videos offer multiple audio tracks for different languages? How have I never come across such videos before, or somehow missed this?


MrBeast's videos have at least a Spanish audio track, which funnily enough NewPipe defaults to (or did the last time I checked), and NewPipe doesn't support changing the track as far as I can tell.


Just checked NewPipe on a MrBeast video.

There is an option now to select the audio track and MrBeast uploads dubs in more than a dozen languages.


When I first noticed this, I thought it was cool that dub-Spiderman[0] migrated to using it right away, since he already went so hard with the MrBeast Spanish and other dub channels. I assume it's preferable to have all of your subscribers on the one channel.

0: Jimmy's voice in the Spanish dub of his channel is the same actor who dubs Spider-Man.


It was launched earlier this year. They say that "thousands" of channels have access to the feature, but who knows when it will be available to all channels.

https://blog.youtube/news-and-events/multi-language-audio-mr...


As a more relevant alternative to "look, I'm rich", a channel you can try this with is Real Engineering. NewPipe released a version just yesterday that supports choosing the audio track.


Chubbyemu videos often have it, and James Hoffman once used it to provide an audio channel with slurping sounds (he's a coffee channel, that's not as dodgy as it sounds) and one without.


Because it's an extremely new feature, and honestly only a few channels can afford to make use of it.


It's also becoming more common now that Youtube will generate its own AI-voice dubs.


I highly recommend that anybody who cares about content on YouTube download the videos they like and maintain local copies. Material is being deleted or hidden faster than ever, and YouTube is only going to get more user-hostile over time.
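
A minimal sketch of what that can look like with yt-dlp, assuming you keep a list of channel/playlist URLs in channels.txt (the archive file lets re-runs skip anything already downloaded):

    # re-runnable archiver: anything already listed in archive.txt is skipped
    yt-dlp \
      --download-archive archive.txt \
      --write-info-json --write-thumbnail \
      --output "%(uploader)s/%(title)s [%(id)s].%(ext)s" \
      --batch-file channels.txt

Stick it in cron and you get incremental local backups of everything new.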


Any guides to learn how to do a similar analysis on other websites?


I assumed, perhaps incorrectly, that the author is a Google engineer.


Which websites?


No, the broader concept of how to dissect the behaviour to figure out the api endpoints called, what was sent to them, what was returned, etc.


mitmproxy, ZAP, Fiddler, and Burp Suite are some tools to start with, plus Wireshark/Postman in case you need them. The rest is mostly your knowledge of JavaScript.
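
A rough sketch of the proxy route, assuming mitmproxy is installed - record everything a client does and inspect the flows afterwards:

    # start a recording proxy on port 8080 and write all flows to a file
    mitmdump -p 8080 -w flows.mitm
    # in another shell, send traffic through it (a browser works too, once it
    # trusts the mitmproxy CA certificate; -k skips verification for brevity)
    curl -x http://127.0.0.1:8080 -k https://www.youtube.com/watch?v=aqz-KE-bpKQ -o /dev/null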


I don't think these tools are particularly suitable for reverse engineering websites; it's much easier to use devtools and userscripts.


Not trying to be a sketchy contrarian, but why would you do this with JavaScript? It just doesn't seem very fit for purpose...


The most interesting part of this (the bypass itself) involves executing a Javascript challenge. It's very convenient to do that from Javascript (the author mentions that Python implementations need to add a Javascript interpreter). Besides that, it's mostly async I/O which: 1) might be annoying in earlier versions of Javascript, but with Promises and async/await, it's very clear and readable (to me) in Javascript; 2) is the exact case (I/O-bound, not CPU-bound) where Node.js performs efficiently.


You should at least say why you don't think Javascript is fit for the trivial task of downloading files.

Especially in an article that already had to unwind and explain Youtube's Javascript code.


I mean you don't actually think this will continue working a week after it's widely shared, right?


yt-dlp is open source, and I'm sure the Google engineers have been aware of it ever since it or its ancestors were released.


Super cool breakdown, gg!


"Have you ever tried to download videos from YouTube? I mean manually without relying on software like youtube-dl, yt-dlp or one of "these" websites. It's much more complicated than you might think."

This reminds me of some sort of fizzbuzz test. This is not complicated at all. There is no need to use the Range header or run Javascript.

The short script below does not download anything because there is no need. It does not use Range headers, it does not run Javascript and it makes only one TCP connection. With the JSON it fetches, one can simply extract the videoplayback URLs and put them in a locally-hosted HTML page with no Javascript.

    #!/bin/sh
    # usage: echo videoId | $0 <-- this will indicate len to use    
    # usage: echo videoId | $0 len | openssl s_client -connect www.youtube.com:443 -ign_eof
    # usage: $0 len < videoId-list | openssl s_client -connect www.youtube.com:443 -ign_eof
    
    (
    while read x;do
    # ignore anything that is not an 11-character videoId
    test ${#x} -eq 11||continue
    # no length argument: measure the JSON template below (minus the $x placeholder),
    # add the 11-char videoId length, and print the Content-Length to pass as $1
    if test $# -ne 1;then len=${#x};x=$(grep -m1 ^\{ $0|sed 's/\$x//'|wc -c);exec echo usage: ${0##*/} $((x+len));fi
    
    # HTTP wants CRLF line endings: append \r to header lines and the blank line
    cr=$(printf '\r');
    sed "/^[a-zA-Z].*: /s/$/$cr/;s/^$/$cr/" << eof
    POST /youtubei/v1/player?key=AIzaSyA8eiZmM1FaDVjRy-df2KTyQ_vz_yYM39w HTTP/1.1
    Host: www.youtube.com
    Content-Type: application/json
    Content-Length: $1
    Connection: keep-alive
    
    {"context": {"client": {"clientName": "IOS", "clientVersion": "17.33.2" }}, "videoId": "$x", "params": "CgIQBg==", "playbackContext": {"contentPlaybackContext": {"html5Preference": "HTML5_PREF_WANTS"}}, "contentCheckOk": true, "racyCheckOk": true}
    eof
    done
    printf '\r\n'
    # final request with Connection: close so the server closes the connection
    # when done, which lets openssl s_client exit
    printf 'GET /robots.txt HTTP/1.0\r\nHost: www.youtube.com\r\nConnection: close\r\n\r\n';
    )
    
For processing the JSON I wrote custom utilities in C that (a) extract videoIds and other useful strings, (b) generate HTTP similar to above, and (c) filter the returned JSON into CSV, SQL or HTML. For me, these run faster than Python and jq and are easier to edit. Using these utilities I can also do full searches that return hundreds to thousands of results and I can easily exclude all "suggested" or "recommended" videos.

CSV output

    1666520150,23 Oct 2022 10:15:50 UTC,22,aqz-KE-bpKQ,"Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film",00:10:35,635,UCSMOQeBJ2RAnuFungnQOxLg,19211597,"Blender"

SQL output

    INSERT INTO t1(ts,utc,itag,vid,title,dur,len,cid,views,author) VALUES(1666520150,'23 Oct 2022 10:15:50 UTC',22,'aqz-KE-bpKQ','Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film','00:10:35',635,'UCSMOQeBJ2RAnuFungnQOxLg',19211597,'Blender') ON CONFLICT(vid) DO UPDATE SET views=excluded.views;

HTML output

Looks just like CSV except vid is a hyperlink
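
For comparison, a rough jq equivalent of the CSV output above (an untested sketch, assuming the player response was saved to player.json and uses the usual videoDetails fields):

    # flatten the player response into one CSV row per video
    jq -r '.videoDetails | [.videoId, .title, .lengthSeconds, .author, .viewCount] | @csv' player.json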


I think your definition of "not complicated at all" differs from most people.


This is a demonstration of W3C Ethical Web Principles 6.11 and 6.12.

https://www.w3.org/TR/ethical-web-principles/


Looks interesting. In the post, the author does

    echo -n '{"videoId":"aqz-KE-bpKQ","context":{"client":{"clientName":"WEB","clientVersion":"2.20230810.05.00"}}}' | 
      http post 'https://www.youtube.com/youtubei/v1/player' |
      jq -r '.streamingData.adaptiveFormats[0].url'
which is very similar to what you do, but runs into an issue of throttling to ~70Kbps. Is the difference just the "key" parameter? Do you get no throttling?


There is no throttling when using the JSON returned by the HTTP request in the shell script or generated by the utilities I wrote.

It's not the key the author is using that matters, it's the post-data.

Moreover, to get throttled videoplayback URLs with the "WEB" key and client info like the author is using, one does not need to make POST requests to /youtubei/v1/player. There are throttled videoplayback URLs in the HTML of the /watch?v= page. For example,

    curl -A "" -40s https://www.youtube.com/watch?v=aqz-KE-bpKQ|grep -o https://rr[^\"]*|sed -n 's/\\u0026/\&/g;/itag=22/p'
It's ironic how the author is claiming this is complicated. That's his own doing.


Interesting. I ran the script to extract the JSON - that part was almost instant - then I used the first `url` field of `streamingData.adaptiveFormats`. I then ran

    curl 'https://...googlevideo.com...' --output video.mp4
For me the download is throttled to "768k"; I assume that's in bits per second and not bytes, which is very low: the random video I tried would take 8 minutes.

on the other hand,

    yt-dlp videoIdHere
does its processing then downloads the whole thing in about 5 seconds.

Does that curl command run much faster for you? Or do you do something else?


Is 768k too slow to watch the video from the URL? If not, then I would not call that "throttled". You don't need 500 MB/s to watch a video. From what I've seen, when people discuss YouTube throttling online they are referring to max speeds of 60-70k. That's too slow to watch the video from the URL. Not too slow to download, though. And that's why this idea that YouTube is "preventing" downloads doesn't make any sense. There are download URLs in every /watch?v= YouTube page. Those are throttled. Max speed 60-70k.

Use this post-data and you should get the same speed as yt-dlp.

    {"context": {"client": {"clientName": "ANDROID", "clientVersion": "17.31.35", "androidSdkVersion": 30 }}, "videoId": "$x", "params": "CgIQBg==", "playbackContext": {"contentPlaybackContext": {"html5Preference": "HTML5_PREF_WANTS"}}, "contentCheckOk": true, "racyCheckOk": true}
I do not use curl, except in HN examples. I generally do not download from YouTube. I use the URLs in the JSON to watch the video.
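
For completeness, a curl version of the same request - just a sketch stitching together the endpoint, post-data and jq filter already shown in this thread:

    # POST the ANDROID-client post-data to the player endpoint and pull out a stream URL
    curl -s 'https://www.youtube.com/youtubei/v1/player' \
      -H 'Content-Type: application/json' \
      -d '{"context": {"client": {"clientName": "ANDROID", "clientVersion": "17.31.35", "androidSdkVersion": 30 }}, "videoId": "aqz-KE-bpKQ", "params": "CgIQBg==", "playbackContext": {"contentPlaybackContext": {"html5Preference": "HTML5_PREF_WANTS"}}, "contentCheckOk": true, "racyCheckOk": true}' \
      | jq -r '.streamingData.adaptiveFormats[0].url'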


I should add that, with respect to the download URLs in the HTML of every /watch?v= page, some will not work at all, namely in the case of heavily commercialised videos, videos using DASH, and some other uncommon cases. But I've always found this to be a minority of the linked YouTube videos one encounters on the web.


iOS and Android clients do not yet have URLs with the "n" parameter. This is why specifying the clientName as "IOS", along with the specific YouTube key, currently yields URLs that remain unthrottled.

However, acquiring this key requires decompiling the mobile application, monitoring requests through a proxy, or relying on values discovered by others. It's not necessarily straightforward.

I do agree that the code is simpler this way.

I also find it interesting that, by default, yt-dlp calls the YouTube API three times, initially as an Android client, then as an iOS client, and finally as a Web client. Depending on the video and certain other parameters, YouTube provides different formats to different clients.
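
If I remember right, yt-dlp also lets you override which client it impersonates via extractor arguments, which is handy for seeing what each client gets offered - something along these lines:

    # list the formats YouTube offers to the iOS client only
    yt-dlp -F --extractor-args "youtube:player_client=ios" "https://www.youtube.com/watch?v=aqz-KE-bpKQ"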


"However, acquiring this key decompiling the mobile application, monitoring requests through a proxy or relying on values discovered by others."

This is again not true. The key is in the HTML of every /watch?v= YouTube page. It's a public key; it's not hidden in any way.

Further, it's possible, up until today at least, to use the "WEB" key with clientName "ANDROID" or "IOS" and receive unthrottled URLs. The key in the shell script is in fact the WEB key. The key for IOS is different.

    curl -40s https://www.youtube.com/watch?v=aqz-KE-bpKQ \
    |grep -o \"INNERTUBE_API_KEY...[^\"]*\"


So make a nice, well-documented blog post for Google engineers to understand and fix this issue?? Whyy


The people who created an entire video streaming platform with these blocks and rate limiting are more than capable of reading open source youtube-dl/yt-dlp and probably have been doing so for years.


There's nothing in there that wasn't obvious to the engineers who implemented the throttling.

1. Solve the challenge the same way the browser would. By actually running the JS code.

2. Segmented downloading. Pretty sure they allow that on purpose so starting or resuming a video feels snappy.
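
Plain byte-range requests are enough to see that behaviour - a toy sketch with curl, where $URL stands in for an already-deciphered videoplayback URL:

    # fetch two consecutive 10 MB byte ranges as separate requests
    curl -r 0-10485759        -o part0 "$URL"
    curl -r 10485760-20971519 -o part1 "$URL"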


> Pretty sure they allow that on purpose so starting or resuming a video feels snappy.

Also for switching to a different bitrate stream seamlessly.


The only real thing they could try is enforcing a per-video/IP speed cap. They probably don't due to false positives - for example, people jumping around a long video also make many "range" requests.

And the other approach is currently in progress - soon logged-in users won't be able to view videos without watching (or at least displaying/downloading) ads, so the logical next step is to nerf anonymous (as in, without a Google account) viewing. No matter how I look at it, YT has the exact same problems (and solutions) as all file-locker sites. The only difference is that YT is not at the mercy of ad companies; it is the ad company (at least Google is). So they might try some more aggressive measures that would normally get a site banned from publishing ads.


There’s plenty they could do. They could flip a switch tomorrow and limit access to only signed-in users, and they could further enable DRM, as pretty much by now the majority of users are already on DRM-handicapped platforms. Any will-be-called "legacy" users would just get limited to 480p.


> They could flip a switch tomorrow and limit access to only signed-in users

A huge number of YouTube's users are literally babies; they don't know how to read or write, so they don't sign in.

I'm also starting to suspect that the trend of YT's algorithm insisting on recommending videos you've already watched is a decision driven by this huge baby user base.


Their parents will sign in when the magic rectangle stops working.


yt-dlp can import cookies to get around the signed-in-user issue. It has to, as many videos in the EU require credit card age verification to access.
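
Something like this, if I recall the options correctly (VIDEO_ID is a placeholder) - either pull cookies straight from a browser profile or use an exported cookies.txt:

    # reuse an existing logged-in browser session for age/region-gated videos
    yt-dlp --cookies-from-browser firefox "https://www.youtube.com/watch?v=VIDEO_ID"
    # or, with a Netscape-format cookies.txt exported by a browser extension
    yt-dlp --cookies cookies.txt "https://www.youtube.com/watch?v=VIDEO_ID"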


And yt-dlp could run the extracted Widevine plugin just as Kodi can today, but ultimately it can't spoof attestation, and that's already in the works as well.


So suddenly thousands (millions?) of users are unable to view content on YouTube?

Sounds pretty bad for ad-revenue....


You need a Google account for living a normal life, or perhaps an Apple account, but that's about the extent of your options. I'm sure Google can do the math on how many principled people would stop watching videos with ads if they had to log in for it. Most people are already logged in because they needed an account at some point anyway, so the vast majority of viewers wouldn't even notice. Kicking off adblock users won't be a financial loss either, so it's only about the group that was logged out (probably a tech-savvy, privacy-aware audience) and didn't use ad blocking (tech-savvy privacy people who watch ads? You can count that group on one hand).

I hate to say this because I'd be affected, but that's the math I would expect them to use. The only counter-argument I can think of, for why they might care to keep freeloaders around, is the network effect.


They already lowered bitrates for non-premium members.


Proof? The news I recently saw, if you're referring to that, actually said they'd offer a higher-bitrate 1080p option for YouTube Premium, not that they'd lock the (then) current bitrate behind YouTube Premium and downgrade everyone else.

Edit: I did some searching. In addition to that, the only relevant news I could find was about Google testing locking the 4K (2160p) resolution behind the Premium paywall, which they ceased doing.


I was speaking from anecdotal evidence because I've been feeling my 1080p streams look worse than before the new option showed up.

But it seems I'm mistaken [1].

[1] https://www.theverge.com/2023/2/23/23612647/youtube-1080p-pr...


They did lower the bitrate back in 2020, IIRC, because there was suddenly so much more YouTube being watched and they didn't want ISPs to be crippled under the weight. I don't know if they've reverted that change since.


That's what I thought too. Don't give away all your tricks to the enemy; even if the information is already available in other places, the extra publicity won't help our side of the war.


>Don't give away all your tricks to the enemy

That's pretty impossible with open-source software. Google engineers aren't clueless; they'll know where to find this information just as well as anyone here.

It's too bad there isn't a legal way of preventing Google engineers from reading some source code.


I’m almost positive that the engineers already know how the throttling is applied and how it could be circumvented.

Downloading the same file used in the official UI and applying/reverse engineering the function is not exactly rocket science.



