FFmpeg is wonderful software. Growing up as a Windows user in the early 2000s, I found devices far pickier than they are today about which video codecs they'd support. It was a non-trivial task as an 11yo trying to convert DivX .avis into an MP4 my old iPod Video could understand. Discovering ffmpeg, and finding that someone was offering for free what I otherwise could only find under mountains of crappy shareware, was a real watershed moment.
Let's write a new video compression algorithm that is super efficient - great
this lets us compress movies so they can fit on cheap CDs instead of DVDs - great
We can now give those CDs away with movies on them - great
And then every time someone puts one in a DivX player they can pay us to watch/rent it, instead of having to drive to blockbuster - wait, what?
It's easy, we'll just use a phone line that everyone has right near their entertainment center in their living room to phone home at night and send the data of what movies you watched and how many times. - what are they smoking?
They are somewhat related... the codec's title styling included a winking smiley emoticon -- "DivX ;-)" -- as a tongue-in-cheek nod to the failed video disc technology.
DIVX the physical disc distribution company was a famous flop.
DivX the video codec started out as an unlicensed hacked version of Microsoft’s MPEG-4 v3 codec binary. Since it wasn’t a commercial product and was legally dubious, the author called it DivX ;-) with the smiley in the name.
When it became unexpectedly popular during the dot-com boom time, someone of course set up a DivX company that dropped the smiley, eventually rewrote the codec, and presumably acquired the trademark from the defunct DIVX (or just took it over if the registration expired, I don’t know).
> and presumably acquired the trademark from the defunct DIVX (or just took it over if the registration expired, I don’t know).
and then, iirc, this is where Xvid came into being. I think it was the same codec, just rewritten and given back to the open-source world, hence the name: "DivX" spelled backwards.
I remember the default media player that shipped on Windows was absolutely terrible because it could only play a very limited number of file formats, none of which were actually used much by movie files found in the wild. If you wanted to actually play a video, you had to try your luck and choose among several third-party "codec packs", half of which were probably loaded with malware.
People who have always lived in a world with great software like VLC and MPV and ffmpeg underestimate how hard it was to actually play a video file on your computer back in 2000.
I've always said ffmpeg is one of the new wonders of the world. It powers so much, is so complex, so irreplaceable. It's crazy we get to enjoy it for free. I use it to encode my movies and tv shows to AV1/720p.
It's also a testament to the power of open source software. ffmpeg is in a significant amount of software touching audio/video in some way, and we are all enriched because everyone is free to use it.
I convert much to 720p PS3 compliant H264, for maximum device compatibility. I take an external drive with these files with me when I travel, and in 99% of the cases plugging it into hotel TVs just works.
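The kind of "maximum compatibility" conversion described above can be sketched like this. The file names are hypothetical and the CRF/bitrate values are my own guesses; PS3-era hardware generally wants H.264 High profile at level 4.1 or below, with AAC audio in an MP4 container:

```shell
# Hypothetical input/output names; values chosen for broad device compatibility.
CMD=(ffmpeg -i movie.mkv
     -vf "scale=-2:720"                       # 720p, width rounded to be even
     -c:v libx264 -profile:v high -level:v 4.1 -crf 21
     -c:a aac -b:a 160k
     -movflags +faststart                     # index up front for streaming/TVs
     movie_720p.mp4)
# Run only when ffmpeg and the input are actually present:
if command -v ffmpeg >/dev/null && [ -f movie.mkv ]; then
  "${CMD[@]}"
fi
```

`-movflags +faststart` is what lets many TVs start playback before reading the whole file.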
I only play these files on my second monitor as background noise when I'm coding boring stuff. Day of The Dead is 696MB, and looks great on my monitor. https://files.catbox.moe/cm88w0.jpg I don't need more for this use case. 577 movies so far, at 466GB! God bless the AV1 team!
He he, I have a secret aversion to movies exceeding 1GB :) I know what you mean! I wonder what video fidelity I would gain moving to something more modern. It's just really convenient being able to playback anywhere.
The preferred AV1 encoder that ships with ffmpeg, SVT-AV1, is faster than any software H.265 encoder now. You can encode 1080p video in real time at preset 6 (the equivalent of 'medium' -- presets are on a scale from 0 to 13) on a $120 desktop CPU.
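A preset-6 SVT-AV1 encode like the one described might look like this. The file names are hypothetical and the CRF value is my own guess, not a recommendation from the thread:

```shell
# Sketch of a 1080p-capable SVT-AV1 encode at preset 6 (the 'medium' equivalent).
CMD=(ffmpeg -i input.mkv
     -c:v libsvtav1 -preset 6 -crf 32
     -c:a libopus -b:a 128k
     output.mkv)
if command -v ffmpeg >/dev/null && [ -f input.mkv ]; then
  "${CMD[@]}"
fi
```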
We had an old software at work that would only take a specific file format, and of course the files came from multiple sources, which made it really painful for the users.
We transformed a relatively decent desktop into a ffmpeg transcoding machine, which would monitor files incoming from a samba share and it would output the converted file into another samba share.
It was just a bunch of scripts and cron jobs but it worked much better than I anticipated and it was mostly maintenance-free.
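A watch-folder transcoder like that can be a very small script. This is only a sketch under my own assumptions: directory paths, the .avi-to-.mp4 conversion, and the cron schedule are all made up, since the comment doesn't give specifics:

```shell
# Hypothetical share paths; the real setup monitored samba shares.
IN_DIR="${IN_DIR:-/srv/share/incoming}"
OUT_DIR="${OUT_DIR:-/srv/share/converted}"

transcode_new_files() {
  for src in "$IN_DIR"/*.avi; do
    [ -e "$src" ] || continue                  # glob matched nothing
    dst="$OUT_DIR/$(basename "${src%.avi}").mp4"
    [ -e "$dst" ] && continue                  # already converted
    # -nostdin stops ffmpeg from eating the loop's stdin when run from cron
    ffmpeg -nostdin -i "$src" -c:v libx264 -c:a aac "$dst" && rm -- "$src"
  done
}

# Run from cron, e.g.:  */5 * * * * /usr/local/bin/transcode_new_files.sh
```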
I remember back in 2007, during a faraway vacation, using ffmpeg on a Windows netbook to convert Star Trek episodes into a format my little mp3 player could understand and play on its little screen (320x240?). It was amazing that even on the 600-900 MHz CPU, those videos transcoded in a matter of minutes.
I am always surprised how much it's taken over video processing tasks (and maybe how Apple lost the plot with Quicktime and such).
Would def be interesting if someone could write up a history of the project so far. I wonder how much industry input there is in the OSS commits (like MS/IBM into Linux, Postgres, etc.)
The greatest addition to FFmpeg in the recent past was the addition of large language models translating my "ffmpeg command to mix audio file onto video file" into actually executable FFmpeg commands.
Being cheeky of course here. FFmpeg is great. An AI assistant was what I needed to execute my ~12 FFmpeg commands per year though, with ease and speed.
ChatGPT / LLMs really are great with ffmpeg. Not too long ago I wrote a blog post outlining an ffmpeg command I needed (mostly for my own future reference). It almost feels quaint now, given that you can get the same kind of explanation from GPT. I bet if I pasted the command in my article into ChatGPT I would get an explainer that's basically identical to what I wrote.
Seems risky to start running commands on your machine that a hallucinating AI spit out at you without understanding them, but I guess they can at least let you know what to look up so you can double check what the command will do.
It really depends on what you're doing right? Any command line work that I use an LLM to help with is non-destructive. I'll use jq, ffmpeg, etc., but I don't update the initial source.
it is, but typing "no, `--do-x` is not a valid option, please only give me real commands" back into a chat window is a hell of a lot less frustrating than sifting through docs that may or may not even exist.
And finally, after some more tests it turned out that some videos have almost 2x bigger .webm variants than .mp4 so I had to extend my yt-dlp config like so (only showing the prioritization options here, my previous full config is still there upthread):
...OK, you made me double-check. Used this instead:
--format 'bestvideo[vcodec^=av01]+bestaudio/bestvideo*+bestaudio/best'
And it started downloading `.webm` streams and not `.mp4` (only one video had an AV1 stream). Tested with several videos, quality is the same and some files are actually 40% smaller.
...Whoa. Thanks for making me double-check!
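If I'm reading yt-dlp's format-sorting docs right, the same preference can also be expressed with `-S` instead of a format filter chain (the URL here is a placeholder):

```shell
# Prefer AV1, then sort by resolution; falls back automatically when no AV1
# stream exists.
CMD=(yt-dlp -S "vcodec:av01,res" "https://example.com/watch?v=PLACEHOLDER")
# To actually download: "${CMD[@]}"
```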
Somebody down-voted you, I seem to recall seeing earlier, sorry about that. Your comment was valuable, though I'd definitely use a more productive tone next time.
I am half-aware of the licensing troubles of various video formats but I wish those companies luck going after each and every home video user. ¯\_(ツ)_/¯
Natural language syntaxes universally suck. A faux-English syntax isn't easier to use if you don't know which English in particular will be accepted. A complex CLI interface fundamentally can't be easy to use. A GUI can fix that by making things discoverable and by integrating the documentation into the UI, but the ffmpeg devs presumably see that as someone else's job (and there have been people to step up).
1. ffmpeg exposes all of its options through the CLI, and there are a lot of options. So it's probably always going to be completely undiscoverable. It really needs a GUI to be usable, but that's a project in itself (I guess the project is Handbrake).
2. They probably didn't put a lot of work into the UX of the CLI since it's an open source project.
>So it's probably always going to be completely undiscoverable. It really needs a GUI to be usable, but that's a project in itself (I guess the project is Handbrake).
I don't buy this. A TUI is entirely possible: LazyGit, htop, and countless other tools show you can get pane-based UIs going in the terminal. The FFmpeg team has simply never made such a thing a priority.
But even without a TUI, the most basic use-cases are well-known after 15+ years of existence; simple prompt-based wizards, like "git add -p", should be offered. Again, this is a matter of priority rather than intractability.
A year ago or so, someone posted a subscription service they made for composing the pipeline graph visually.
There's a low-hanging fruit that I think would make ffmpeg more helpful for regular people.
There's a million terrible websites that offer file conversion services. They're ad-ridden, with god-knows-what privacy/security postures. There's little reason for users to need to upload their files to a third-party when they can do it locally. But getting them to download fiddly technical software is tough - and they're right to mistrust it.
So, there's a WASM version of ffmpeg, already working and hosted at Netlify [1]. It downloads the WASM bundle to your browser and you can run conversions/transformations as you wish, in your browser. Sandboxed and pretty performant too!
If this tool a) was updated regularly b) had a nicer, non-CLI UI for everyday users and c) was available at an easily-Googlable domain name - it would solve all the problems I mentioned above.
Browsers are annoying. They constrict the designer, they constrict the user, they're overly complicated, slow, bloated... I don't know why people keep pushing them to do things they are bad at.
I wish 20 years ago we'd made a concerted effort to make Java suck less. We'd have the universal applications everyone wants but nobody wants to put effort into. But the web was new-ish, and people didn't realize that hypertext document viewers would become an entire application platform and mini OS.
What I'd really like to see is something like Flatpak, but for all platforms. Basically it would be containerized GUI apps, but one repository per app that serves all platforms. On Android, macOS, Windows, etc., you would run "flatpak add https://some/repository/ my-app && flatpak pull my-app && flatpak run my-app" (but in a GUI, like an App Store you control). That would pull the image for your platform and run it.

Since it's containerized, you get all the dependencies, it's multi-arch, and you control how it executes in a sandbox. You could use the same programming language per platform, or different languages; same widgets, different widgets; it wouldn't matter, because each platform just downloads and runs an image built for it. This wouldn't stop us from having/making "a better Java", but it would make it easier to support all platforms, distribute applications securely, update them, run them in a sandbox, etc.

Imagine being able to ship a single app to Windows and iOS users that's just a shell script and 'xdialog'. Or if you prefer, a single Go app. Or a sprawling Python or Node.js app. Whatever you want. The user gets a single way to install and run an app on any platform, and developers can support multiple platforms any way they want. No more "how do I develop for iOS vs Windows"; just write your app and push your container.
This proposal is to offer competition to web-based conversion websites. If users are willing and able to find Handbrake and download it, it can work for them. But everyday users are right to distrust software downloaded from the Internet.
Many users are in environments where it's not possible to download new software (schools, workplaces, universities).
The browser has its disadvantages, but it is the most widely-deployed sandboxed execution environment providing incredibly easy distribution of software.
> But everyday users are right to distrust software downloaded from the Internet.
They're right to distrust apps that run in their browsers too, but that hasn't stopped anybody. These days everyone is scared to death of an .exe but will happily execute whatever random code a stranger on the internet comes up with if they only have to click on a link to run it on their devices. Warnings that WASM is a malware author's dream weren't enough (https://www.crowdstrike.com/blog/ecriminals-increasingly-use...) and browser sandbox escapes happen all the time but nobody seems to care. I can't even just pick on WASM, JS isn't much better and even CSS/HTML alone is getting complex enough that it can be used maliciously.
Using it for obfuscation doesn't really change anything. A coder that wants to obfuscate things could already run their own interpreter and/or use asm.js.
I'm quite sure that the reason conversion websites are popular is Google. I don't know if they're just very good at SEO or search engines have a specific policy to favor web-based solution.
If you search for "mov to mp4", Handbrake is NOWHERE to see. The 10th result for me is a Cloudflare article explaining what's the difference between mov and mp4. The ~20th result is a book called Business Funding For Dummies (no shit). Handbrake is after these. The legend says I'm still scrolling trying to find where it is.
How is an average user supposed to know Handbrake or this FFmpeg WASM site?
HandBrake suffers from the same problems as FFmpeg to a lesser extent. I have no idea what options I should be selecting to get the max quality, smallest file size for a conversion.
The presets are useful but when I'm converting an old WMV or some other ancient format I want to know that I'm not leaving anything behind.
If Java had won, we'd be complaining about Java instead of web technologies. It doesn't ultimately matter, when you have a platform as large as the web is, it's going to be complicated and bloated.
"Regular" people don't really need FFMPEG. Regular people need tools with GUIs that have a non-generic purpose. So stuff like https://kdenlive.org/en/ that are backed by ffmpeg are (imo) superior "regular" person tools.
FFMPEG isn't complicated (its as complicated as any other CLI tool), it's that video encoding/decoding specifically is a hard problem space that you have to explicitly learn to better understand what ffmpeg can do. I think if someone spent an hour learning about video codecs, bitrates, and container formats, they would immediately feel "better" at ffmpeg despite not learning more about the tool itself.
I mean, we have, what, 15+ years of StackOverflow posts to tell us what the common use cases are, e.g. "how do I make a GIF from this short screen cap?"; surely FFmpeg could offer prompt-based wizards for that kind of low hanging fruit.
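For instance, the screen-cap-to-GIF case that shows up constantly on StackOverflow boils down to the well-known two-pass palette trick; a wizard would only need to ask for the input file and a size. The file names here are hypothetical:

```shell
# High-quality GIF from a short capture: generate an optimal palette, then
# apply it, all in one command.
CMD=(ffmpeg -i screencap.mp4
     -filter_complex "[0:v]fps=12,scale=480:-1:flags=lanczos,split[a][b];[a]palettegen[p];[b][p]paletteuse"
     out.gif)
if command -v ffmpeg >/dev/null && [ -f screencap.mp4 ]; then
  "${CMD[@]}"
fi
```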
Why should it, though? It's like saying that CPUs should offer high-level OOP primitives. It's just the wrong layer of abstraction. Let FFmpeg focus on offering the features, and let others build on that to abstract it all away into easy-to-use workflows.
> "If this tool a) was updated regularly b) had a nicer, non-CLI UI for everyday users and c) was available at an easily-Googlable domain name - it would solve all the problems I mentioned above."
No matter how nice you make it, it will probably still lose the SEO battle against the shitty ad-laden sites fighting to win top place for Google searches of "convert X to mp3"/etc.
That's probably true. In my ideal world, the OSS community comes together in a suite of offline-first, in-browser alternatives to common user needs. We need a site for document conversion (using WASM Pandoc?), PDF merging, image conversion, text utils (lower case, spell check). All these would live on one trusted site. Hopefully users can discover the site and be done searching, and spread its name by word of mouth.
I had to merge a bunch of PDFs for a rental application recently and it was painful. Having to upload very sensitive docs, every site being a funnel to their paid version, etc.
In the past I've used ghostscript to merge PDFs, but it's not user friendly of course. I do like your idea of one site hosting these sort of FOSS utilities with nice wrappers.
How is performance? Right now I use Handbrake to do video optimization. However, I'm not a video expert and all the options are pretty daunting to me. I don't know how to get the right mix of optimization without having the video encoding take forever. I would guess running the conversion in browser would just make everything slower. I wish there was something simpler for video, like ImageOptim. Just drag the video in and it compresses it using the best options based on compatibility needs.
"One of HandBrake’s strengths is its ability to open a wide variety of video formats. HandBrake uses FFmpeg under the hood and generally can open whatever FFmpeg will, in addition to disc-based formats like DVD and Blu-ray."
The options Handbrake exposes are essentially the ffmpeg flags. The built in presets in Handbrake are generally pretty sensible IMO and I've rarely had to deviate.
If it's 2-3x of native code, that's plenty good enough. Everyone uses Handbrake in an async mode - just set up the conversion and let it run overnight.
Of course, a website would be better for smaller conversion jobs (in case you have browser restarts or whatnot). Desktop apps can block computer restarts to a greater degree than websites.
I don't think users can distinguish a "local" website from a public one. So this software should just come with the OS as a "movie maker" package. As mentioned, the Handbrake UI is a decent candidate.
I think OSes are generally done with bundling helpful software, unless it's a funnel to a paid version. The security risk is too high, and so is the legal risk of anti-monopoly action ("why are you killing independent movie editor businesses?").
So I'm trying to build ffmpeg via vcpkg today, and it turned out multiple of its dependencies are transitively depending on liblzma, but the downloading of liblzma source has been disabled by GitHub in light of the recent xz backdoor.
Lasse Collin pulled everything back to a standalone git repo, with all of the compromised code removed: https://git.tukaani.org/, which is now, I presume, the official distribution source for XZ.
Blocking downloads of liblzma seems to me to be an ill-advised decision. Now that the mechanism is known, the dangers are limited, but the educational value of being able to study what has been done is real.
While the dangers are limited, they certainly aren't zero. Even if the original attacker(s) have entirely gone to ground, others may be scanning for hosts that managed to get compromised by following the bleeding edge, and more could get compromised if downloads from primary sources are kept open.
Keeping the affected code visible somewhere could be useful for research purposes, but you don't want it where people or automations might unwittingly use it. If the official sources were the only place this could be found, then it might be reasonable to expect them to put up a side copy for this reason, but given how many forks and other copies there will be out there, I don't think this is necessary; they are better off working on removing known compromises (and attempting to verify there are no others that were slipped in) to return things to a good state.
Maybe someone needs a year to audit the history and find all the other backdoors. Who's going to work on it for a year for free or without being in on it, I don't know.
right now I'm sure it's a temporary measure to limit the downloading of sources.
but I really worry that this will become normalized: after every exposed hack, withdraw source availability for a little bit afterwards, just while 'they' check for other attacks or whatever.
later on, it'll take longer and longer to put the source back up. but let's hope this is merely my overactive paranoia, everything will be fine, and open source is still ok.
The obvious solution seems to be adding an extra hurdle, where it warns you the source may be compromised, so you can still get it, but aren't going to just grab it without knowing something happened.
There is value in making sure (potentially) compromised code doesn't just get used normally, but I agree that shouldn't mean totally blocking access to it in most cases.
I use it — because I am writing a rust program, and want to use ffmpeg functionality.
What’s the alternative? I could wrap the C API, and then try to make a nice rust interface from that, but then that’s exactly what this package does, so I don’t want to repeat the work.
I often just exec ffmpeg from whatever language I'm using (as a command line thing). Not very ergonomic, but the nice thing is that it's 1:1 with all examples and other uses of ffmpeg. But I guess it depends on how deep into ffmpeg you're doing stuff. Mine is mostly to point it at doing something non advanced with a file and that's it.
Basically no one rewrites FFmpeg in recent years, in any language, at least not in the open source scene (and judging from the known usage of FFmpeg in world’s premier providers of multimedia content, probably not in the commercial scene either). It’s both too good and too daunting.
Bah, don't give them ideas!
Honestly, codecs are a worrying target for supply chain attacks because they're complex and use a lot of memory-unsafe code. Just look at all the image format attacks throughout history (a memorable recent one being the libwebp vulnerability.)
I’m talking about the underlying libav*. There are plenty of frontends in all sorts of languages, although ffmpeg(1) itself is obviously the most versatile. Also, CLI UX is highly subjective (hence all the different frontends by people with differing opinions), I personally find it more than acceptable for the immense complexity it encapsulates.
I'd say the fact that I can pass in multiple input sources, apply a complex chain of filters, set detailed rendering options, set output parameters, and have it all work flawlessly, and do all of the above in a single command line with consistent syntax, makes "too good" a very accurate description of the FFMpeg CLI.
I used this wrapper to implement an opening and ending detection tool for “fun” [1].
However, it seems that many programs opt to instead shell out to the ffmpeg CLI. I think it’s usually simpler than linking against the library and to avoid licensing issues. But there are some cases where the CLI doesn’t cut it.
I have been using the xstack filter for several years now.
What I do is take several diverse short video segments, like 100, concatenate them into 4 segments (example 23+24+26+27 since they have diverse lengths) and then xstack them into a 2-by-2 mosaic video.
Before, I was doing it in a single stage, but now, after some advice, I do it in 5 stages: 4 concatenate stages and 1 xstack stage.
I have not profiled/timed it to see which is faster, but it works pretty well, although I often get a lot of different weird warnings.
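The five stages can be sketched as below. All file names are hypothetical, since the comment doesn't share the actual commands:

```shell
# Stages 1-4: concatenate each group of clips from a concat-demuxer list file,
# repeated for group2..group4:
#   ffmpeg -f concat -safe 0 -i group1.txt -c copy seg1.mp4
# Stage 5: stack the four concatenated segments into a 2x2 mosaic.
CMD=(ffmpeg -i seg1.mp4 -i seg2.mp4 -i seg3.mp4 -i seg4.mp4
     -filter_complex "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]"
     -map "[v]" -an mosaic.mp4)
if command -v ffmpeg >/dev/null && [ -f seg1.mp4 ]; then
  "${CMD[@]}"
fi
```

The `layout` string places each input at (x,y) offsets expressed in terms of the other inputs' widths/heights, which is why `w0_h0` means "right of input 0, below input 0".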
> What I do is take several diverse short video segments, like 100, concatenate them into 4 segments (example 23+24+26+27 since they have diverse lengths) and then xstack them into a 2-by-2 mosaic video.
Just out of curiosity.. what use do you have for a 2-by-2 mosaic video?
Also, it seems there is currently one in progress to drop the "6" qualifier on the ffmpeg binaries <https://github.com/macports/macports-ports/pull/23315/files> so it'll be fascinating to see if any new ffmpeg7 then subsequently puts the "7" back, beginning the cycle again
ffmpeg updates its API very liberally, even in minor version bumps. Combine that with the many packages depending on ffmpeg, and it's not easy to always have the latest ffmpeg version. They are working on it though: https://trac.macports.org/ticket/65623
(I'm not a MacPorts maintainer, but I've been burnt by ffmpeg API changes a couple times myself before).
I got frustrated having to install all of the runtime dependencies and just wanted an easy way to install the statically-linked version, so here it is.
ffmpeg is such a joy to use, once you make it over the very steep learning curve.
I'm making some youtube videos where I play through Demon's Souls flipping a coin to decide to equip items or not, and I wanted to have an onscreen coin flip animation and sound effect. With some effort, I created a transparent set of frames for the animation. Then with ffmpeg's filter_complex I was able to add the image sequence as a video stream, overlay it over the original video, and add a sound effect. That's on top of the existing subtitles, audio channel merging, and video resizing/compression. All in a single (long!) ffmpeg cli command.
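A condensed sketch of that kind of command might look like the following. Every name here is made up, and the timing/position values are illustrative only: overlay a transparent PNG sequence in a corner for one second, and mix a sound effect over the game audio.

```shell
CMD=(ffmpeg -i gameplay.mp4
     -framerate 30 -i coin_%03d.png          # transparent animation frames
     -i coin.wav                             # the sound effect
     -filter_complex "[0:v][1:v]overlay=W-w-20:20:enable='between(t,5,6)'[v];[0:a][2:a]amix=inputs=2:duration=first[a]"
     -map "[v]" -map "[a]" -c:v libx264 out.mp4)
if command -v ffmpeg >/dev/null && [ -f gameplay.mp4 ]; then
  "${CMD[@]}"
fi
```

The real command in the comment also handled subtitles, channel merging, and resizing; those would just be more filters and `-map` options in the same invocation.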
and every other non-sadist would've done this in 2 seconds in <insert 500+ FOSS video editors>. not sure how wrangling a byzantine CLI is a joy to use.
Surprised even MPEG-5 EVC made it. Unfortunately the VVC decoder didn't quite make it (edit: officially). I guess we will have to wait until version 7.1. Still waiting for x266.
The built-in VVC decoder is dreadfully slow (a ton of optimizations are missing), VVdec is at least 2-3 times faster on anything having AVX2/SSE4.
If you really want to give VVC a try, better stay with version 6.1.1 as it's the last one which has patches for enabling VVdec. You won't be able to apply them to version 7.0/git master:
Thanks, yes I meant officially. I was hoping we could set the stage for VVC a little earlier. I know VVC is not popular on HN or literally anywhere on the Internet, but I do hope to see it moving forward instead of something like MPEG-5 EVC, which is somewhat dead in the water.
I don't know that having so many codecs is a good thing unless they really add something. How does it compare to av1 (which I was under the impression is coming to be the natural successor of hevc, with hardware support)?
Compared to AV1, VVC / H.266 is expected to offer a 20-30% reduction in bitrate at similar quality and a similar level of computational complexity. It is already deployed and used in the real world in China and India. I believe Brazil is looking to use it as its next-generation broadcasting codec, along with LCEVC.
ffmpeg is, mainly, a bunch of libraries. libavcodec, libavformat, libswresample, etc, is almost all of ffmpeg. If a project is using those libraries, it's using ffmpeg.
The ffmpeg command line utility is "just" an interface to those libraries.
ffmpeg is a lot more than just a wrapper on libraries. It can do a lot of filtering and rewiring of audio channels, video channels and subtitles.
Handbrake doesn't do any of that. You can't even drag a bunch of audio files onto Handbrake, because Handbrake doesn't do audio, while ffmpeg is great for encoding audio.
What I'm saying is that ffmpeg is the libraries. When you use the ffmpeg CLI to filter and rewrite audio, the ffmpeg command is just decoding it using a decoder in libavcodec, filtering using a filter from libavfilter, and encoding again using an encoder from libavcodec. The ffmpeg CLI is "just" an interface for the libraries.
The fact that Handbrake doesn't expose the same features as the ffmpeg CLI tool is frankly irrelevant.
The fact that Handbrake doesn't expose the same features as the ffmpeg CLI tool is frankly irrelevant.
It's not just relevant, it's the whole thing. They asked for an ffmpeg GUI and someone recommended a GUI that doesn't use ffmpeg and doesn't do what ffmpeg can do. ffmpeg can not only rewire channels, it can stream video, capture video from the screen, capture video from a tv tuner, overlay text etc.
Also the libraries you listed are part of the ffmpeg project. They come from ffmpeg.
to be fair, I have never seen a GUI that could do everything its complex command-line equivalent could do where it wasn't, in the end, just simpler and easier to use the command line in the first place. so when people ask for "a GUI", I think a lot more information is needed about what it should be able to do.
Never argued otherwise. Specifically, the thing I'm saying is wrong is:
> someone recommended a GUI that doesn't use ffmpeg
Handbrake uses ffmpeg.
But I recognize your username. I don't remember from where but I remember reading or having a conversation with you which went nowhere. I think I'm done.
The thing I have a problem with, specifically, is your statement that Handbrake doesn't use ffmpeg. That statement is incorrect.
I have no problem with the statement "Handbrake doesn't make a good ffmpeg GUI, because it only exposes a small part of what ffmpeg can do". That part is totally 100% fine.
And I have a problem with the response "ffmpeg is a lot more than just a wrapper around the libraries" (https://news.ycombinator.com/item?id=39941964), given the fact that ffmpeg project is the libraries and that the ffmpeg CLI tool is just an interface to them. "ffmpeg is more than a wrapper around the libraries" is strictly speaking true (because ffmpeg is both the libraries themselves and the "wrapper" ffmpeg command line tool), but it doesn't make sense as a response in context.
Do you understand? Or do I need to break it down further?
A wrapper would be some sort of straight pass through and ffmpeg does more than that. That's how you can create big commands that rewire channels, use audio from a different source, subtitles from a specific language, filter, overlay text then stream it all out. That's not trivial, but I don't know why you're getting upset over it.
Because HandBrake uses some parts of the FFmpeg libraries, but HandBrake scope is much smaller than FFmpeg, and while it uses some parts, it's definitely not FFmpeg CLI GUI.
Whenever I search for ffmpeg commands, there is always some person suggesting an 8 liner with 25 arguments, and next to that someone suggesting a command with two -i and one -o argument. On visual inspection both do the same, but I’m always left with the feeling that I did something wrong or just “got lucky” with the shorter command.
I would love a “unless you’re a pro with hyper specific needs, forget these 90% of arguments and only use this 10% in this way” type of guide.
That’s tame for FFmpeg, likely just specifying a bunch of encoder parameters the simpler command left out for defaults, maybe with some input/output streams explicitly spelled out. If you want to look at really incomprehensible FFmpeg commands, try anything with filtergraphs.
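A concrete way to see this: the short and long forms below do (roughly) the same thing, because the short one leans on defaults the long one spells out. The values shown are libx264's documented defaults, but treat the "long" version as an illustration rather than an exhaustive expansion:

```shell
# Minimal: container and codecs inferred from the output extension.
SHORT=(ffmpeg -i in.mov out.mp4)
# Explicit: roughly what the minimal command ends up doing for MP4 output.
LONG=(ffmpeg -i in.mov -c:v libx264 -crf 23 -preset medium -c:a aac out.mp4)
```

So when two answers look wildly different but produce the same result, the longer one is often just restating defaults.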
My experience is that ChatGPT is dreadful at everything but the simplest ffmpeg invocations, and will often produce command lines with subtle quirks (such as "only works if the input is an even number of pixels wide").
But then again, my experience is that ChatGPT is dreadful at everything but the simplest anything.
My experience is similar, but I find it can still be useful if you go step by step, check its work and explain errors and corrections as you go. Admittedly when I describe it like that it doesn't seem very useful, but if I'm figuring things out by myself then most of that work is a given anyway and the bot helps that process along.
Sometimes... but most often, after I explain in detail an error it did, it will just say "I apologize for the mistake, you are correct, <re-iteration of my explanation of the error>. Here is a version with the error fixed:", followed by either the exact same output or an equally wrong alternate output.
The fact that these glorified Markov chains manage to fool people into thinking they possess some kind of actual intelligence or ability to reason baffles me.
That still feels like a chore, opening some text file, copying the right command, opening a terminal, messing around with input and output file paths..
I don't like interacting with command line parameters in general, it feels clunky to me, but I don't think there's a point in arguing about it since it is more of a personal preference
But all of them have parameters. And depend on each other. And the order of filters matters. There is no way to make a simple interface for all ffmpeg can do.
I don't know if I agree with that. The official docs list a lot, but often you get something like "compression_level takes an integer from 1-8" without telling you whether 1 means smaller output or 8 means work the hardest.
However the wiki pages can be quite good if they cover your use case and the real strength is so many examples online. Even if many of the example command lines feel like they have been cargo-culted through the years and no one actually understands what exactly they do.
No, that's bad advice imho. ChatGPT (and Claude etc.) are all pretty damned good at this. Strongly recommend using them over reading the docs or asking other people if you're getting started.
So this is a good use case for it: you will most likely get immediate feedback if it's wrong, and a bit delayed feedback if it achieved the wrong thing; and you can prod it to try harder. LLMs are best used when you can easily verify the results.
use ffprobe on the file first then paste that output and what you want done to it. Helps if there’s something about the file/codec that GPT would have to work around. Had better success that way.
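A compact way to dump that per-file information (assuming a generic input filename) is:

```shell
# Print the container and per-stream details (codecs, pixel format,
# dimensions, duration) as JSON, suppressing everything but errors.
ffprobe -v error -show_format -show_streams -of json input.mp4
```

Pasting that JSON alongside the request removes most of the guesswork about codecs and pixel formats.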
Agreed with other posters: CLI is the best way to go…
ChatGPT can help with learning a lot now but the mailing lists are incredible sources of kind and wonderful (and incredibly knowledgeable) people… go there!
Handbrake and Permute are super, as mentioned... I've put down a couple to add to the list.
There isn’t one. Handbrake is very weird about interpreting what you want, you’ll find yourself having to double-check dimensions and such every time, and the queue is very fiddly. On the Mac version at least.
Don’t think anything exists like XLD for ffmpeg video, where you can just drop a file in, set the quality and codec, and get the same-dimensioned file out every time.
I said "so many little things", not "this one specific little thing". Other things: I frequently get packages that offer themselves to be upgraded but the upgrade mysteriously fails every time I try it. No reliable way last time I checked to tell it where to install packages. It'll happily detect packages that have been installed outside winget (good!) but I have experienced cases where, when attempting to upgrade one of those packages through winget, it instead creates a second installation of it (horribly bad!!!)
Also, it's not like apt: all it really does is download and run an MSI with a bit of fancy glue around it (which is why it can break so badly). You could probably install the latest version of this by manually pointing it at the right URL, so it's more annoying for it to be substantially behind.
Though in retrospect I'm not sure why I interpreted this as being "substantially behind"! It was more that someone made a vague winget lament and it struck me as an opportunity to express my evidenced belief that it is terrible :)
First off, ffmpeg is amazing; I'm very thankful to everyone involved in it.
> dnn filter libtorch backend
What's ffmpeg's plan regarding ML-based filters? Looking through the filter documentation, filters seem to use three different backends: tensorflow, torch, and openvino. That doesn't seem optimal; is there any discussion about consolidating on one backend?
ML filters need model files, and the filters take a path to a model file as one of their arguments. This makes them really difficult to use, if you're lucky you can find a suitable model and download somewhere, otherwise you need to find a separate model training project and dataset and run that first. Are there any plans on streamlining ML filters and model handling for ffmpeg? Maybe a model file repository with an option of installing these in an official models path on the system?
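As an illustration of the model-path problem, here is roughly what invoking the sr (super-resolution) filter looks like today (the model filename is a placeholder; you have to find or train the model yourself):

```shell
# Hypothetical invocation of the sr super-resolution filter: the model file
# (here espcn.pb, a separately obtained TensorFlow graph) is passed by path,
# which is exactly the usability problem described above. Requires an ffmpeg
# build with the tensorflow DNN backend enabled.
ffmpeg -i input.mp4 -vf "sr=dnn_backend=tensorflow:model=espcn.pb" out.mp4
```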
Most image and video research use ML now, but I don't get the impression that ffmpeg tries to integrate the modern technologies well yet. Being able to do for instance spatial and temporal super resolution using standard ffmpeg filters would be a big improvement, and I think things like automatic subtitles using whisper would be a good fit too. But it should start with a coherent ML strategy regarding inference backend and model management.
I think I read about this a few months ago but don't remember the details. What exactly does this do? Does it result in faster encoding/decoding if you have multiple filter graphs (for example a single cmd line that transcodes to new audio, extracts image, creates a low res)
Loopback decoders are a nice concept. So could I use this to create a single ffmpeg command to extract images periodically (say 1/s) and then merge them into a horizontal strip (using the loopback decoder for this part)?
You don't need a loopback decoder for that. The periodic extraction will depend on a filter, and you can just clone and send the output of that filter to the tiling filter.
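For the simple case where the grid size is known in advance, the fps and tile filters alone can do it (filenames and clip length are assumptions):

```shell
# Grab one frame per second and tile them into a single horizontal strip.
# tile needs the grid geometry up front (10x1 here, for a ~10 s clip);
# -frames:v 1 writes just the one assembled strip image.
ffmpeg -i input.mp4 -vf "fps=1,tile=10x1" -frames:v 1 strip.png
```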
Had to go to ChatGPT for help. It appears that you need to know how many tiles to stitch. I was hoping to have that dynamically determined. Not sure if loopback will help.
I wonder if this also means that Chrome and Edge will be able to use this acceleration for their ffmpeg backend (instead of relying on MediaFoundation)?
Moving to C11 is bad, really bad. This is a dangerous road to follow; don't trust ISO on that matter, which is literally doing planned obsolescence on 5-10 year cycles with language feature creep. C has to be simplified and then move toward eternal stability, not the other way around. I suspect some toxic/scammy people got in (or were brainwashed).
I think I did update my code with the new channel layout API, but that was at least a year ago. There is another API that is supposed to change, the seeking API, but I wonder if it is now stable enough to be used.
20 years later it's still a go-to. Great tool.