Ffmpeg-Python: Python bindings for FFmpeg – with complex filtering support (github.com/kkroening)
220 points by brudgers on Dec 29, 2019 | 29 comments



FYI, the term "binding" is slightly misleading here. This isn't a binding to the FFmpeg libraries; it's just an API frontend that invokes the `ffmpeg` command.

For actual Python bindings, check out PyAV [0].

[0] https://github.com/mikeboers/PyAV
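
For a sense of the difference, decoding frames in-process with PyAV looks roughly like this (from memory; check the PyAV docs for the exact API):

    import av  # PyAV: bindings to the FFmpeg libraries themselves

    container = av.open('input.mp4')
    for frame in container.decode(video=0):
        # frames are decoded in-process; no ffmpeg subprocess involved
        img = frame.to_image()  # PIL image; frame.to_ndarray() also works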


Really nice.

> The corresponding command-line arguments are pretty gnarly:

    ffmpeg -i input.mp4 -i overlay.png -filter_complex "[0]trim=start_frame=10:end_frame=20[v0];\
    [0]trim=start_frame=30:end_frame=40[v1];[v0][v1]concat=n=2[v2];[1]hflip[v3];\
    [v2][v3]overlay=eof_action=repeat[v4];[v4]drawbox=50:50:120:120:red:t=5[v5]"\
    -map [v5] output.mp4
I have actually written scripts that would generate such horrors but north of 2000 characters :].

> If you're like me and find Python to be powerful and readable, it's easier with ffmpeg-python:

    import ffmpeg

    in_file = ffmpeg.input('input.mp4')
    overlay_file = ffmpeg.input('overlay.png')
    (
        ffmpeg
        .concat(
            in_file.trim(start_frame=10, end_frame=20),
            in_file.trim(start_frame=30, end_frame=40),
        )
        .overlay(overlay_file.hflip())
        .drawbox(50, 50, 120, 120, color='red', thickness=5)
        .output('out.mp4')
        .run()
    )


I'm not surprised by a script north of 2000 characters for this. Check out Fred's ImageMagick scripts (http://www.fmwconcepts.com/imagemagick/index.php). They follow a similar concept: long shell scripts that generate an ImageMagick command.


FFmpeg is an incredible tool. I recently did a fun project where I had a long instructional video file and wanted to create a bookmark file for it based on the individual lessons. I used ffmpeg's scene detection filter to find the section breaks in the file and capture screenshots of them. Then I used Tesseract (OCR software) on the resulting screenshots to get the lesson titles. The entire workflow was nicely glued together and automated with Python.
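
From memory, the core of it was something like this (the 0.4 scene threshold and the filenames here are just illustrative):

    import glob
    import subprocess

    # Grab a screenshot at every detected scene change
    subprocess.run([
        'ffmpeg', '-i', 'lessons.mp4',
        '-vf', "select='gt(scene,0.4)'", '-vsync', 'vfr',
        'scene_%04d.png',
    ], check=True)

    # OCR each screenshot to recover the lesson title
    for shot in sorted(glob.glob('scene_*.png')):
        title = subprocess.run(['tesseract', shot, 'stdout'],
                               capture_output=True, text=True).stdout.strip()
        print(shot, title)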


Does anyone know of a way to show real-time video with this in a GTK widget? I love the API, but I need to support both online and offline rendering in the context of deep learning based object detection. So I cut the pipeline in the middle, extract frames, do inference on them, and then feed the frames into the rest of the pipeline, and in the end I draw an overlay over the video scaled up to display size. Currently I do this with Gstreamer, but gstreamer has a horrible API in comparison to this and documentation is quite poor as well.
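
For reference, the cut-the-pipeline-in-the-middle part is what I'd hope to cover with the library's rawvideo piping, roughly like this (untested; resolution hard-coded for brevity, and the inference/overlay step is elided):

    import ffmpeg
    import numpy as np

    width, height = 1280, 720  # assumed; probe the input in real code

    # Decode to raw RGB frames on stdout...
    reader = (ffmpeg.input('input.mp4')
              .output('pipe:', format='rawvideo', pix_fmt='rgb24')
              .run_async(pipe_stdout=True))
    # ...and re-encode raw RGB frames fed in on stdin.
    writer = (ffmpeg.input('pipe:', format='rawvideo', pix_fmt='rgb24',
                           s=f'{width}x{height}')
              .output('annotated.mp4', pix_fmt='yuv420p')
              .run_async(pipe_stdin=True))

    while True:
        raw = reader.stdout.read(width * height * 3)
        if not raw:
            break
        frame = np.frombuffer(raw, np.uint8).reshape(height, width, 3).copy()
        # ... run inference and draw the overlay on `frame` here ...
        writer.stdin.write(frame.tobytes())

    writer.stdin.close()
    reader.wait()
    writer.wait()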


Most [0] of the time you can swap ffmpeg for ffplay to get a preview window. So you could do something like the suggestion here [1] to get the preview window, and then use GTK to embed that window wherever you want it.

[0] But not all

[1] https://github.com/kkroening/ffmpeg-python/issues/190


FFmpeg filters are real gems; you can find all the details here: https://ffmpeg.org/ffmpeg-filters.html

While the command line seems horrible, once you start playing with them it can be really fun. As always, start simple and build!
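
For instance, a minimal starting point with these bindings: one input, one filter, one output.

    import ffmpeg

    # Horizontally flip a clip and re-encode it
    ffmpeg.input('in.mp4').hflip().output('flipped.mp4').run()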


I had to use ffmpeg from within Python once and gave ffmpeg-python a shot. It works very nicely for a lot of things, but it had some limitations that made me write a very lightweight, somewhat OO wrapper around ffmpeg filters. Using it to e.g. generate a silent colored screen with text on it looks something like this: https://gist.github.com/Felk/975c096083e8434a8b5823173f8b22f...

Are there already existing, more polished wrappers around ffmpeg that include enough escape hatches to be able to fully leverage everything ffmpeg supports? I probably just didn't search and evaluate options long enough.


I don't know where it ranks in terms of your question points, but MoviePy is another well-liked Python library that seems to wrap ffmpeg, and it will also make use of ImageMagick for some functions.

I used MoviePy a few years ago to add subtitles to a short video. It was fairly easy to write a script to place bits of text at specific frames.
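
It looked roughly like this (older MoviePy API, from memory; TextClip is the part that goes through ImageMagick):

    from moviepy.editor import VideoFileClip, TextClip, CompositeVideoClip

    clip = VideoFileClip('short.mp4')
    # Show a caption from t=2s for 3 seconds, centered at the bottom
    caption = (TextClip('Hello!', fontsize=48, color='white')
               .set_start(2).set_duration(3)
               .set_position(('center', 'bottom')))
    CompositeVideoClip([clip, caption]).write_videofile('subtitled.mp4')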

I'm interested to learn more about the differences and similarities of ffmpeg-python and MoviePy and will put some energy into that.

https://github.com/Zulko/moviepy


Amazing!

Two weeks ago I spent 3 days writing a Python script to generate an FFmpeg command that takes N videos of different sizes, durations and framerates, and uses the xstack filter to generate a tiled video with a 2x2 layout.

Of course, one problem was splitting the videos into 4 sequences so that the total running time ends up as short as possible. I solved this pretty easily, although there are more optimal ways of doing it.

One issue to consider is that I had to re-encode all the videos to the same framerate AND add proper letterboxing (I wasted a lot of time on this), but the filter system handled this pretty well, since it worked in a way that didn't buffer too much.

I also had to mix the audio; I did that step separately, and there's no desync. This was very easy!

The generated command was pretty gnarly but I made it work for 30+ videos!

I wanted to try a 4x4 with short Reddit mp4 gifs (I have about 500 of them), but PowerShell stopped me because the ffmpeg command was so long, so I gave up! I wonder whether these bindings solve that problem. Then again, I should obviously just use a Unix shell to get around the command-length limit.
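
For what it's worth, I'd guess the 2x2 case maps onto the bindings roughly like this (untested; I'm assuming ffmpeg.filter accepts a list of streams for multi-input filters, and the filenames, tile size and framerate are made up):

    import ffmpeg

    files = ['a.mp4', 'b.mp4', 'c.mp4', 'd.mp4']
    tiles = []
    for f in files:
        # Normalize each input: letterbox to a fixed tile size, unify framerate
        v = (ffmpeg.input(f)
             .filter('scale', 640, 360, force_original_aspect_ratio='decrease')
             .filter('pad', 640, 360, '(ow-iw)/2', '(oh-ih)/2')
             .filter('fps', fps=30))
        tiles.append(v)

    # Tile the four normalized streams into a 2x2 grid
    grid = ffmpeg.filter(tiles, 'xstack', inputs=4, layout='0_0|w0_0|0_h0|w0_h0')
    ffmpeg.output(grid, 'grid.mp4').run()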


Big fan of this library. It's much easier than writing these scripts by hand. You can also dump a command line version to use elsewhere with the .compile() function.

For those concerned about extensibility, you can always use .global_args() to pass arbitrary arguments to FFmpeg if necessary. That said, the library is well maintained.
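
Roughly (the flags here are just for illustration):

    import ffmpeg

    job = (ffmpeg
           .input('in.mp4')
           .output('out.mp4', vcodec='libx264')
           .global_args('-hide_banner', '-loglevel', 'error'))

    print(job.compile())  # the full argv list, e.g. to log or run elsewhere
    job.run()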


Wow, this is exactly what I was looking for. Previously I used to write 20-30 line FFmpeg commands to do a task (and put them into a Docker container for easier reproducibility), but now I think I'll stick with this in the future.


ffmpeg is a standalone executable; why did you need a Docker container?


Easier reproducibility and the company I was interning at did everything with containers so I did it just to learn.


So what's different from PyAV?


I found GStreamer to be at least less magical than ffmpeg, but its API is somewhat verbose (except for trivial cases like gst-launch strings).


When I needed something remotely similar on Windows, I picked Media Foundation and Direct3D 11. This way the video stayed in VRAM on the GPU, I could write arbitrarily complex processing code in HLSL, and the processing consumed very few resources. It was also simple to render the video if needed.

I wonder about ffmpeg's architecture: does it run these graphs completely on the CPU, completely on the GPU, or does it send the video back and forth between them?


I feel like a subprocess call is an underrated way of using libs like ffmpeg; it's how I usually do it. Normally you have to learn the CLI anyway, so it almost always seems easier to just script the command generation rather than use some middleman abstraction layer.
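
E.g. something like this (made-up filenames and filter):

    import subprocess

    # Build the argv list directly and let subprocess handle the quoting
    cmd = ['ffmpeg', '-y', '-i', 'input.mp4', '-vf', 'scale=1280:-2', 'output.mp4']
    subprocess.run(cmd, check=True)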


Normally I'd agree, but ffmpeg is a bit of a special case: thanks to its extremely large set of options it has a rather high number of pitfalls: flags that do subtly to wildly different things depending on where they are placed relative to other arguments, for example. Complex filter graphs are pretty gnarly on their own, being a string-based sublanguage all crammed into one argument... it's not totally unreasonable to reach for something just to tame that particular beast.
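
A concrete example of the placement pitfalls (as argv lists, paths made up): -ss before -i seeks the input quickly via keyframes, while -ss after -i decodes from the start and discards frames up to that point.

    import subprocess

    # Fast input seek: ffmpeg jumps to ~60s before decoding
    fast_seek = ['ffmpeg', '-ss', '60', '-i', 'in.mp4', '-t', '10', 'clip.mp4']
    # Output-side seek: decodes from the start and drops the first 60 seconds
    accurate_seek = ['ffmpeg', '-i', 'in.mp4', '-ss', '60', '-t', '10', 'clip.mp4']
    subprocess.run(fast_seek, check=True)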

If this gets you halfway-workable tab completion for the various filter options, it's worth it for that alone, vs. diving through the big long page that otherwise contains all the names and argument orders of the filters.


I use subprocess to turn any single-file-in, single-file-out script into a batch processor that takes folders as input and output. I'm sort of new to programming and felt like this was quite dirty, but I didn't have bindings anyway, so I didn't have a choice. What are the upsides and downsides of calling subprocess versus using bindings?
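
Concretely, what I do is roughly this (folder names and the per-file command are made up):

    import subprocess
    from pathlib import Path

    src, dst = Path('input_folder'), Path('output_folder')
    dst.mkdir(exist_ok=True)
    # Loop over the inputs and shell out once per file
    for f in sorted(src.glob('*.mp4')):
        subprocess.run(['ffmpeg', '-i', str(f), str(dst / f.name)], check=True)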


Any time the CLI options stop you from doing something that is possible in the library itself, this goes out the window. That's why my friend and I wrote a pybind11 wrapper for tesseract. We could probably do something similar here, and never worry about subprocess again.


It’s how I do it too, but it’s probably less “clean” for people who care about such things.


Reminds me of AviSynth. Everything it does can be done with the ffmpeg command line, but prettier.


Yeah, the Python code reminded me of AVISynth too! Here's some code for reference: http://avisynth.nl/index.php/Script_examples


Sure does! It seems like people have moved on to Vapoursynth though.


This looks like magic.

As an ex-VFX wonk, and someone still working with video, FFmpeg is both a joy and a massive pain at the same time. Having a simple interface in Python makes life a whole boatload easier.


The problem with these libraries is that they never keep up to date with things like ffmpeg. A few years down the line you'll be making system calls to ffmpeg from Python.


That's what this does


This is cool, but I must admit I'm kinda trying to wean myself off Python in favor of Rust, and I was looking for something just like this, but for Rust.

Anyone know about such a project?



