Show HN: Recut automatically removes silence from videos – built with Tauri (getrecut.com)
268 points by dceddia on June 16, 2022 | hide | past | favorite | 114 comments
I released a new version of Recut recently, rewritten from the ground up using Rust, Svelte, Tauri, TypeScript, and Tailwind (RUSTTT stack for the win!). It's the first app I've built with Tauri and I've really enjoyed it.

Some back story: Recut is a tool I built to speed up my screencast editing workflow. It's like a lightweight single-purpose video editor. It chops out the pauses, with some knobs to tweak how closely it cuts and what it leaves in, and lets you get a live preview of what it'll look and sound like with the cuts applied. It can then export to a handful of other editors, nondestructively, so that you can use the full capabilities of a "real" video editor.

It was originally a native Mac app written in Swift, and people kept asking for a Windows version. I had learned Swift and macOS development to build it originally. So as a solo developer, I had some choices to make. Keep it Mac-only? Learn another whole language + UI framework, rebuild the app, and maintain two codebases? Rebuild the app with a cross-platform toolkit?

I'd had experience with Qt and C++ in years past, but I honestly didn't love the idea of getting back into C++ and dealing with the inevitable hard-to-debug segfaults. I'd had more recent experience as a web developer, but I was worried about performance bottlenecks. I actually started down the path of building Recut in Electron and Rust (using NAPI-RS for bindings) and it looked promising, but I was still worried about the bloat of Electron.

A few months in, I took a closer look at Tauri, and ported the whole app from Electron in a week or so. Most of the heavy lifting was already in Rust, and the UI stuff pretty much "just worked". The biggest change was the bindings between JS and Rust.

Working with Tauri has been nice. I especially like their "State" system, which gives you an easy way to keep app-wide state on the Rust side, and inject specific parts of it into functions as-needed. I also really like how easy it is to write a Rust function and expose it to JS. The process model feels a lot easier to work with compared to Electron's split between renderer and main and preload, where you have to pay the cost of passing messages between them lest you ruin the security. Tauri's message-passing has a decent amount of overhead too, but I dealt with that by avoiding sending large amounts of data between JS <-> Rust and it's been fine.

The Tauri folks on Discord were a big help too (shout out to Fabian for the help when I ran into weird edge cases). I think Tauri has a bright future! Definitely worth a look if you know web tech and want to make cross-platform apps.




Awesome tool. I hope it gets a Linux release some day :)

As a DaVinci Resolve user I'm amazed at how good the Smooth Cut transition is at hiding cuts if the head moved very little during the cut portion of the video. It might be worth exploring more and seeing if it would make sense as an optional flag.


Maybe some day it will come to Linux! I think technically most of it would already work there, but some stuff requires native windowing API calls that I'll need to figure out and port over... plus all the install/distribution stuff... and the, uh, enhanced surface area for support haha. It would be awesome to get it running on Linux though.

I haven't played with Smooth Cut, I'll have to check that out. It sounds handy! Might be something I could just "turn on" in the XML file too, I'm not sure.


We Linux users sometimes don't need any windows. Maybe you could also develop a CLI version? Then something like this would be possible: recut input.mp4 --padding=123 --output=output.mp4. Btw, you created an impressive piece of software. Very useful!


A CLI would be pretty cool, but personally I think much of the value of Recut comes from the live preview of what you're doing - having to export a 20 minute video only to re-export it again because the padding was too short is cumbersome to say the least.


Yeah, 100%, this is why I made Recut in the first place. The CLI approach would be great to support though and I can see some other really nice side effects of having that, like being able to batch-edit a whole folder of videos.


This is awesome! I need to help make some training videos in our company and this will help.

Coolest thing you’ve got is “remind me about this when I’m back at my computer”.

Genius CTA


Nice! I hope it helps. Let me know how it goes :)

It's funny, that CTA doesn't actually seem to get many signups, but multiple people have said how cool it is! So I've left it up even though it doesn't seem to "work" all that well. Today for instance, there were 3 signups out of ~7.3k unique visits!


Maybe increase the image size on mobile?


CTA means Call To Action, right?


Yep


I recently watched a video from a vlogger that cut her videos. I wondered why there wasn't a tool for morphing cut pieces of film out, so that you don't see the cut. So the transition in the cut piece needs to be generated with AI/deepfake.


I feel like I saw a similar thing recently, that generated the “missing” video given 2 end points. Might’ve even been still frames? It was impressive. Can’t remember where I saw it now, though.


George Lucas was doing it in the Star Wars prequels. I also noticed it in Barry, in the last episode or two: when he kneels down in a desert, his face and body appear seamless, but his hair dissolves between two shots.


That's actually called a Morph Cut in most editing software.


Does it generate intermediary frames to help the blend, like those crazy Simpsons still frames?

https://imgur.com/43J7UYS, for instance?


Generally morph cuts will generate some artifacts.

Here's a video I did recently using a whole bunch of them (https://www.youtube.com/watch?v=pTDxbmQxS_0&t=25s)


That's vaguely disturbing probably because it reminds me of the way the death eaters appeared in the Harry Potter series when Voldemort called them.

I wonder if there's a way to change the colors or put an effect on the intermediary frames to add a pop or a flash of color to cover over the oddities?


Been using recut and it is amazing. The fact that this was rewritten in Rust and Tauri is impressive, and I think a very smart move. Once you know Rust, you tend (at least I do) to be very productive. Looking forward to new recut features.


Thanks Jeremy!


What are your thoughts on Swift vs Rust? I’ve used Rust and another engineer showed me guard let statements from Swift, which gave me the impression they had some of the same sensibilities.


Yeah they have some syntactical similarities, like `if let`, and no parentheses with if's and for's. I really miss `guard let` in Rust! (I've heard it's coming at some point though?)

Swift leans real hard into verbose method names, just like Objective C did, and Rust is pretty much the exact opposite there. When I was writing more Swift I got used to it, and actually started to like it. And now that I'm in Rust a lot, I like the brevity.

I think Rust nudges (forces?) me to write code that's architecturally better, with fewer interdependencies. It was hard to get used to though. In the beginning I kept trying to make structs that started threads, where the thread called methods on the struct, and that was just a recipe for big pain.

Swift, on the other hand, especially with the way the macOS and iOS frameworks are designed, relies a lot on MVC, delegates, and mutability, which gets hard to keep track of.


It’s interesting you say that Swift leans hard into verbosity - I’ve found the opposite. Most of the verbosity often feels like it exists due to ObjC “lineage”.

(I say this as someone who likes verbosity)


Interesting! Maybe we're using different definitions of verbosity too, or I used the wrong word there.

I was thinking more about the long method names which definitely do feel like they're carried over from ObjC. And then, a lot of those names are really more from the commonly-used frameworks than the language itself, so maybe it's not fair to say that "Swift" has those long names, but it does feel like the 99% use case for Swift is using those frameworks.

I think I agree that Swift felt like it needed less code to do a thing than ObjC would have, in a lot of cases.


There’s also SwiftUI.


They share a lot of similarities. Both make it hard to do unsafe things, both have functional influences, both have modern features like closures, optionals etc. I'd say the biggest philosophical difference between the two is that Swift leans more toward developer ergonomics while rust is geared toward system level programming (ie tighter control over memory etc).


Semi-relatedly, I am having a problem with my Bluetooth headphones and Audible. Apparently any time there's more than a breath's pause, the module power-saves, and when it kicks back in you've missed the first word or two of the sentence. Fine for some books; for others, it makes them very hard to listen to.

My only idea so far was: some sort of app that generates very quiet noise so that it won't power save?


Weird. Maybe one of those white noise generator apps? If the frequency were set super low so that it's below hearing range, maybe the headset would still think there's signal coming through.


If you're on an Apple device, you can make the phone itself generate white noise by going to: Settings > Accessibility > Audio/Visual (under Hearing) > Background Sounds


Unfortunately, I have a Pixel. I will try a white noise app though, good thought. Based on my reading it is the receiver doing the power saving, or whatever, not the phone, so this may work!


How about one that also removes "uh", "er", "hi guys", ads, and beginning stretches where "I" predominates?


So many potentially-interesting channels have lost me because two minutes into the first video I try, they're still talking about previous videos or irrelevant personal stuff or their subscriber count or WTF ever. Close tab.


You have SponsorBlock[0] to at least skip the ads and the reminders to "like and share and subscribe". It works really well.

However, YouTube Premium might be the solution if you want to support your favorite channels without ads.

[0]https://addons.mozilla.org/en-US/firefox/addon/sponsorblock/


I think they are asking for the feature from the perspective of someone producing videos, not consuming.


No, they talked about removing "hi guys" and "ads", so that's definitely a consumer.


Descript probably fits the bill: https://www.youtube.com/watch?v=Bl9wqNe5J8U

HN had some good discussions last year with alternatives: https://news.ycombinator.com/item?id=25641205


You can use revoldiv.com to cut out filler words or any words of your choosing. After you upload your file and it finishes sound detection, you can click on the search box to bring up the toolbar to delete sounds.


That looks nice! Is it your tool? I wonder how it's supporting free transcription since most of the good APIs for it are pay-per-minute.


Thanks! Yes it is. We implemented all the AI models in house, which cuts our costs.


Where would you point someone looking to build some image OCR models? Great work.


That's awesome, nice work!


Super impressed with the launch, really well executed.

I can relate to part of your journey, I developed an iOS app in 2012, learnt just enough Objective-C to develop it.

The most popular request had been for an Android version. I redeveloped it using Xamarin a few years ago and have been very impressed with how close to the 'metal' you can get.


I don't know why the first thing I thought of was a remade version of Shunsuke Kida's "Maiden in Black" from the Demon's Souls OST, with all the silence, and therefore all the tension and negative space, taken away from it. I know that's different from YouTuber videos; but that's where my headspace went.


What sort of data structures are you using for frames and the timeline? Do you have a frame type that you render to some canvas? How do you handle previews etc?

I've been mucking around with Rust and sixtyfps for some animation software, and I am curious how you've handled those basic things.

Best of luck in any case!


Lots of fun problems to solve here, haha.

For animation you could probably model it as an iterator that produces frames, where that iterator has a counter that increments with each frame, and use that counter as a measure of progress to figure out what should go in the frame. A frame would need a timestamp and maybe image data, or maybe just a data structure that describes what to draw.

Then to render, pull frames off the iterator, display each one, and wait enough time between them. Multiple tracks are like multiple iterators running in parallel, and then something to merge them. I'm using wgpu for the rendering, but sixtyfps might have its own thing?

Happy to chat more about this stuff! Email is in my profile.


I love this idea. I stumbled upon a Gist[0] from vivekhaldar[1] some time ago and it really helped out when I had to create a screen recording for a colleague. Definitely not as polished as Recut, though.

[0] https://gist.github.com/vivekhaldar/92368f35da2d8bb8f12734d8... [1] https://www.youtube.com/c/vivekhaldar


I use auto-editor for this https://github.com/WyattBlue/auto-editor

I'd like to be able to use komposition, which offered a bunch more nice features for screencast editing... https://github.com/owickstrom/komposition ...but it's bitrotted and isn't maintained any more.


Neat idea with using colors to edit!

My first stab at something before Recut was a Node script that did a similar sort of thing, just based on silence though, using ffmpeg's silencedetect to find silent parts and generating a cut list. And then as soon as it worked, I was like "well that's cool but I really want it to be interactive" and then... well it turned out that making a UI video editor was way harder than that script haha, but eventually Recut came to exist.


The main website video still shows what looks like a native Macos app. Any pictures/video of the new version? What UI toolkit (React?) and UI library (Material UI?) did you end up using with Tauri? Curious because this is something I'll be embarking on soon.


I need to update some screenshots! It looks very similar though, I tried to mimic the UI pretty closely. This video shows a demo on Windows: https://youtu.be/wuy-LKSE3y0

I’m using Svelte and Tailwind (not Tailwind UI) so the UI is custom, plus or minus some of Tailwind’s defaults.


Nice - I have no UI skills so will definitely use an existing lib. I know my limits. :-)

Did you notice any diff in UI speed with view being in JS/HTML/CSS? Response time seems nice in your video so I'm guessing negligible?


Nope, it’s fine, and maybe even faster actually. The slowest part is drawing the waveform and the (potentially thousands of) red silent areas on a <canvas>, and the Tauri app does better than the Swift app did there. The 2D canvas is GPU-accelerated so it’s pretty snappy even without doing a bunch of optimizing with dirty rectangles etc, whereas the Swift NSView isn’t hardware accelerated so it required a lot more hand optimizing and I think it’s still not ideal.

The big thing is to avoid blocking the UI thread, so if you’re calling into a Rust function that could take more than a couple milliseconds, marking that Rust function (aka Tauri command) as async will run it in a background thread and the UI won’t hiccup.


Thanks for the feedback - this is helpful


Did you notice any CPU/Memory advantages in switching to Tauri over Electron?


Memory usage seems to be lower with Tauri but I don’t have any hard numbers. CPU is better largely because of finding a way to draw video frames that involves less copying. It’s a (frankly, messy) hack that uses a native window/view positioned in the right place that I can render directly to as a GPU surface, which mayyybe could’ve been done with Electron with a lot of mucking around in Chromium internals, but Tauri makes it much easier to access platform APIs.

I know for certain that there are still performance gains to be had, but I’m also confident that they’re 100% in my control - Tauri and the WebView aren’t the bottleneck. That was one of my big fears with Electron – what if something is slow as molasses and it’s just stuck that way? I haven’t run into a wall like that with Tauri yet and at this point I don’t expect I will.


Is it possible for your app to take into account not just audio but also facial expressions? AKA do not cut out the parts where the speaker is silently making unusual faces or facial signals.


Possibly! Not currently, for sure. It'll have (again, soon) a feature that lets you manually override Recut's choices though, so you could select a meaningful silent part and leave it in.

I think at some point any sort of automation is bound to get something wrong and it'll likely never be perfect, so my goal is to add enough manual control that you're never just stuck with whatever the app decided.


That's cool, hopefully it can be a feature request you take seriously, it may not even be that hard I think to throw in a facial recognition library or something. The issue with doing it manually is that I would have to basically scan the entire video manually and that sort of counteracts the purpose.


Might not be as great as you seem to think; you'd be unflagging silence that you'd want cut anyway. I imagine you'd get false positives, like a speaker frowning and being silent for a few seconds because they're thinking.

Point is, you'll still have to manually review the result anyway for under- and/or over-cuts.


The places where it's silent and making faces should be easier to review than the places where it's just silent.


Thank you for sharing so much of this development process over the years. It’s been fascinating to watch from a distance and I have to say, Tauri looks more and more appealing.


Thanks! And yeah Tauri is pretty nice, and seems to be rapidly improving.


This looks pretty cool. Might be too similar to https://jumpcutter.com/ though.


Do you have the video explainer in e.g. YouTube? It seems you use a provider that's blocked in my country (Vimeo?).


Oh that’s annoying, sorry about that. Try this one? https://youtu.be/wuy-LKSE3y0


Congrats on the launch! Do you mind sharing how you got your first few users and such great UGC / endorsements?


I put up a landing page early on in development and shared it on Twitter a time or two. Tweeted a few screenshots while I was working on it too.

On launch day there were a small handful of folks on the email list, and I tweeted about it, which somehow got picked up and retweeted by some big YouTubers. So that, some support by friends, and some luck.

I did have probably 6k Twitter followers at that point, but most of my audience is from teaching web dev stuff - blogging, a book, and courses around React - which has near-zero overlap with the audience of video creators. The happy comments are almost all ones I received over email and then got permission to share.


Great job identifying a niche and executing well on a focused feature set. That’s something I wish I could do better!


Can anyone here (maybe the author) say how fast this is? How long does it need for one video? Is it interactive even?


Hey, author here :) It varies by the length of the video and some other things, but on a 2019 Intel MBP it's loading up a 35min file in about 5 seconds. After that, it's interactive - you can adjust the silence-finding settings in real time, and seek around and hit Play like any other video editor, and it'll play back while skipping over the silent parts.

The slowest parts are (1) loading up the audio to find silence and (2) if you decide to export an MP4, encoding that. Exporting an XML timeline is near-instantaneous.


Nice relaunch, I've been following Recut for a while. I was once working on a web version of this, all before I discovered Recut.

https://beta.jumpcutter.pro/


Thanks! That's cool. It's a good order of magnitude better than my first stab at this, which was a Node script + ffmpeg that spit out an EDL file, haha.


Which license is the app under? The trial installer didn't show me any EULA.


Disappointed that the author hasn't replied.

So, after installing the trial, I see that the app is dynamically linked to libav libraries. Which, for distribution, can only be licensed under the LGPL (v2.1 or v3) or GPL (v2 or v3), depending on configuration. This requires at the least that the libav source code used be made available. And if libav was compiled as GPL, then Recut would have to be GPL too. But I can't find any licensing info at the website or in the installer or the installed files.


Oh good call, I will remedy this.

It's using libav as you mention, and it's an LGPL build of libav.


Awesome! Did you consider compiling to WebAssembly and making it a webapp?


I've thought about it! We'll see.


this is great. Thanks for sharing. Just one question: what library did you use for processing videos?


Super cool!


For those of us on Linux, check out https://github.com/WyattBlue/auto-editor: "Auto-Editor is a command line application for automatically editing video and audio by analyzing a variety of methods, most notably audio loudness."


First thought: This is really great! Based on the amount of YouTubers that I see do this to their videos, I can imagine this would be a really great tool, and I think it's awesome that you made it, OP.

Second thought: Is anyone else really annoyed by the constant cuts in videos these days? I find it distracting at times and completely jarring in others. I've never made a video for consumption, so I could imagine there are a lot of "re-takes" + cutting out of "umms", but I just find it a bit sad that everything has to be SO clean-cut these days.


>Is anyone else really annoyed by the constant cuts in videos these days? [...], so I could imagine there are a lot of "re-takes" + cutting out of "umms"

I agree it's jarring but I understand why it often happens: it's easier and faster to fix a vocal mistake by backing up just a sentence or two and then re-recording from there. So it's not always just removing the pauses; he/she actually fixed a speech mistake in that sentence and had to splice it in.

The alternative, to avoid jarring cuts, is to record longer takes of paragraph-length reads without a mistake. This is much more difficult and time-consuming. So if you're trying to speak 10 good sentences and you flub sentence #10, you have to start all over at sentence #1 to maintain one continuous take. Otherwise, you'd have a jarring cut between sentence #9 and #10. E.g. you can see the blooper outtakes at the end of each Technology Connections video to see that even reading from a script without mistakes is not easy.


Jump cuts, as they're called, can also be covered over by judicious use of b-roll and other supporting material.

But again, like longer reads, it takes more time/work. So a lot of channels just leave the raw jump cuts.


The other way to do this is to move to another camera angle, which hides the cuts, although you certainly don't want to do this if the cuts are very short like the ones in the example on the website. Recording 4K footage, you can cut multiple HD "angles" out of it, e.g. zooms, which you can use instead of multiple cameras.


And by pre-splicing with this software it is telling you where to put the b-roll in, saving loads of time!


Thanks! And yeah! It sounds super unnatural when even the tiniest pauses are cut out. Obvs I can’t prevent anyone from using it that way, but I set the defaults to leave a good 1/2 second of space on either side of each cut, and to leave in any silent chunks that are around 1/2sec or so.

Personally I think of the silence as a heuristic - it’s the gap between re-takes, so if I cut at those points and then delete the bad takes, it saves a ton of time.

This makes me think an interesting workflow to support might be something like, set the settings tight to get all the cuts, then delete the bad takes, then bring the pauses back.


I like the idea of cutting out silence, but editors need to understand that we need a bit of a gap in order to consume the content. I feel like I'm listening to a 10 year old with ADHD when you cut out literally any pause whatsoever.


As a person with ADHD: I get very easily distracted if the tempo of a video is low. I use the “Video Speed Controller” browser extension to run most videos at 1.5 to 2x speed.


Recut lets you configure that with a little slider


I've seen 30-second videos with a cut after each sentence. It's incredibly jarring.

If you can't give a 30-second spiel in one cut, then you keep trying until you do. Alternatively, you splice in some other graphic or video to hide when a cut happens.


30 seconds is too long for a shot. You lose the audience after 15 seconds, especially if it's just a talking head. Imagine you're standing in front of me as I tell you all this - you're not looking right at my face, looking me right in the eye for the 15 seconds or so it took to get to this point. You're looking away. You look over my shoulder at the thing behind me, you look at the editing equipment on the table, maybe look at what's on my screen. We're up around 25 seconds now, and your gaze has shifted at least half a dozen times.

In video editing you mimic this by cutting away to other angles, or to illustrative shots. Right around the end of the first sentence I cut to a longer "two-shot" showing us talking. At "to get to this point" I cut to a head-and-shoulders shot of you nodding in agreement (a "noddy shot", done after my piece to camera, getting you to look at the right height to match my eyeline). On "you're looking away" I cut back to me, and then a shot of my PC on the bench with some editing software open (bonus points for having it showing an earlier shot from this). On "and your gaze", it's back to me.

What you were actually looking at was cutting back and forth wildly, showing you something different every five to ten seconds, but somehow you didn't even see it move.


From what I’ve seen and experienced myself, there’s a definite learning curve to making videos.

In the beginning I remember it being maddeningly hard to even get 10 seconds out without messing it up. But then also, there’s a technique and a skill to getting the cuts to sound natural.

I’ve seen plenty of YouTubers who cut after every sentence and make it sound & look natural, but plenty more who cut the same amount and it looks jarring. Keeping your head in the same spot helps. Trying to speak one full thought at a time helps too.

The worst is when you get on a roll, get 30 seconds into your roll, and then completely lose it and can’t remember where to “roll back” to.


Alternatively, you just edit out the silences. What's the problem?


Reading your reply I immediately thought of a video I watched last week that would have been a good 10 minute video, probably a great 5 minute video, but was in reality a 25 minute "stream of consciousness" video. He described every step in the process ~3 times; some repetitions were because he was waiting for a long-running process to complete, some I think were just habit?

Much of YouTube is dealing with people who leave your video after a minute or two, so shorter is generally better.

I very much appreciate people who put the time into getting rid of the superfluous in their videos. One of my highest performing videos is a 26 second "how to" video, and the majority of comments are "Thank you for not making this a 5 minute video like the others on this topic". (Removing the riving knife from a DeWalt table saw, FYI.)

I recently experimented going entirely the other way, probably went too much so. Can I teach using the Python Typer (CLI argument parsing) library in 60 seconds? Feedback from friends is "Man, that's DENSE!" https://youtu.be/1iO7wqnC7qw


They might be referring more to the tiny font size than the pacing of the video.


They didn't mention font size ("moved a little fast" was a comment for example), but I appreciated that you did. :-)


As did my early viewers for me :)


I don't know. I have ADHD and I can't watch a slow paced video (unless there is some sort of tension built in that gets brain cogs spinning). I often watch videos at 2x speed and I wish YouTube had an option for 4x.

If there are any pauses I quickly lose interest and go on doing something else and forgetting I even started watching something.


I understand small time creators with lots of cuts. They’re getting by on pretty small ad revenue and are amateurs not seasoned performers. The time and talent to have long takes without mistakes is hard, and I appreciate the content more than I want super polished production values.


> Is anyone else really annoyed by the constant cuts in videos these days?

I was initially, but my brain has just accepted it as part of videos these days. I barely notice anymore unless it's really jarring, or done somewhere that doesn't make sense.


> but I just find it a bit sad that everything has to be SO clean-cut these days.

It's not "clean-cut" though. Jump cuts in pieces to camera look absolutely shite.

If you want to cut a bit out for pacing or to remove an "uhm uh <cough> so uh" then you cut away to something else. Maybe a close-up of what you're talking about, or to another camera angle.

Just chopping a bit out so you hop about the screen looks amateurish as all hell.


Yeah one subtle detail I really like about RedLetterMedia is that they'll have these cuts that last juuust a little longer than normal. It gives a nice punctuation to the video that feels almost classical in form.


The difference between them and most YouTubers is that they came at it backwards—they understood video production before they became YouTubers.


I’m old. It bothered me at first, but now I view it as a service to the listener. Somehow I got used to the abruptness fairly fast and then became addicted to the brevity.


Some really well done channels are unwatchable for me due to this, in tandem with using an audio compressor wrong (attack set way too long), which makes it even worse from pumping.


There's already an ffmpeg command line option for this, I thought: ffmpeg -af silenceremove=whatever.


[deleted]


This program looks like it fulfills its purpose quite nicely. Well done!

Unfortunately, that purpose happens to be something that absolutely drives me up the wall. Few things cause me to close a video and outright block its creator faster or more vigorously than cutting out pauses between sentences and phrases. It's great that your demo video recognizes that to be a problem, but even after accounting for it it's still jarring - and let's face it, just about zero users of your software are gonna account for it.

What happened to the good old days of doing multiple takes and rehearsing?

In any case, nice work on it, and I hope your customers use this power responsibly and unnoticeably :)


> What happened to the good old days of doing multiple takes and rehearsing?

How well does that work on a live presentation before an audience, do you think? Say, a preacher delivering a Sunday morning sermon? PyGoSwiftCon 2023 tech presentation? "Here's last Saturday's video, can you post that to $WEBSITE?" One does not always have the luxury of a retake.


And in those contexts you can't exactly cut the pauses out, either. They're live; pauses and other "imperfections" are unavoidable, expected, and an intrinsic part of the performance - and if your next thought is "but what about the recording of it?", I can think of few things worse to do to the recording of a live performance than utterly butchering it for the sake of tiny pauses.

In any case, I said "and rehearsing"; people can and do rehearse live presentations and sermons and other speeches. That's in fact a very common thing: write out what you're going to say (or pay someone to write it for you), rehearse it in front of friends or family or pets or your mirror, possibly even memorize it.


> And in those contexts you can't exactly cut the pauses out, either. They're live; pauses and other "imperfections" are unavoidable, expected, and an intrinsic part of the performance - and if your next thought is "but what about the recording of it?", I can think of few things worse to do to the recording of a live performance than utterly butchering it for the sake of tiny pauses.

Doesn't seem any worse to me than watching at 2X speed, which I frequently elect to do.


Which is fair, but at least you get to control that. The same can't really be said of pauses between sentences/phrases; once they're gone, they're gone, you know? That is: playback speed is a non-destructive preference, whereas silence removal is destructive.


I'm thinking about using this for internal videos where I walk through code and stuff. Sometimes I need to take a second to think, you know?


Cheap and free one-liner to cut silent audio with ffmpeg

ffmpeg -i input.mkv -filter_complex "[0:a]silencedetect=n=-90dB:d=0.3[outa]" -map [outa] -f s16le -y /dev/null |& F='-aq 70 -v warning' perl -ne 'INIT { $ss=0; $se=0; } if (/silence_start: (\S+)/) { $ss=$1; $ctr+=1; printf "ffmpeg -nostdin -i input.mkv -ss %f -t %f $ENV{F} -y %03d.mkv\n", $se, ($ss-$se), $ctr; } if (/silence_end: (\S+)/) { $se=$1; } END { printf "ffmpeg -nostdin -i input.mkv -ss %f $ENV{F} -y %03d.mkv\n", $se, $ctr+1; }' | bash -x


This seems very long considering that there is a built-in filter "silencedetect". Does it only do the detection and then you have to do this Perl loop to do the cutting, parsing a custom format that silencedetects writes?


Yeah, silencedetect only finds the silent parts, it doesn't do any cutting.

My first attempt at this problem before Recut was basically this, but in Node, and creating an EDL file.

It works fine but the feedback loop is annoyingly long - run the script, import the result into an editor, listen back, realize the silence threshold was too high, try again... so that's what drove me to make an interactive version.



