Show HN: You-tldr – easy-to-read transcripts of Youtube videos (you-tldr.com)
187 points by bilater on Feb 6, 2021 | 72 comments


I did a similar (in spirit, to save time) thing [1] to be able to skim technical presentations: it creates a static HTML page with regular shots from the video and the corresponding YouTube CC on the side. It can help you decide if a presentation is worth a watch (or just get the gist of one). Uses youtube-dl and ffmpeg under the hood.
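The frame-grabbing half of that pipeline boils down to a single ffmpeg filter. A sketch (building the command rather than running it; the `fps` filter and output pattern are my illustrative choices, not necessarily glancer's actual invocation):

```python
# Build (not run) an ffmpeg command that saves one frame every N seconds.
# The fps filter and output pattern are illustrative guesses, not glancer's
# actual invocation.
def frame_grab_cmd(video, every_secs=30, out_pattern="shot-%04d.jpg"):
    return [
        "ffmpeg", "-i", video,
        "-vf", f"fps=1/{every_secs}",  # one output frame per every_secs seconds
        out_pattern,
    ]

cmd = frame_grab_cmd("talk.mp4", every_secs=60)
```

You'd hand the resulting list to something like `subprocess.run` and then pair each numbered shot with the caption text for that timestamp.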

[1]: https://github.com/rberenguel/glancer


Never touched Haskell before, but found that path separators were needed to get this working (on WSL with Ubuntu if it matters)

For example

    -  capsPath <- getFullPath (T.unpack (coerce dir <> coerce videoName <> ".en.vtt"))
    +  capsPath <- getFullPath (T.unpack (coerce dir <> "/" <> coerce videoName <> ".en.vtt"))

Needed to apply the same change in a few places.

Also found

    --sub-langs en
doesn't work if the video has, for example, en-CA as its subtitle language.
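One way around the en-CA case, as a sketch (`pick_english_vtt` is a hypothetical helper, not part of glancer): accept any English-variant subtitle track among the downloaded files instead of hard-coding `en`.

```python
# Hypothetical helper (not part of glancer): accept any English-variant
# subtitle track (en, en-CA, en-US, ...) instead of hard-coding "en".
def pick_english_vtt(filenames):
    for name in filenames:
        if not name.endswith(".vtt"):
            continue
        # youtube-dl names subtitle files like "<title>.<lang>.vtt"
        lang = name[:-len(".vtt")].rsplit(".", 1)[-1]
        if lang == "en" or lang.startswith("en-"):
            return name
    return None
```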

And finally, youtube-dlc seems to inconveniently discard the VTT file, breaking glancer, unless you specify `-k` to keep the file.


Paths: will fix; since I was the only user, I hadn't hit any issues.

Subs: that's kind of known. I need to give the user the option to choose the subtitle language in unusual cases. I'll try to find a way.

VTT file: oh, that's definitely new, it's not the case with the version I have installed. I'll have a look ASAP and add this parameter.


To take your idea even further, one of the ideas I've been batting around is creating some sort of learning algorithm that runs through a video or song, skipping forward and back to a set number of random timestamps, to decide if that video or song is something I would be interested in based on my listening history.

I feel this is how I currently approach an unfamiliar artist or creator - skip ahead to random points in the video and if I still find them engaging, I rewatch from 0. This type of AI would be so great.


There's no AI in my program, it's as simple as it sounds: get regular screenshots from the video and collate the corresponding closed captions. One natural improvement, though, using some ML, would be to focus on the slides when it's a slide+banner+floating-head type of presentation. That wouldn't be terribly hard to implement in Python; in Haskell it would take me ages (there's Hasktorch, but I haven't tried it, and I also have many more years of Python behind me).

Your idea sounds intriguing, though. I wonder how one could measure _interest_ in this regard, some form of entropy measure might be right, but how to construct it would be the fun part.


Wow this is going to change the way I use YouTube. Huge thank you. I might even rewrite it in Rust!


Thanks! There's nothing here that required Haskell, it's just that I'm writing all my new "custom tools" in it. Once you get the hang of some things (parsing, running external commands, command line parsing), it's very easy to build a new tool. I imagine it would be the same with Rust (if I'm not wrong there's a very good parser combinator library as well). Ping me when you are done, my Rust is very basic but since I'd know what it does I would learn from it.


The README mentions that you have a list of technical talks you are meaning to watch. Would you mind sharing that list? I'm curious!


Any way I can reach you? Don't want to spam an answer with a very long list. The summary would be "a few from" (around 7-15) the last Flink Forward, Ray summit, Spark Summit (the last 2 actually) and then some other more "random" talks in areas I'm interested in. That is the bulk of it. I also have now a section of "non-glanceable", for talks where there is more than the slides (like, the Play track of last Github Universe, or the recent Bill Evans documentary shared here in HN).


You could upload it as a gist on Github for example, that way others could see it. If not, my email is jarbus@tutanota.com. Thanks!


Not thinking of a gist is what you get after 8 days without opening a computer, here it is: https://gist.github.com/rberenguel/dd6a84927b8f367e9b68e9397...


Thank you so much!


I wonder if the normal `youtube-dl` tool still has problems downloading the auto-generated subtitles as of today? Do you know?

I'm not keen on using any of its forks for the moment.


At least the official version that is currently in Debian sid/unstable is having problems downloading vtt:

  apt-cache policy youtube-dl
  youtube-dl:
    Installed: 2021.01.08-1
    Candidate: 2021.02.04.1-1

Combining youtube-dl with vtt2text.py[1] would usually give me a transcript; however, this no longer works for me (for now):

  $> youtube-dl --skip-download --convert-subs vtt https://www.youtube.com/watch?v=iGLzWdT7vGc

  $> find . -name "*.vtt" -exec python vtt2text.py {} \;

[1] https://gist.github.com/glasslion/b2fcad16bc8a9630dbd7a945ab...


Haven't checked since I wrote glancer (it's a pretty recent project). I expect the normal one to eventually fix its issues, and then I'd move to it. Better not to add more complications (you already need stack, which is kind of asking a lot).


Wow, thanks so much for this! Have to try this in Termux on my ereader (Onyx Books Nova 3). Love the idea of comfortably reading all these tech videos on eink instead of sitting in front of a normal display even longer.


I hope it works well. The generated HTML is _large_, since I embed the images base64-encoded (to make it braindead easy to share the "presentation"). That's the only point where I can imagine an ereader having issues, but Onyx devices should have enough power to handle it.
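Roughly what "embedding the images as base64" means, as a minimal sketch (not glancer's actual code): inline each screenshot as a data URI so the HTML file is fully self-contained.

```python
import base64

# Inline an image as a data URI so the HTML needs no external files.
# Minimal sketch, not glancer's actual code.
def img_tag(image_bytes, mime="image/jpeg"):
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f'<img src="data:{mime};base64,{b64}">'
```

The trade-off is size: base64 inflates each image by roughly a third, which is why the generated pages get so large.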


Wow, I am going to use the hell out of this. Such a basic idea when you think about it, but a ton of value.


Thanks, when I was done I kind of thought the same. In the end, having the slides would be almost equivalent, but combining a decent enough transcript with the slides adds the minimum "ok, got it" to go from having to watch the video to just being able to skim over it.


I did something similar for myself a while back, to put the transcript in a text file. It's a five line bash script that uses youtube-dl to get the closed captions and cleans up the formatting.

  #!/bin/bash
  link="$1"
  fn="captions"
  youtube-dl --output "$fn.%(ext)s" --write-auto-sub --skip-download "$link"
  sed '/-->/d' "$fn.en.vtt" | sed '/<c>/d' | sed '/^[[:space:]]*$/d' | uniq > "$fn.txt"
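The same cleanup can be sketched in Python (a sketch of the same idea, not a drop-in replacement): drop timestamp lines, `<c>`-tagged styling lines, blank lines, and consecutive duplicates.

```python
# Strip a VTT file down to its caption text: skip timestamp lines,
# <c>-tagged styling lines, blank lines, and consecutive duplicate
# lines (the uniq step). A sketch, not a full VTT parser.
def clean_vtt(vtt_text):
    out, prev = [], None
    for line in vtt_text.splitlines():
        line = line.strip()
        if not line or "-->" in line or "<c>" in line:
            continue
        if line != prev:
            out.append(line)
        prev = line
    return "\n".join(out)
```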


When I first saw this project, I was confused about how it was any different from this. To me it seems like they just wrapped youtube-dl to extract subtitles, made it a webpage, and called it a service.


> made it a webpage, and called it a service.

That is added value for many people.

Some people will google the answer to their question, land on this webpage, read the transcript for this one video that caught their interest in the first place, and be done with their task.


This is brilliant. I was wondering how this tool worked.


If you're looking to make money / provide more value with this service I think the angle you should try for is Video to Blog Post.

The transcript is fine, but I'm not quite sure what problem this solves. Whereas if you were able to take the transcript and spit out a file broken into sections (based on YT chapters), with an attempt to automatically clean up the grammar and remove the "um"-style filler words of the spoken version, I think this would hit in a very different way.
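The filler-word half of that is easy to sketch (the word list here is my guess, not anything this service does; the chapter-splitting half would hang off YouTube's chapter timestamps):

```python
# Illustrative filler-word list -- a guess, not this service's actual list.
FILLERS = {"um", "uh", "erm"}

def strip_fillers(text):
    # Drop standalone filler words, ignoring case and trailing punctuation.
    kept = [w for w in text.split() if w.lower().strip(",.!?") not in FILLERS]
    return " ".join(kept)
```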


Thanks for the suggestion! That's a cool idea - will consider putting it on the roadmap :)


+1000 I've been trying to do this (split into sections and eliminate umms) with recordings of live classes... it takes me 4 hours for each hour of recording.

If you can make this process faster, the world will throw money at your feet... at least I will.


You should try Descript[1]. One of its features is automatic filler word removal. I'm not affiliated, just a happy user.

1: https://www.descript.com/filler-words


Second this. One thing I find useful in quarantine is extracting a cooking recipe from video transcripts. What I do now is open the YouTube transcript (sometimes unavailable), paste it into Notion, and then hand-type the sections.


I tried to transcribe https://www.youtube.com/watch?v=DLzxrzFCyOs but subtitles are disabled for it.

So I guess that means this service relies on YouTube provided subtitles.

Sad times that it didn’t work.

It would have been so great to link someone to a transcript of that particular video ;)


Haha good one! Yes, we do rely on YouTube's subtitles. But we are considering adding our own speech-to-text feature for the videos that don't have subtitles, if there is demand ;)


I built a very similar thing w/ topica.io (now defunct - I could spin it back up if there's interest), but focused on sentence-level and word-level timings to create a sleek interactive transcript[0], and transcribed the videos via third party in the background. My email's in my bio if you want to connect :)

[0] https://www.3playmedia.com/resources/recorded-webinars/wbnr-...


I built something related called https://sidenote.me to take notes on YouTube videos, so I understand what you're trying to solve. Congrats on your launch, your interface looks very sleek!

My feedback about the product itself is that it's trying to do too much too soon. For example, the set interval slider, and all features in the "Pro Toolbar": are they really useful and necessary to your users? To me it seems they're not, and add confusion.

So the question you should ask yourself is: what is the one thing you want to solve for your users? Then make the interface do only that one thing and do it well. Only once you grow a user base should you add new features.

Best of luck with your project!


Thanks for the feedback! I get what you're saying. What are the one or two features that you think are the most valuable?


Text searchability for a video's audio and visual content.


How is this different to "Open Transcript" on YouTube?


...I did not know about this. You are my hero.


Not so impressed, to be honest.

1) I was curious how their ASR software was going to be fast enough for a longer video, and what the WER (word error rate) would be. To make it easy on them, I tried it first with a video that has no background noise. Well, bummer: all they do is scrape the closed captions -- the video I had chosen happened to have no CCs, so it didn't work at all. :-(

2) Okay, I thought, fair enough, let's try their summarization technology. So I found another video that did have closed captions. Clicked on the summary tab: "No summary generated."

So then what is this other than an automatic extractor for CCs?


This is useful. Every so often I see YouTube videos which seem appealing but are egregiously long.

I'm talking about the "Guy talks at camera" videos which often exceed 40 minutes.

Jeff Geerling or Contrapoints are huge exceptions to this because their content is engaging, well thought out and not "ranty" like the videos I'm talking about.

Maybe I have a low attention span, but at some point I'm going to be incredibly sick of this monologuing, just distill this. Please.


> This is useful.

Have you tried it? I have the same problem as you, and this app seems to just show transcripts, which is already available on the YouTube website. This app doesn't do anything for me.


What's the difference between this app and the built-in interactive transcript feature that youtube already has for videos with captions?


Off-topic but notice you’re a Loom user from the demo video you created. Just wanted to say thank you for recording with us! (co-founder)


You guys hiring any interns by any chance? Figured I might as well shoot my shot


For the first time ever, yes! We read all applications:

https://jobs.lever.co/useloom/27c2ffa6-6938-4844-9835-6711f7...


Loom is awesome! Thank you for providing such a great service! :)


Clicking on one of Pro functions displays classic alert() dialog and loads "Upgrade" page, losing the loaded video and transcript. You might want to use a custom modal with link to "Upgrade" page which switches only if user clicks on it, and maybe even opens in new tab/page, to keep state.


Gotcha - yes I've been meaning to change that since the behavior right now is annoying for the user.


Shouldn't it be "You-tldw"? The point is to read, rather than spend the time watching.


According to the feature page, the pro version will additionally summarize the transcript. So in case even the transcript itself needs a tldr, this has you covered. Besides, I would guess that tldr is a more well-known acronym than tldw.


The TL;DR crowd won’t read a half page “wall of text” on a topic, but seems happy to watch 30 minute videos on how to find a loot chest.

The TL;DW crowd would rather take 30 seconds to read the half page it might take to explain that.

So both TL;DR and TL;DW are useful.


Wow the transcription looks better than the automated one rev.com provided for a video where the speaker had a bit of an accent. Very cool - what transcription software is it using?


Youtube!


Nice tool. I tried it on mobile. I was able to get a transcript, but I ended up on the subscription page by mistake 3 times, so I had to try again each time.

It seems to me that the free version does not let me do enough to get used to your tool and end up subscribing to the pro version.


You can sign up and try for 7 days for free and get all the pro features! :)


I made a Video to PDF tool a ways back. I made it for dance and workout videos, so I didn't have to remember the many moves later on.

https://www.youtube.com/watch?v=C-7JCg5fGho


Nice, it's working on mobile Android. The transcript letters could be a bit smaller, and the side margins too. An option to keep the video sticky while scrolling the transcript could also be useful.


Thanks for the feedback!


I was thinking about making something like this (with the same name!), but my take was going to be a bit more manual with the use case of getting the 10s answer out for a 10m clickbait video.


Transcribe this https://youtu.be/r7SO-Oq3d5E ;) Good work tho


Unfortunately for me, I tried with 4 videos and all got:

> The video is no longer available. If this is an unexpected error, please try again in a bit.


Sorry about that! Sometimes you need to refresh or wait a minute for it to work.


Works impressively! Nice work building it.


Hey - sorry about that! Can you try again? Sometimes there are errors that good old refreshing fixes!


It started working. Very neat!


If you would like to summarize, I suggest adding an on-demand word cloud generated from the transcript.


I built a similar thing but I used a NN to add punctuation to the output subtitles.


Is that open-source, or do you have a reference to one that might be? I've heard of punctuator, but keen to see what else people are using.


Amazing product! Isn't FB planning something like that?


Nice work


I have too many videos to watch, and this tool sounded like it could help. But it just shows the transcript, so this is not the TLDR. Maybe it's meant to be smarter, but it said "No Summary Generated". It didn't help me, as I already skim the transcript on YouTube (it's in the meatball menu next to "SAVE").


Obligatory side note of https://tldr.sh a fantastic tool about command line tools you might not even have installed!

Also a bit of dad sarcasm, shouldn't it be You-tldw?


Thank you!


This name makes no sense. If you're reading the transcript for a Youtube video instead of, y'know, watching the video, then you can't call the transcript a TLDR. At best, it's a TLDW, though it has accessibility applications for people who also intend to watch the video.



