Hacker News
30 years of family videos in an AI archive (blog.google)
50 points by yarapavan on July 28, 2020 | 14 comments



As someone literally doing the same thing right now, I feel like the author is missing the point. Perhaps she's too young for the full impact of these tapes to hit.

Yes, there are some pretty cool things she described. Finding significant scene transitions, doing audio transcription, etc. can help enhance the project. But I take issue with this statement: "You can only watch old family friends open Christmas gifts so many times"

I've now digitized probably over 50 VHS tapes with more to go. What's particularly striking to me is how old cameras were used. They were used for long continuous captures because they weren't "on you," were ergonomically suited to it, came with no expectation of sharing/editing, and were such a hassle to pull out that you really had to want to use them. That means that unlike today, you might get 30 minutes straight of that "opening of Christmas gifts" and the banal conversation that follows. It really is a snapshot in time whose length lets you fully appreciate the era and the people, along with their essence and personalities.

I'm guessing that most people in the author's life are still alive. She's young enough to not be as sentimental. But as you get older and many important people pass, watching the entirety of this archive is of great value. It's almost like the people in them are still alive by capturing the plainness of the moments instead of just the exceptional.

I wonder if she digitized all those hours at once, or had a service do it for her. If so, then receiving such a large chunk might feel daunting. I've written my own software to automate this with a high end VHS player, but the captures still occur in real-time. It's been my quarantine project. And as such, I've watched far more of this archive than I originally ever thought I would because of the forced serial nature. Each tape brings its own bag of wonderful surprises. And my family couldn't be any more appreciative... everyone watches nearly every video as I send them out, and then shares comments back by email. It's been an amazing gift to my entire family from the past.

I wonder what she plans to use all that machine learning for in the end. Is she really going to be searching for such moments on a frequent basis? I'm guessing not. To me it's not a search problem -- it's a long discovery process about how your own life has unfolded and changed over time.


"You can only watch old family friends open Christmas gifts so many times"

It is unbelievably sad and tragic how many people don't realize how untrue this is until it's too late.


I've written my own software to automate this with a high end VHS player, but the captures still occur in real-time.

This sounds very interesting - what VHS player are you using, and what is your workflow?

If you don't have a writeup somewhere, would you consider publishing one? I think many people would find it useful and interesting.


Panasonic AG-5710. It has an RS232 port meant to be driven by a hardware controller for professional video editing. I managed to acquire the manual for it, which documents the protocol it speaks -- something that would never happen today, sadly. I wrote a driver in Python for it that I'll probably open source at some point.

From there I wrote a Mac app to drive the VCR, preview the output (useful for inspecting unlabeled tapes before capture), perform the capture, detect end-of-material to then stop playback & eject the tape, and fully automate the post-processing. It's been several months in the making, with tweaks here and there. Next up is automating the generation of a webpage with the library, and maybe even generating the emails I send out (although I'm not sure I can easily interface with Dropbox to get the public link that I send).

I originally went down the Linux/Windows route with different capture hardware, but both proved to be horrible development experiences, with unstable drivers that would crash my system, make audio stop working, or cause some other miscellaneous problem. It's too bad, because I have a much fancier video card in my Linux box than my 2012 MBP has for hardware-accelerated encoding. That said, doing the extra trimming/splitting operations in QuickTime Player has been quite easy in a way that I don't think Linux/Windows would have matched, so I'm happy with my solution.

I probably will do a write-up at some point (and would love to open source it), but I might have to clear this with my employer first.
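
For readers wondering what the "detect end-of-material" step could look like, here is a minimal sketch in Python. It is not the commenter's code, which has not been published. It assumes the capture pipeline already produces one brightness score per frame, and that a blank tape shows up as a long run of low scores; the function name, the threshold, and the 10-second window are all hypothetical choices for illustration. A real tape may show snow or a blue screen instead of black, so the score itself would need adapting.

```python
def find_end_of_material(luma, fps=30.0, threshold=16.0, min_blank_seconds=10.0):
    """Return the index of the first frame of a sustained blank run, or None.

    luma: iterable of per-frame brightness scores (hypothetical input).
    A frame counts as blank when its score is below `threshold`.
    The run must last at least `min_blank_seconds` to count as end-of-material,
    so short fades between scenes are ignored.
    """
    needed = max(1, int(fps * min_blank_seconds))
    run_start = None
    for i, value in enumerate(luma):
        if value < threshold:
            if run_start is None:
                run_start = i  # a new blank run begins here
            if i - run_start + 1 >= needed:
                return run_start
        else:
            run_start = None  # signal came back, reset the run
    return None
```

In a capture loop, the returned index would be the point at which to stop playback, eject the tape, and trim the file.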


Thanks for the info! Yes, a writeup and open-sourcing the Python driver you wrote would be excellent. We've got tons of tapes still unconverted, and as another commenter mentioned, it's better to do it sooner rather than later.

I'm on a 2012 MBP as well, and also have a powerful Linux desktop for ML work, so I definitely sympathise with the divide in hardware. Interesting that you're using QuickTime - is ffmpeg not suitable because you need a GUI for the trimming/splitting?


I use ffmpeg to do some automatic trimming and to force encode the video as top-field encoded. My capture hardware lies and says the video is deinterlaced when it is not. I’ve tried deinterlacing with ffmpeg, but found the best compromise of quality, effort, and speed using Apple’s Compressor and its deinterlacing algorithm. I don’t know why I’m unable to override the capture setting in AVFoundation; maybe someone smarter than me can tell me how to do this (I suspect it’s not possible).

I use QuickTime to go through the material and do the splits / manual trimming. This happens often as one tape was used for multiple events and sometimes TV content as well.
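
As a rough illustration of the ffmpeg step described above, the sketch below builds a command line that trims a capture and marks its frames as top field first using ffmpeg's `setfield` filter. This is a guess at one way to do it, not the commenter's actual invocation; the function name and file names are hypothetical, and whether the field flag survives into the output depends on the encoder chosen.

```python
def build_ffmpeg_args(src, dst, start=None, end=None, field_order="tff"):
    """Build an ffmpeg argument list (hypothetical helper, nothing is executed).

    start/end: optional trim points as ffmpeg time strings, e.g. "00:00:05".
    field_order: value for the setfield filter; "tff" means top field first.
    """
    args = ["ffmpeg", "-i", src]
    if start is not None:
        args += ["-ss", start]  # output option: drop everything before this point
    if end is not None:
        args += ["-to", end]    # stop writing at this position
    # Mark frames with the chosen field order; audio is passed through unchanged.
    args += ["-vf", f"setfield={field_order}", "-c:a", "copy", dst]
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_ffmpeg_args("tape01.mov", "tape01_tff.mov",
                                     start="00:00:05", end="00:42:10")))
```

The list can be handed to `subprocess.run` once the arguments look right for the capture in question.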


Hi, I'm pretty interested in what you're doing here and have built similar tools for other domains. Ping me to talk offline? --> cinjon@nyu.edu.


Any recommendations for a good VCR to digitize footage? I have a VTG Sony SLV-778HF lying around that I'm planning to use.


Panasonic AG-5710. Anything with a time base corrector (TBC), or the ability to drive one (although external TBCs are expensive). I don't know much about that Sony model, but the TBC is really key in getting good captures.


That was very cool. My Dad, who will be 99 in a month, had a video camera as a teenager (his father was a minister in Iowa, and there was basically no money for things unless my Dad worked for it - as a teenager he made nitroglycerine and blew up tree stumps for farmers; a major profit center for him that paid for a photography hobby).

About 7 years ago my Dad digitized all the video he had, including childhood material of his family in Iowa, my brother and I, family friends, etc. He distilled this all down to two 90 minute videos that my Mom narrated. A huge effort but much appreciated.

I have always appreciated search on Google Photos. My daughter and son-in-law lost all their digital photo assets to a crashed computer and unusable backups. It took me about 10 minutes with my cellphone to restore a lot of pictures for them by searching Google Photos for just the first name of each of our grandkids, then my daughter, then my son-in-law, then the name of their old dog. That created 5 photo albums that I shared with them and that they downloaded to multiple devices, etc.

Whenever I take a picture, as soon as I am home and on wifi, my cellphone uploads new pictures to Google, Apple, and Microsoft OneDrive photos. I only rarely back up everything to local USB drives.

I would like a custom system like Dale wrote about. I have almost 7 years of deep learning experience, and I am now retired so there is no real excuse except that I have a silly project RecipeGAN where I am trying to generate recipes and someone who used to work for me challenged me to also generate pictures in addition to an ingredient list.

It is an amazing world when people like the article author (Dale) and I can do stuff like this as a hobby. Awesome, really.


The post links to the author's technical write-up, which is a little less of a simple ad for Google Vision: https://daleonai.com/building-an-ai-powered-searchable-video...


Oh my GOD, this is incredible. This NEEDS a GUI wrapper. Even just being able to search a keyword ("wedding") and getting a list of locations ("File 22, 0:52:36 - 1:36:22") to guide a search would be so huge. Depending on the situation, I'm not sure there's a price I WOULDN'T pay to let my parents do this.

I guess I found my next project. Wow. THIS is valuable.
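
The core of that keyword-to-locations idea can be sketched in a few lines of Python, before any GUI exists. This assumes the transcripts have already been split into timed segments per file; the `Segment` type, its fields, and the sample data are hypothetical and not taken from the author's write-up.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    file_id: int   # which digitized file the segment came from
    start_s: int   # segment start, in seconds from the beginning of the file
    end_s: int     # segment end, in seconds
    text: str      # transcript or label text for the segment


def fmt(seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def search(segments, keyword):
    """Return one location string per segment whose text contains the keyword."""
    needle = keyword.lower()
    return [
        f"File {s.file_id}, {fmt(s.start_s)} - {fmt(s.end_s)}"
        for s in segments
        if needle in s.text.lower()
    ]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    demo = [
        Segment(22, 3156, 5782, "Wedding reception and speeches"),
        Segment(7, 0, 1800, "Opening Christmas gifts"),
    ]
    print(search(demo, "wedding"))
```

A plain case-insensitive substring match is the simplest possible choice here; a real tool would likely want fuzzy matching or a proper search index.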



Really cute. Makes me a little misty knowing that my little ones are going to grow up so fast. They're younger than the girl in this video. Life is so short.



