I'm a hobbyist in the field, but I concur and fell into that trap. I wanted to i...

warangal · on April 13, 2023

I think it is really underrated how much faster the numpy approach is for low-key workloads like personal images and videos, where you would generally have a few thousand embeddings. It is also possible to hide some latency by sharding and streaming results as soon as a shard is done. We work on a similar project[0] to make it easy to index personal images and videos. Before using numpy, we also looked at libraries like Annoy to index the embeddings, but being approximate, such libraries would sometimes leave out the most similar image/frame. For personal data, we found it is better to depend on exact search rather than an approximate one, even if it comes with a speed tradeoff.
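To make the sharding idea concrete, here is a rough sketch of exact cosine-similarity search over shards with plain numpy, yielding each shard's best matches as soon as that shard is scored. This is only an illustration and not the code from the project: the function names (`normalize`, `search_shards`, `exact_top_k`), the shard size, and the embedding shapes are all assumptions made for the example.

```python
import numpy as np


def normalize(x):
    """L2-normalize along the last axis so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def search_shards(query, embeddings, shard_size=1024, top_k=5):
    """Yield (global_indices, scores) per shard, best first, as each shard finishes.

    Hypothetical helper: `embeddings` is an (N, D) array, `query` a (D,) vector.
    """
    q = normalize(query)
    for start in range(0, len(embeddings), shard_size):
        shard = normalize(embeddings[start:start + shard_size])
        scores = shard @ q  # exact cosine similarity for every row in the shard
        k = min(top_k, len(scores))
        # argpartition finds the k best rows without sorting the whole shard
        top = np.argpartition(-scores, k - 1)[:k]
        top = top[np.argsort(-scores[top])]
        yield start + top, scores[top]


def exact_top_k(query, embeddings, shard_size=1024, top_k=5):
    """Merge the per-shard results into one exact global top-k."""
    idx_parts, score_parts = [], []
    for idx, scores in search_shards(query, embeddings, shard_size, top_k):
        idx_parts.append(idx)
        score_parts.append(scores)
    if not idx_parts:  # no embeddings at all
        return np.array([], dtype=int), np.array([])
    idx = np.concatenate(idx_parts)
    scores = np.concatenate(score_parts)
    order = np.argsort(-scores)[:top_k]
    return idx[order], scores[order]
```

A UI could iterate over `search_shards(...)` directly and show partial results while later shards are still being scored, whereas `exact_top_k(...)` waits for every shard and returns the merged answer. Since every embedding is scored, nothing gets skipped the way it can with an approximate index.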

[0]: https://github.com/ramanlabs-in/hachi

mft_ · on April 13, 2023

I've been playing with Hachi this afternoon, as it's a more developed version of something I've been hacking around with on and off for a while, and there's little point in duplicating the effort. Thanks for the release - first impressions are very positive!

There are a couple of things I'd like to discuss/suggest, and also FWIW I get "Face Recognition not available" when I try that feature.

Should I use Github, or is there a better way to contact you?

warangal · on April 13, 2023

Hey, thanks for checking it out. We also started working on it for personal reasons. Any suggestions or feedback are welcome; you can contact me directly at anubhav@ramanlabs.in.

Some features like face recognition and improved video search are available as premium features for now, but we are working on letting users preview those features through a demo.

disqard · on April 13, 2023

Thank you for building and sharing this!

Have you tested this on a large photo/video collection? How does it perform with, say, 20k images?

Excited to try this out for my own needs, as long as it can handle real-world usage.