I'm a hobbyist in the field, but I concur and fell into that trap.
I wanted to index my photos with some CLIP, and my complexity mind said that I wanted a sub-linear database, because that's what databases are for.
I spent a bit of time on Spotify's Annoy for that. Then the first results were meh, and I was like "oh frack how do I know whether the issue is in the approximate NN or the CLIP model?".
Then I realized I was worried about linear complexity for a 5MB database, laughed, rewrote into a dumb for loop in a jiffy, and could conclude within the hour.
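For a sense of scale, here is a rough sketch of what such a "dumb for loop" could look like. It is not the code I actually wrote; the function name, the cosine-similarity scoring, and the 512-dimensional float32 embeddings are assumptions for illustration. At 2 KB per vector, 5 MB works out to roughly 2,500 embeddings, which a plain loop scans easily.

```python
import numpy as np

def brute_force_search(query, embeddings, top_k=5):
    """Exact nearest neighbours by cosine similarity, one row at a time.

    Hypothetical helper: `embeddings` is an (n, d) array, `query` is (d,).
    """
    q = query / np.linalg.norm(query)
    scored = []
    for idx, emb in enumerate(embeddings):
        # Cosine similarity between the query and this embedding.
        score = float(np.dot(q, emb / np.linalg.norm(emb)))
        scored.append((score, idx))
    # Highest similarity first; no index structure, no approximation.
    scored.sort(reverse=True)
    return [(idx, score) for score, idx in scored[:top_k]]

# Assumed sizes: ~2,500 vectors of 512 float32 values is about 5 MB.
rng = np.random.default_rng(0)
photos = rng.normal(size=(2500, 512)).astype(np.float32)
hits = brute_force_search(photos[3], photos, top_k=5)
```

Because the search is exact, any bad result has to come from the embeddings themselves, which is what made the debugging question go away.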
I think it is really underrated how much faster the numpy approach is for low-key workloads like personal images and videos, where you generally have a few thousand embeddings. It is also possible to hide some latency by sharding and streaming results as soon as a shard is done. We work on a similar project[0] to make it easy to index personal images and videos. Before using numpy, we also looked at libraries like Annoy to index the embeddings, but being approximate, such libraries would sometimes leave out the most similar image/frame. For personal data, we found it is better to depend on exact search rather than an approximate one, even if it comes with a speed tradeoff.
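A minimal sketch of the sharding idea, assuming cosine similarity over in-memory numpy arrays. The function name `search_shards`, the shard layout, and the sizes are hypothetical and not taken from our project; shards are scored one after another here, and the streaming comes from the generator handing back each shard's best matches as soon as that shard is done.

```python
import numpy as np

def search_shards(query, shards, top_k=3):
    """Yield (shard_id, global_indices, scores) as each shard is scored.

    Hypothetical helper: `shards` is a list of (n_i, d) arrays that
    together make up the full embedding matrix, in order.
    """
    q = query / np.linalg.norm(query)
    offset = 0
    for shard_id, shard in enumerate(shards):
        # Normalize rows so the matrix-vector product is cosine similarity.
        normed = shard / np.linalg.norm(shard, axis=1, keepdims=True)
        scores = normed @ q
        k = min(top_k, len(scores))
        best = np.argsort(-scores)[:k]  # exact top-k within this shard
        yield shard_id, best + offset, scores[best]
        offset += len(shard)

# Assumed data: 1,000 embeddings of dimension 64, split into 4 shards.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 64)).astype(np.float32)
shards = np.array_split(embeddings, 4)

merged = []
for shard_id, indices, scores in search_shards(embeddings[777], shards):
    # A UI could already display these partial results here.
    merged.extend(zip(scores.tolist(), indices.tolist()))
merged.sort(reverse=True)
```

Merging the per-shard top-k lists gives the same answer as one exact search over everything, so nothing is lost compared with the unsharded version.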
I've been playing with Hachi this afternoon, as it's a more developed version of something I've been hacking around with on and off for a while, and there's little point in duplicating the effort. Thanks for the release - first impressions are very positive!
There are a couple of things I'd like to discuss/suggest, and also FWIW I get "Face Recognition not available" when I try that feature.
Should I use GitHub, or is there a better way to contact you?
Hey, thanks for checking it out.
We also started working on it for personal reasons. Any suggestions or feedback are welcome; you can contact me directly at anubhav@ramanlabs.in.
Some features like face recognition and improved video search are available as premium features for now, but we are working on letting users preview those features through a demo.