
Any recommendations for an embedded embedding database (heh)? Embedded, as in sqlite. For smaller-scale problems, but hopefully more convenient than say, LMDB + FAISS.



You can take a look at txtai (https://github.com/neuml/txtai). It runs inside a Python process and supports storing content in SQLite and embedding vectors in local vector index formats (Faiss, HNSW, Annoy).

Disclaimer: I'm the primary author of txtai.


I actually just finished a POC using DuckDB that does similarity search for HN comments.

https://github.com/patricktrainer/hackernews-comment-search


FWIW, Simon Willison's `llm` tool just uses SQLite plus a few UDFs. The simplicity of that approach is appealing to me but I don't have a good sense of when+why it becomes insufficient.
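To make the "SQLite plus a few UDFs" idea concrete, here is a minimal sketch using only the Python standard library. It is not taken from `llm` and may differ from how that tool does it; the table name `docs`, the float32 blob encoding, and the helpers `pack`, `unpack`, `cosine`, and `search` are all assumptions made for illustration, and the three-dimensional vectors stand in for real embeddings.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Encode a list of floats as a float32 blob (assumed storage format)."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob):
    """Decode a float32 blob back into a tuple of floats."""
    return struct.unpack(f"{len(blob) // 4}f", blob)


def cosine(a_blob, b_blob):
    """Cosine similarity between two packed vectors; 0.0 if either is all zeros."""
    a, b = unpack(a_blob), unpack(b_blob)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


conn = sqlite3.connect(":memory:")
# Register the Python function as a two-argument SQL function.
conn.create_function("cosine", 2, cosine)
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, emb BLOB)")

# Toy vectors standing in for real embeddings.
rows = [
    ("cats purr", [1.0, 0.0, 0.0]),
    ("dogs bark", [0.0, 1.0, 0.0]),
    ("kittens meow", [0.9, 0.1, 0.0]),
]
conn.executemany(
    "INSERT INTO docs (body, emb) VALUES (?, ?)",
    [(body, pack(vec)) for body, vec in rows],
)


def search(query_vec, k=2):
    """Return the k most similar rows by scanning the whole table."""
    return conn.execute(
        "SELECT body, cosine(emb, ?) AS score FROM docs "
        "ORDER BY score DESC LIMIT ?",
        (pack(query_vec), k),
    ).fetchall()


results = search([1.0, 0.0, 0.0])
```

Every query here calls the UDF once per row, so the cost grows linearly with the table size. That full scan is the part an approximate index (Faiss, HNSW, and similar) would replace once the table gets large.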


Thanks, I'll check it out.


For Python I believe Chroma [1] can be used embedded.

For Go I recently started building chromem-go, inspired by the Chroma interface: https://github.com/philippgille/chromem-go

It's neither advanced nor for scale yet, but the RAG demo works.

[1] https://github.com/chroma-core/chroma



awesome, how did I not find this :D


The qdrant clients support a local mode where you point to a file[0].

[0] -- https://github.com/qdrant/qdrant-client#local-mode


I've used usearch successfully for a small project: https://github.com/unum-cloud/usearch/


There is also a "Qdrant as a library" project: https://github.com/tyrchen/qdrant-lib



