Anthropic found a similar result for retrieval: combining embeddings with BM25 keyword search (a variant of TF-IDF) produced significantly better results than embeddings alone.

https://www.anthropic.com/engineering/contextual-retrieval
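
The fusion step itself can be tiny. Here is a minimal sketch of reciprocal rank fusion, one common way to combine the two rankings (the chunk ids and the k=60 constant are illustrative, not taken from the article):

    def rrf(rankings, k=60):
        # rankings: lists of chunk ids, each ordered best-first
        scores = {}
        for ranking in rankings:
            for rank, chunk_id in enumerate(ranking):
                scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
        # highest fused score first
        return sorted(scores, key=scores.get, reverse=True)

    bm25_hits = ["c3", "c1", "c7"]        # ranking from keyword search
    vector_hits = ["c1", "c9", "c3"]      # ranking from embedding search
    print(rrf([bm25_hits, vector_hits]))  # c1 and c3 rise to the top

A chunk that appears high in both rankings beats one that dominates only a single ranking, which is why the hybrid tends to outperform either retriever on its own.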

They also found improvements from augmenting the chunks with Haiku, having it prepend a short summary that situates each chunk within the surrounding document.

That seems to benefit both the keyword search and the embeddings by acting as a form of keyword expansion. (Though it's unclear to me whether they tried plain keyword expansion and how that would fare.)
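
For concreteness, the augmentation step looks roughly like this. A sketch assuming the anthropic Python SDK; the prompt is my paraphrase of the idea, not Anthropic's exact prompt:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def contextualize(chunk: str, full_doc: str) -> str:
        prompt = (
            f"<document>\n{full_doc}\n</document>\n"
            f"Here is a chunk from the document:\n<chunk>\n{chunk}\n</chunk>\n"
            "Write a short context situating this chunk within the overall "
            "document, to improve search retrieval of the chunk. "
            "Answer with only the context."
        )
        msg = client.messages.create(
            model="claude-3-haiku-20240307",
            max_tokens=150,
            messages=[{"role": "user", "content": prompt}],
        )
        # the generated context is prepended before embedding/BM25 indexing
        return msg.content[0].text + "\n\n" + chunk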

---

Anyway, what stands out to me most here is what a Rube Goldberg machine it is: embeddings, keywords, fusion, contextual augmentation, reranking... each stage adding marginal gains.

But then the whole thing somehow works really well together (~1% failure rate on most benchmarks, worse for code retrieval).

I have to wonder how this would look if it weren't a bunch of existing solutions taped together but a fully integrated system.
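
(The reranking stage at the end is similarly small in code. A sketch using the sentence-transformers CrossEncoder API; the model name is just one public example, not the one from the article:

    from sentence_transformers import CrossEncoder

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

    def rerank(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
        # score each (query, chunk) pair jointly, then keep the best few
        scores = reranker.predict([(query, c) for c in chunks])
        ranked = sorted(zip(chunks, scores), key=lambda p: p[1], reverse=True)
        return [c for c, _ in ranked[:top_k]]

The cross-encoder reads the query and chunk together, so it can catch relevance signals the first-pass retrievers miss, at the cost of running a model over every candidate.)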





Thanks for sharing! I am working on a RAG engine, and that document provides great guidance.

And, agreed, each individual technique seems marginal, but they really add up. What seems to be missing is an automated layer that determines the best way to chunk documents before embedding them. My use case is mostly normalized, mostly technical documents, so I have a pretty clear idea of how to chunk them to preserve semantics. But I imagine it is a lot trickier for general documents.
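
For documents like that, the chunking can be as simple as splitting on section headings. A hypothetical sketch; the heading heuristic and size cap are illustrative, not a general recipe:

    def chunk_by_heading(text: str, max_chars: int = 2000) -> list[str]:
        # split on markdown-style headings so each section stays intact
        chunks, current = [], []
        for line in text.splitlines():
            if line.startswith("#") and current:
                chunks.append("\n".join(current))
                current = []
            current.append(line)
        if current:
            chunks.append("\n".join(current))
        # fall back to a hard size cap for oversized sections
        out = []
        for chunk in chunks:
            while len(chunk) > max_chars:
                out.append(chunk[:max_chars])
                chunk = chunk[max_chars:]
            out.append(chunk)
        return out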



