On a side note, is there an open source RAG library that's not bound to a rising AI startup? I couldn't find one and I have a simple in-house implementation that I'd like to replace with something more people use.
You can have a look at Langroid[1], a multi-agent LLM framework from ex-CMU/UW-Madison researchers, in production use at companies (some have publicly endorsed us). RAG is just one of its features, and we have a clean, transparent implementation in a single file, intended for clarity and extensibility. It includes some state-of-the-art retrieval techniques and can easily be extended with others. In the DocChatAgent, the top-level method for RAG is answer_from_docs; here's the rough pseudocode:
answer_from_docs(query):
    extracts = get_relevant_extracts(query):
        passages = get_relevant_chunks(query):
            p1 = get_semantic_search_results(query)  # semantic/dense retrieval + learned sparse
            p2 = get_similar_chunks_bm25(query)      # lexical/sparse
            p3 = get_fuzzy_matches(query)            # lexical/sparse
            p = rerank(p1 + p2 + p3)  # rerank for lost-in-middle, diversity, relevance
            return p
        # use LLM to get verbatim relevant portions of passages, if any
        extracts = get_verbatim_extracts(passages)
        return extracts
    # use LLM to get final answer from query augmented with extracts
    return get_summary_answer(query, extracts)
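To make the multi-retriever step concrete: one common way to merge the semantic, BM25, and fuzzy result lists before a relevance/diversity rerank is reciprocal rank fusion (RRF). This is just an illustrative stdlib-only sketch of that idea, not Langroid's actual rerank code; the passage ids and rankings are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of passage ids into one ranking.

    Each retriever contributes 1/(k + rank) per passage; summing across
    retrievers rewards passages that rank well in more than one list.
    k=60 is the conventional damping constant from the RRF literature.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings from three hypothetical retrievers over passage ids
semantic = ["p3", "p1", "p4"]  # dense-embedding similarity order
bm25     = ["p1", "p3", "p2"]  # lexical/sparse order
fuzzy    = ["p2", "p1", "p5"]  # fuzzy string-match order

fused = reciprocal_rank_fusion([semantic, bm25, fuzzy])
print(fused)  # "p1" wins: it appears near the top of all three lists
```

The nice property here is that RRF needs only ranks, not scores, so it sidesteps the problem that cosine similarities and BM25 scores live on incomparable scales.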