
We agree that ingestion and extraction are a big part of the challenge in building high-quality RAG.

We've talked to a lot of different developers about these problems and haven't found a general consensus on what features are needed, so we are still evaluating advanced approaches.

For now, our implementation is more general and designed to work across a variety of documents. For the same reason, R2R was designed to make it very easy to override the defaults with your own custom parsing logic.

Lastly, we have been focusing a lot of our effort on knowledge graphs, since they provide an alternative way to enhance RAG systems. We are training our own model for triples extraction that will combine with the automatic knowledge graph construction in R2R. We are planning to release this in the coming weeks and are currently looking for beta testers (we made a signup form for anyone interested: https://forms.gle/g9x3pLpqx2kCPmcg6).




I'm actually curious what the common patterns for RAG have been. I see a lot of progress in tooling, but I have seen relatively few use cases or practical architectures documented.



