There are already solutions to this kind of problem: embed your documents to capture their semantic meaning -> query the vector database with a question -> retrieve the most relevant context -> use an extractive Q&A "Reader" model to pull the answer out of that context.
Just check out the Haystack tutorials. I started looking into this after being introduced to the concept by articles mentioning OpenAI embeddings and the GPT-3 API, but it can all be done with open source models.
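Roughly, that pipeline in Haystack 1.x looks like the sketch below (a minimal example, not production code; the models are the defaults the tutorials use, and the documents and top_k values are just illustrative):

    # pip install farm-haystack  (Haystack 1.x)
    from haystack.document_stores import InMemoryDocumentStore
    from haystack.nodes import EmbeddingRetriever, FARMReader
    from haystack.pipelines import ExtractiveQAPipeline

    # In-memory store for demo purposes; swap in FAISS/Elasticsearch for a real corpus.
    document_store = InMemoryDocumentStore(embedding_dim=768)
    document_store.write_documents([
        {"content": "Haystack is an open source framework for building search and QA systems."},
        {"content": "Extractive QA models pull answer spans straight out of retrieved text."},
    ])

    # Retriever: embeds the question and finds the closest documents in the store.
    retriever = EmbeddingRetriever(
        document_store=document_store,
        embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    )
    document_store.update_embeddings(retriever)

    # Reader: an extractive QA model that pulls the answer span out of the retrieved context.
    reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

    pipeline = ExtractiveQAPipeline(reader, retriever)
    result = pipeline.run(
        query="What do extractive QA models do?",
        params={"Retriever": {"top_k": 5}, "Reader": {"top_k": 1}},
    )
    print(result["answers"][0].answer)

Everything above runs locally on open source models; no OpenAI key needed.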
I used Haystack because of the readily available Colab notebook[1] for their tutorials. I wanted to feed my own text corpus into it, and that was the fastest way available.
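For feeding in your own corpus, Haystack 1.x also ships file converters and a preprocessor; continuing from the sketch above (the data/ directory is a placeholder for your own .txt files):

    from haystack.nodes import PreProcessor
    from haystack.utils import convert_files_to_docs

    # Load every text file in the (placeholder) data/ directory into Document objects.
    docs = convert_files_to_docs(dir_path="data/")

    # Split long files into ~200-word passages so the retriever gets usefully sized chunks.
    preprocessor = PreProcessor(split_by="word", split_length=200, split_overlap=20)
    docs = preprocessor.process(docs)

    # Reuses document_store and retriever from the pipeline sketch above.
    document_store.write_documents(docs)
    document_store.update_embeddings(retriever)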
The LangChain docs are helpful, and it would be even better if you published an end-to-end notebook using a popular dataset. Definitely looking forward to trying LangChain as I dive deeper into this.
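In the meantime, the equivalent flow in (pre-1.0) LangChain might look something like this; the package layout has changed a lot between releases, so treat the imports and class names as assumptions tied to the old 0.0.x API:

    # pip install langchain faiss-cpu sentence-transformers openai  (old 0.0.x API)
    from langchain.chains import RetrievalQA
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.llms import OpenAI
    from langchain.vectorstores import FAISS

    texts = [
        "LangChain chains LLM calls together with retrievers and other tools.",
        "A vector store holds document embeddings for similarity search.",
    ]

    # Open source embeddings + a FAISS index; the LLM here uses the OpenAI API
    # (set OPENAI_API_KEY), but any LLM wrapper can be swapped in.
    vectorstore = FAISS.from_texts(texts, HuggingFaceEmbeddings())
    qa = RetrievalQA.from_chain_type(
        llm=OpenAI(),
        chain_type="stuff",  # stuff all retrieved docs into one prompt
        retriever=vectorstore.as_retriever(),
    )
    print(qa.run("What does a vector store hold?"))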