
There are already solutions to this kind of problem: use embeddings to store semantic meaning, query the vector database with a question to get relevant context, then use a reader model (extractive Q/A or generative) to produce answers based on the context from the document.

Just check out the Haystack tutorials. I started looking into it after being introduced to the concept by articles mentioning OpenAI embeddings and the GPT-3 API, but it can all be done with open-source models.
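The pipeline described above can be sketched in a few lines of plain Python. This is a toy illustration, not Haystack's actual API: the bag-of-words "embedding" stands in for a real embedding model, and the `VectorStore` class is a hypothetical stand-in for a vector database.

```python
# Toy sketch of: embed documents -> store in a vector DB ->
# query with a question -> retrieve context for a reader model.
# All names here are illustrative, not a real library's API.
from collections import Counter
from math import sqrt

def embed(text):
    """Toy 'embedding': a term-frequency bag-of-words vector.
    A real pipeline would use OpenAI embeddings or a sentence-transformer."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Stand-in for a vector database such as Milvus or FAISS."""
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((embed(text), text))

    def query(self, question, top_k=1):
        q = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]

store = VectorStore()
store.add("Haystack is a framework for building search pipelines.")
store.add("Milvus is a vector database for similarity search.")

# In a real pipeline, the retrieved context would now be handed to a
# reader model (extractive Q/A or an LLM) to generate the final answer.
context = store.query("Which tool is a vector database?")
print(context[0])
```

The real systems differ mainly in scale and model quality: dense neural embeddings instead of word counts, approximate nearest-neighbor search instead of a linear scan, and a trained reader on top.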



Would like to bring up LangChain as well: https://langchain.readthedocs.io/en/latest/. We recently integrated Milvus (https://milvus.io) into LangChain, so you'll be able to store and process billions of documents.


I used Haystack because of the readily available Colab notebook[1] for their tutorials. I wanted to feed my own text corpus to it, and that was the fastest way available.

The LangChain docs are helpful, and it would be even better if you published an end-to-end notebook using a popular dataset. Definitely looking forward to trying LangChain as I dive deeper into this.

1. https://haystack.deepset.ai/blog/how-to-build-a-semantic-sea...


You can check out our library too, which does just that :)

https://github.com/jerpint/buster


Is there documentation for feeding your own documentation into it?


We will be adding that soon.


I use this method to answer questions about historic and real-time social media comments.

https://foretale.io/toolbox/Social_Media_QA



