
"What were the UK's top exports in 2023?"

"List all YC founders that worked at Google and now have an AI startup."

How to check the accuracy of the answers? Is there some kind of a detailed trace of how the answer was generated?




Great question. I can talk about how we handle the more challenging one: "List all YC founders that worked at Google and now have an AI startup."

For this we have a target dataset (the YC company directory) and around 100 questions written over it. We have found that when we feed in an entire company listing along with a single question, we get an accurate single answer (the needle-in-a-haystack problem).

So to build our evaluation dataset we feed each question with each sample into the cheapest LLM we can find that reliably handles the job. We then aggregate the results.
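
A minimal sketch of that loop, to make the shape concrete. It assumes each call returns a yes/no answer for one listing, and `ask_llm`, `keyword_stub`, and the sample listings are hypothetical stand-ins, not our actual code or data:

```python
from typing import Callable, Dict, List


def build_ground_truth(
    questions: List[str],
    listings: Dict[str, str],
    ask_llm: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """For each question, collect the companies whose listing satisfies it."""
    truth: Dict[str, List[str]] = {}
    for question in questions:
        matches = []
        for name, listing in listings.items():
            # One listing and one question per call keeps each prompt small.
            if ask_llm(question, listing):
                matches.append(name)
        # Aggregate the per-listing answers into one expected answer set.
        truth[question] = sorted(matches)
    return truth


def keyword_stub(question: str, listing: str) -> bool:
    """Hypothetical stand-in for a real LLM call (ignores the question)."""
    text = listing.lower()
    return "google" in text and "ai" in text.split()


# Made-up listings, for illustration only.
sample_listings = {
    "Acme": "Founder worked at Google. Builds AI agents.",
    "Beta": "Founder worked at Stripe. Builds AI tooling.",
    "Cafe": "Founder worked at Google. Sells coffee.",
}
question = "List all YC founders that worked at Google and now have an AI startup."
ground_truth = build_ground_truth([question], sample_listings, keyword_stub)
```

In practice `ask_llm` would wrap whichever cheap model handles the job reliably; the stub only exists so the sketch runs on its own.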

This is not perfect but it allows us to have a way to benchmark our knowledge graph construction and querying strategy so that we can tune the system ourselves.
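
One way such a benchmark could be scored, as a rough sketch: compare the set of names the system returns against the aggregated expected set. Precision and recall are an assumption here, as is the `precision_recall` helper; the company names are made up.

```python
from typing import Iterable, Tuple


def precision_recall(
    predicted: Iterable[str], expected: Iterable[str]
) -> Tuple[float, float]:
    """Score one answer set against the expected set for the same question."""
    pred, exp = set(predicted), set(expected)
    hits = len(pred & exp)
    # Empty sets score 0.0 here; that convention is a choice, not a rule.
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(exp) if exp else 0.0
    return precision, recall


# The system returned two companies; the expected set has three.
p, r = precision_recall(["Acme", "Beta"], ["Acme", "Cafe", "Delta"])
```

Averaging these per-question scores over the full question set gives a single number to track while tuning graph construction and querying.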


OK, so you have a way to evaluate the accuracy and convince yourself that it probably works as expected. But what about me, a user? How can I check that the question I asked was answered correctly?


I think there's no substitute for doing your own research and comparing the results.


I just want to avoid putting one black box on top of another if possible.



