An alternative to using old-school NLP is to use GPT itself for the first pipeline stage as well, with a prompt like: "I have the following resources with data. power_troubleshooting.txt contains information for customers that have issues powering on the device" (and so forth in the next lines, for the other resources), "This is the user question: ..., please reply with the name of the resource I should access."
Then you fetch that file and create a second prompt: "Based on the following information: ..., answer this question: ..."
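The two-prompt flow above can be sketched roughly like this. The `ask_gpt` callable is a placeholder for whatever chat-completion call you use, and the resource names, descriptions, and prompt wording are illustrative assumptions, not a fixed API:

```python
# Sketch of the two-prompt routing pipeline: prompt 1 picks a resource,
# prompt 2 answers the question using that resource's contents.
# RESOURCES and the prompt wording are assumptions for illustration.

RESOURCES = {
    "power_troubleshooting.txt": "information for customers that have "
                                 "issues powering on the device",
    "network_setup.txt": "instructions for connecting the device to Wi-Fi",
}

def build_routing_prompt(question: str) -> str:
    # Prompt 1: describe each resource, then ask which one to access.
    lines = ["I have the following resources with data."]
    for name, desc in RESOURCES.items():
        lines.append(f"{name} contains {desc}.")
    lines.append(f"This is the user question: {question}")
    lines.append("Please reply with only the name of the resource I should access.")
    return "\n".join(lines)

def build_answer_prompt(resource_text: str, question: str) -> str:
    # Prompt 2: answer the question from the fetched resource.
    return (f"Based on the following information:\n{resource_text}\n\n"
            f"answer this question: {question}")

def answer(question: str, ask_gpt) -> str:
    # ask_gpt(prompt) -> str is assumed to wrap your GPT API call.
    resource = ask_gpt(build_routing_prompt(question)).strip()
    with open(resource) as f:  # in practice, validate the returned name first
        text = f.read()
    return ask_gpt(build_answer_prompt(text, question))
```

Note the comment about validating the model's reply before opening a file: the routing answer is untrusted output, so it should be checked against the known resource names.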
A slower but more powerful approach is to show GPT different parts of the potentially relevant text (for instance, three at a time), ask it to score each from 0 to 10 on how useful it would be for answering the question, and have it select which resource to use. But this requires a lot of back and forth.
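The scoring variant could look something like the sketch below: show chunks in batches of three, ask for a 0-10 usefulness score per chunk, and keep the highest-scoring one. Again `ask_gpt` is a placeholder, and the scoring-prompt wording and the line-based reply format are assumptions:

```python
import re

# Sketch of the batched-scoring approach: GPT rates each chunk 0-10 for
# usefulness, and we select the best-scoring chunk across all batches.

def build_scoring_prompt(chunks, question):
    parts = [f"Score each text from 0 to 10 for how useful it is "
             f"to answer this question: {question}"]
    for i, chunk in enumerate(chunks, 1):
        parts.append(f"Text {i}:\n{chunk}")
    parts.append("Reply with one line per text, like 'Text 1: 7'.")
    return "\n\n".join(parts)

def pick_best_chunk(chunks, question, ask_gpt, batch_size=3):
    best_score, best_chunk = -1, None
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        reply = ask_gpt(build_scoring_prompt(batch, question))
        # Parse lines like "Text 2: 9"; ignore anything malformed.
        for idx, score in re.findall(r"Text (\d+):\s*(\d+)", reply):
            idx, score = int(idx) - 1, int(score)
            if 0 <= idx < len(batch) and score > best_score:
                best_score, best_chunk = score, batch[idx]
    return best_chunk
```

Each batch is one round trip, which is where the "lot of back and forth" cost comes from: for N chunks you pay roughly N/3 model calls before you can even build the final answer prompt.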
Just be aware that your pipeline prompt should not contain any secrets, and you should expect that users will be able to subvert it! I think the most popular name for these attacks is currently 'prompt injection'.