What are the implications of having questions and answers in the sources when using the RAG pattern with Azure OpenAI?

Question

All,

We are using the RAG pattern with Azure OpenAI: we retrieve sources from a document and send them to OpenAI to get answers. We are using Azure Cognitive Search as the vector database.

We generate a JSON file from the original PDF. Our client also provided a list of expected questions and answers.

We indexed the PDF JSON along with the questions and answers.

In our testing we get the expected results, with the retrieved content coming from both the PDF JSON and the questions and answers.
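To make the setup concrete, here is a minimal, library-free sketch of indexing both kinds of content and checking where the retrieved sources come from. Everything in it is an assumption for illustration: the field names (`source_type`, `content`), the sample records, and the toy word-overlap scoring, which only stands in for the vector search in Azure Cognitive Search and is not our actual implementation.

```python
from collections import Counter

# Hypothetical sample data; the real records live in the search index.
PDF_CHUNKS = [
    "The warranty covers manufacturing defects for two years.",
    "Returns are accepted within thirty days of purchase.",
]
QA_PAIRS = [
    ("How long is the warranty?", "The warranty lasts two years."),
]


def build_records(pdf_chunks, qa_pairs):
    """Tag every record with its origin so results can be told apart later."""
    records = [{"source_type": "pdf", "content": c} for c in pdf_chunks]
    records += [
        {"source_type": "qa", "content": f"Q: {q} A: {a}"} for q, a in qa_pairs
    ]
    return records


def tokenize(text):
    """Lowercase the words and strip basic punctuation."""
    return {w.strip(".,?!:").lower() for w in text.split()} - {""}


def retrieve(query, records, k=2):
    """Toy retrieval: rank records by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(
        records, key=lambda r: len(q & tokenize(r["content"])), reverse=True
    )
    return ranked[:k]


def source_mix(results):
    """Count how many retrieved sources came from each origin."""
    return Counter(r["source_type"] for r in results)


records = build_records(PDF_CHUNKS, QA_PAIRS)
results = retrieve("How long is the warranty?", records, k=2)
print(source_mix(results))
```

In this toy example the Q&A record ranks first because it repeats the wording of the question. Whether the same thing happens with real embeddings would need to be checked against the actual index, but tagging each record with its origin makes that check straightforward.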

Questions

What are the implications of this approach?
Are we introducing any bias?
Is there a better way to do this?

Thanks -Nen

When using RAG with your own data, the answers will be based on the sources you provide. In that sense they can always be biased if your documents are biased. The idea of answering questions with your own data is to restrict the answers to that data rather than to what is on the internet. If your client is adding to the source files exactly the data they need the agent to answer from, then you are meeting their business need. I hope this helps. I am not posting this as an answer, since the question might require more clarification. - Gia Mondragon - MSFT, Aug 29 '23 at 16:56


0 Answers