I need to train GPT-3 on email data so that the support team can get quick answers from a chatbot to questions that customers have asked before. The data consists of email conversations between customers and the support team (Customer1 asks a question, Support answers, Customer1 asks another question, …). I have to:

1. Filter the important conversations and feed only those to GPT-3.
2. Prepare and convert them into the right format so that I can train the model.

Does anyone have ideas about how to implement these steps, and whether to use fine-tuning or embeddings?

GPT-3 has to connect the questions to the answers given by the support team.

mahdi

2 Answers

In many respects this is similar to how support teams prepare "knowledgebase" articles. There is no panacea for this: you have to decide what constitutes worthwhile knowledge, and filter the emails to locate practical examples of it.

For example, given this customer email:

From: customer@example.com
To: support@example.com
Subject: Problem with account

Hi,

I'm having trouble accessing my account. Can you help?

Thanks,
Customer

This response is likely to be considered unimportant, because it only asks for more details and gives no reusable answer:

From: support@example.com
To: customer@example.com
Subject: Re: Problem with account

Hello Customer,

I'm sorry to hear that you're having trouble accessing your account. Can you provide more details about the issue you're experiencing?

Regards,
Support Team

But a different response, such as this one, may be worthwhile:

From: support@example.com
To: customer@example.com
Subject: Re: Problem with account

Hello Customer,

Sorry to hear that you're having trouble accessing your account. Here are some steps you can try to resolve the issue:

1. Make sure you're using the correct username and password. (Check that your caps lock is OFF.)
2. Clear your browser's cache and cookies, and re-try.
3. Try resetting your password.

If you continue to have trouble, please don't hesitate to contact us for further assistance.

Regards,
Support Team

So you must decide which question/answer sets found in the emails constitute worthwhile learning, and then format each one into an input-output pair for training:

Input: I'm having trouble accessing my account. Can you help?
Output: Sorry to hear that you're having trouble accessing your account. Here are some steps you can try to resolve the issue: 1. Make sure you're using the correct username and password. (Check that your caps lock is OFF.) 2. Clear your browser's cache and cookies, and re-try. 3. Try resetting your password. If you continue to have trouble, please don't hesitate to contact us for further assistance.
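As a rough sketch of that formatting step, the snippet below serializes question/answer pairs as JSONL lines with `prompt` and `completion` keys, which is the shape the legacy GPT-3 fine-tuning guide describes. The separator and stop strings, and the helper names `to_jsonl_record` and `write_jsonl`, are assumptions chosen for illustration; check the current documentation for the format your model expects.

```python
import json

# Assumed markers: choose strings that never occur inside your emails.
PROMPT_SEPARATOR = "\n\n###\n\n"
COMPLETION_STOP = " END"


def to_jsonl_record(question: str, answer: str) -> str:
    """Serialize one question/answer pair as a single JSONL line."""
    record = {
        # The prompt ends with a fixed separator so the model knows where it stops.
        "prompt": question.strip() + PROMPT_SEPARATOR,
        # The completion starts with a space and ends with a fixed stop marker.
        "completion": " " + answer.strip() + COMPLETION_STOP,
    }
    # json.dumps escapes embedded newlines, so each record stays on one line.
    return json.dumps(record, ensure_ascii=False)


def write_jsonl(pairs, path):
    """Write an iterable of (question, answer) tuples to a JSONL file."""
    with open(path, "w", encoding="utf-8") as handle:
        for question, answer in pairs:
            handle.write(to_jsonl_record(question, answer) + "\n")


if __name__ == "__main__":
    example = to_jsonl_record(
        "I'm having trouble accessing my account. Can you help?",
        "Sorry to hear that. 1. Check your username and password. "
        "2. Clear your browser's cache and cookies. 3. Try resetting your password.",
    )
    print(example)
```

If you use a stop marker like this, the same string would also be passed as the stop sequence when querying the fine-tuned model.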

Note that this will also require stripping the emails of all irrelevant information, such as email headers, signatures, URLs, marketing, etc.
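A minimal sketch of that cleanup, using only Python's standard library: the `email` module drops the headers, and a few regular expressions remove quoted replies, URLs, greetings, and everything from the sign-off onward. The greeting and sign-off patterns are assumptions based on the example emails above; real mailboxes will need a longer list, plus handling for HTML and encoded bodies.

```python
import re
from email import message_from_string

URL_RE = re.compile(r"https?://\S+")
# Assumed greeting and sign-off patterns; extend these for your own data.
GREETING_RE = re.compile(r"(hi|hello|dear)\b[\w\s]*[,!]?", re.IGNORECASE)
SIGNOFF_RE = re.compile(r"(--|regards,?|kind regards,?|thanks,?|best,?)", re.IGNORECASE)


def extract_body(raw_email: str) -> str:
    """Return the first text/plain payload, leaving the headers behind."""
    message = message_from_string(raw_email)
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload()
            if isinstance(payload, str):
                return payload
    return ""


def clean_body(body: str) -> str:
    """Drop quoted replies, greeting, URLs, and the signature block."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">"):
            continue  # quoted text from an earlier message
        if not any(kept) and GREETING_RE.fullmatch(stripped):
            continue  # opening greeting such as "Hi,"
        if SIGNOFF_RE.fullmatch(stripped):
            break  # treat everything from the sign-off onward as signature
        kept.append(URL_RE.sub("", stripped))
    # Join the surviving lines and collapse repeated whitespace.
    return re.sub(r"\s+", " ", " ".join(kept)).strip()


if __name__ == "__main__":
    raw = (
        "From: customer@example.com\n"
        "To: support@example.com\n"
        "Subject: Problem with account\n"
        "\n"
        "Hi,\n\n"
        "I'm having trouble accessing my account. Can you help?\n\n"
        "Thanks,\nCustomer\n"
    )
    print(clean_body(extract_body(raw)))
```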

See: https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset

Paul Maxwell

I would first use a fine-tuned model to classify each email by whether it is a basic question that AI can answer or one that requires a human response.

Most likely, you will still want to route certain urgent or complex requests (from high-value customers, perhaps) directly to a real person, and try to have your AI bot handle the simpler ones.

Then I would use embeddings to answer the basic questions from your index of support information.
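To illustrate the retrieval idea, here is a small self-contained sketch: each stored question is turned into a vector, the incoming question is compared against them by cosine similarity, and the closest answer is returned if it clears a threshold. Note that `toy_embed` is only a bag-of-words stand-in so the example runs without any API; in practice you would replace it with vectors from a real embedding model. The function names and the `min_score` value are assumptions for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(question, knowledge_base, min_score=0.3):
    """Return the stored answer closest to the question, or None.

    knowledge_base is a list of (stored_question, answer) tuples.
    None means no entry was similar enough, so route to a human.
    """
    if not knowledge_base:
        return None
    query = toy_embed(question)
    scored = [
        (cosine(query, toy_embed(stored_question)), answer)
        for stored_question, answer in knowledge_base
    ]
    score, answer = max(scored, key=lambda item: item[0])
    return answer if score >= min_score else None


if __name__ == "__main__":
    kb = [
        ("I'm having trouble accessing my account", "Try resetting your password."),
        ("How do I change my billing address?", "Go to Settings > Billing."),
    ]
    print(best_match("trouble accessing my account", kb))
    print(best_match("What is the weather today?", kb))
```

Returning `None` below the threshold gives a natural hook for handing the email to a real person instead of answering automatically.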

Mark H