
I'm using GPT-J (EleutherAI/gpt-j-6B) as a chatbot. As a prompt, I provide a sample conversation, as shown below. When a new conversation starts, I append the user's input to this sample conversation ("Hello, how are you doing?" in the example below).

Now, the problem is that the conversation is sometimes inconsistent: GPT-J tends to continue the sample conversation, and the new user input can break that flow.

How can this be solved?

This is a discussion between a Human and a Chatbot.

Human: Can you do push-ups?

Chatbot: Of course I can. It's a piece of cake! Believe it or not, I can do 30 push-ups a minute.

Human: Really? I think that's impossible!

Chatbot: You mean 30 push-ups?

Human: Yeah!

Chatbot: It's easy. If you do exercise everyday, you can make it, too.

Human: Hello, how are you doing?

Chatbot:


1 Answer


The solution is simply to leave out the pre-prompt conversation. The only pre-prompt you might want to experiment with is the line "This is a discussion between a Human and a Chatbot." See whether the model performs better or worse with or without it.
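As a rough illustration, here is a minimal sketch of that approach, assuming you run GPT-J through the Hugging Face transformers library; the `build_prompt` helper and the sampling parameters are just placeholders, not a prescribed setup. Only the one-line pre-prompt is kept, and the rest of the prompt is the real conversation so far.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Optional one-line pre-prompt; try with and without it.
preamble = "This is a discussion between a Human and a Chatbot.\n\n"
history = []  # list of ("Human", text) / ("Chatbot", text) turns from the real conversation

def build_prompt(history, user_input):
    # Append the new user turn and leave the prompt open at "Chatbot:" for the model to complete.
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"Human: {user_input}")
    lines.append("Chatbot:")
    return preamble + "\n".join(lines)

prompt = build_prompt(history, "Hello, how are you doing?")
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, i.e. the chatbot's reply.
reply = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(reply)
```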

The model was trained on natural speech (in text form), and it gets confused when you switch topics so abruptly. And rightfully so - wouldn't you be confused if you and your friend were talking about push-ups and they suddenly said "Hi, how are you?"

GPT-J shares the objective of the other GPT models: generate the next token in the sequence. It is a massive model, and as you can see, it is already quite a good chatbot right out of the gate. If you are unhappy with its current performance as a chatbot, you can try to fine-tune the model on datasets that match the kind of conversations you want, or write some helper code to clean up the conversation flow, as sketched below.
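One such helper could be a post-processing step that cuts the completion off at the next speaker marker, since GPT-J will often keep generating further "Human:"/"Chatbot:" turns on its own. The `extract_reply` function below is a hypothetical sketch of that idea, not part of any library.

```python
def extract_reply(generated_text):
    # Keep only the text up to the first hallucinated speaker marker.
    reply = generated_text
    for marker in ("\nHuman:", "\nChatbot:"):
        idx = reply.find(marker)
        if idx != -1:
            reply = reply[:idx]
    return reply.strip()

# Example: trims the extra turn the model invented.
raw = " I'm doing great, thanks for asking!\nHuman: What did you do today?"
print(extract_reply(raw))  # -> I'm doing great, thanks for asking!
```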
