
I am experimenting with OpenAI's GPT API and am learning how to use the gpt-3.5-turbo model. I found a quickstart example on the web:

import json
import requests

API_KEY = "sk-..."  # placeholder; your OpenAI API key goes here
API_ENDPOINT = "https://api.openai.com/v1/chat/completions"

def generate_chat_completion(messages, model="gpt-3.5-turbo", temperature=1, max_tokens=None):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    }

    data = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }

    # Only send max_tokens when the caller supplies one; the default of
    # None leaves the limit up to the API.
    if max_tokens is not None:
        data["max_tokens"] = max_tokens

    response = requests.post(API_ENDPOINT, headers=headers, data=json.dumps(data))

    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"Error {response.status_code}: {response.text}")

while True:
    inputText = input("Enter your message: ")

    # Send the user's text with the "user" role; the "system" role is
    # meant for instructions that set the assistant's behavior.
    messages = [
        {"role": "user", "content": inputText},
    ]

    response_text = generate_chat_completion(messages)
    print(response_text)

The imports, the API key, and the endpoint are shown at the top of the code block. I added the inputText variable to take text input and an infinite while loop to keep the input/response cycle going until the program is terminated (probably bad practice).

However, I've noticed that responses from the API cannot reference earlier parts of the conversation the way the ChatGPT web application can (rightfully so, as I have not included any form of conversation object). I looked at the API documentation on chat completions, and its example of a conversation request is as follows:

[
  {"role": "system", "content": "You are a helpful assistant that translates English to French."},
  {"role": "user", "content": 'Translate the following English text to French: "{text}"'}
]

However, this means I would have to send all of the input messages for the conversation at once and get a response back for each of them. I cannot seem to find a way (at least as described in the API documentation) to send a message, get one back, and then send another message as part of a full conversation that references previous messages, the way a chatbot (or, as mentioned above, the ChatGPT app) does. Is there some way to implement this?

Also: the above does not use the OpenAI Python module; it uses the requests and json modules.

Peter Mortensen
dairy
  • Token cost grows quadratically as the conversation gets longer. So for roughly the same final response, ideally you'd want to send fewer, larger prompts instead of many small chunks, even if both strategies add up to the same number of tokens. – mireazma Aug 18 '23 at 15:42

3 Answers


I saw your question and was hoping to see some answers, because I have a similar issue: how many previous messages are required? Eventually all those messages will add up and go over the limits. Alas.

In your case, the raw API response actually contains a list of choices; you can extract the reply from it and then add it to the messages array, which builds up to become your turn-by-turn conversation (see the sketch below). The API docs example is the starting point for that: it's an array of messages, so keep adding to it. How big you allow this to grow is your next question, and no doubt it will add to the cost of tokens as well.
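A minimal sketch of that pattern, reusing the generate_chat_completion function from the question (the system prompt and the variable names here are illustrative assumptions, not part of the question's code):

history = [
    # Optional behavior-setting instruction for the assistant.
    {"role": "system", "content": "You are a helpful assistant."},
]

while True:
    user_input = input("Enter your message: ")

    # Add the user's turn to the running conversation.
    history.append({"role": "user", "content": user_input})

    # Send the whole history so the model can see previous turns.
    reply = generate_chat_completion(history)
    print(reply)

    # Add the assistant's reply so the next request includes it too.
    history.append({"role": "assistant", "content": reply})

Once the history approaches the model's context limit, you would have to trim or summarize the oldest turns, which is exactly the size/cost trade-off above.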

macasas

Here's a very nice example; basically just send the entire message history (along with the 'role' of each message):

import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

print(response["choices"][0]["message"]["content"])
# The 2020 World Series was played at Globe Life Field in Arlington, Texas.

Example and more info in the GPT API docs.
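To keep the conversation going from there, one pattern (a hypothetical continuation, not part of the docs example) is to append the returned message and the next user turn, then call the API again:

import openai

messages = [
    {"role": "user", "content": "Who won the world series in 2020?"},
]

first = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)

# Append the assistant's reply, then the follow-up question,
# so the second call sees the full exchange.
messages.append(first["choices"][0]["message"])
messages.append({"role": "user", "content": "Where was it played?"})

second = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(second["choices"][0]["message"]["content"])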

stevec

They clearly create some type of chat object, because they send us a chat ID, but we have to send the entire chat history and create a new chat every time the user sends a new message? That seems like either we are doing it wrong, or it's a pretty convenient limitation for OpenAI: since we pay per token used in the chat, it effectively requires us to pay for every message the user has ever sent and received, every time they post a new question. So if we want to let users have a long conversation, eventually every single question is going to be very expensive, right?

For example, say we pay $0.0015 / 1K tokens on input and $0.002 / 1K tokens on output, and each user message is 100 tokens and each output is 100 tokens. Then each question the user asks increases the token count per question by 200. So if the user asks 10 questions, the input for the 11th question will be 2,000 tokens for all of the chat history plus 100 tokens for the new question. This means that, on input alone, this one question costs $0.00315, and if I have thousands of users all doing the same thing, this will become very expensive very quickly.
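A quick back-of-the-envelope sketch of that growth (the prices and per-message token counts are the assumptions stated above):

# Assumptions from the example above.
INPUT_PRICE_PER_1K = 0.0015   # dollars per 1K input tokens
TOKENS_PER_MESSAGE = 100      # per user message and per reply

def input_cost_of_question(n):
    # Input for the n-th question (1-indexed): all n - 1 previous
    # question/answer pairs plus the new question itself.
    history_tokens = (n - 1) * 2 * TOKENS_PER_MESSAGE
    return (history_tokens + TOKENS_PER_MESSAGE) * INPUT_PRICE_PER_1K / 1000

print(input_cost_of_question(11))                              # ~0.00315 dollars
print(sum(input_cost_of_question(n) for n in range(1, 101)))   # ~1.5 dollars over 100 questions

So the per-question cost grows linearly with the length of the conversation, and the cumulative cost grows quadratically.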

  • My math says the total tokens up to the current message pair = `((no of message pairs ^ 2) - 1) * 100`. So tokens grow quadratically, i.e., 10 message pairs = 9,900 tokens. 100 message pairs = 999,900 tokens. – mireazma Aug 18 '23 at 15:31