For the Node.js code below, I am getting prompt_tokens = 24 in the response. I want to be able to determine the expected prompt_tokens before making the request.
import { Configuration, OpenAIApi } from 'openai';

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);

const systemPrompt = 'You are a useful assistant.';
const userPrompt = 'What is the meaning of life?';

const completion = await openai.createChatCompletion({
  model: "gpt-4",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: userPrompt }
  ]
});
/* completion.data = {
  id: 'chatcmpl-72Andnl250jsvSJGbjBJ6YzzFGToA',
  object: 'chat.completion',
  created: 1680752525,
  model: 'gpt-4-0314',
  usage: { prompt_tokens: 24, completion_tokens: 91, total_tokens: 115 },
  choices: [ [Object] ]
} */
It seems each model has its own encoding, and the standard library for working with these encodings is Python's tiktoken. So, to estimate prompt_tokens, I would pass a "text" value into the script below. However, I am not sure what "text" should be, given the "messages" array in the Node.js code above, so that print(token_count) outputs 24 (the actual prompt_tokens in the response).
import sys
import tiktoken

text = sys.argv[1]  # the text whose tokens we want to count
enc = tiktoken.encoding_for_model("gpt-4")  # resolves to the cl100k_base encoding
tokens = enc.encode(text)
token_count = len(tokens)
print(token_count)
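
The closest I have found so far is the num_tokens_from_messages example in the OpenAI cookbook, which suggests there is no single "text" string: each message carries a fixed token overhead on top of its encoded role and content, and the assistant reply is primed with a few extra tokens. The sketch below assumes the constants I believe apply to gpt-4 / gpt-3.5-turbo style chat models (3 tokens per message, plus 3 for the reply priming); I have not verified them for other models.

import tiktoken

def num_tokens_from_messages(messages, model="gpt-4"):
    # Adapted from the OpenAI cookbook example. The constants are an
    # assumption for gpt-4 / gpt-3.5-turbo style chat models.
    encoding = tiktoken.encoding_for_model(model)
    tokens_per_message = 3  # each message is wrapped in <|start|>{role}<|message|>...<|end|>
    tokens_per_name = 1     # extra token when a "name" field is present
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens

messages = [
    {"role": "system", "content": "You are a useful assistant."},
    {"role": "user", "content": "What is the meaning of life?"},
]
print(num_tokens_from_messages(messages))  # 24, matching prompt_tokens above

Counting by hand with this scheme: the system message is 3 (overhead) + 1 ("system") + 6 ("You are a useful assistant.") = 10 tokens, the user message is 3 + 1 + 7 = 11, and the reply priming adds 3, which gives the observed 24. I am still not sure whether these per-message constants are stable across model versions, though.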