1

I have an OpenAI GPT model deployed in an instance belonging to a resource in my Azure subscription. I have two programs that use this OpenAI GPT model. How can I keep track of the expenses of each program separately?


Example: I deployed the OpenAI GPT model "GPT 4 32k" as gpt-4-32k-viet. Program A and program B both use this deployment. How can I keep track of the expenses incurred by program A and program B separately?


I use the code from the Azure OpenAI tutorial:

import tiktoken
import openai
import os
openai.api_type = "azure"
openai.api_version = "2023-03-15-preview"
openai.api_base = "https://[resourcename].openai.azure.com/"  # Your Azure OpenAI resource's endpoint value.
openai.api_key = "[my instance key]"


system_message = {"role": "system", "content": "You are a helpful assistant."}
max_response_tokens = 250
token_limit = 32768  # gpt-4-32k has a 32,768-token context window
conversation=[]
conversation.append(system_message)


def num_tokens_from_messages(messages, model="gpt-4-32k"):
    encoding = tiktoken.encoding_for_model(model)
    num_tokens = 0
    for message in messages:
        num_tokens += 4  # every message follows <im_start>{role/name}\n{content}<im_end>\n
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":  # if there's a name, the role is omitted
                num_tokens += -1  # role is always required and always 1 token
    num_tokens += 2  # every reply is primed with <im_start>assistant
    return num_tokens


user_input = 'Hi there. What is the difference between Facebook and TikTok?'
conversation.append({"role": "user", "content": user_input})
conv_history_tokens = num_tokens_from_messages(conversation)

while (conv_history_tokens + max_response_tokens >= token_limit):
    del conversation[1]
    conv_history_tokens = num_tokens_from_messages(conversation)

response = openai.ChatCompletion.create(
    engine="gpt-4-32k-viet",  # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
    messages=conversation,
    temperature=.7,
    max_tokens=max_response_tokens,
)

conversation.append({"role": "assistant", "content": response['choices'][0]['message']['content']})
print("\n" + response['choices'][0]['message']['content'] + "\n")
Talha Tayyab
Franck Dernoncourt

2 Answers

1

You have to enclose them in different Resource Groups.

You can then target that Resource Group in the cost analysis view and group by Service Name if you want a more granular view.

We are having a hard time understanding the real costs of GPT, and the only approach I can suggest is to test like crazy.

If you feed that language model with 500 characters you have a cost.

But if you feed it with 5,000 characters, don't expect the cost to be exactly 10 times higher.
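One reason the cost does not scale with input length is that billing is per token, with the prompt and the completion metered separately, so a fixed-size reply costs the same whatever the prompt length. A rough sketch of that arithmetic follows. The function name `estimate_cost` and both per-1,000-token rates are placeholder assumptions, not quoted prices, so substitute the current figures from the Azure OpenAI pricing page.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_rate_per_1k=0.06, completion_rate_per_1k=0.12):
    """Estimate the cost of one call from its token counts.

    Both rates are placeholder assumptions (USD per 1,000 tokens).
    """
    prompt_cost = prompt_tokens / 1000 * prompt_rate_per_1k
    completion_cost = completion_tokens / 1000 * completion_rate_per_1k
    return prompt_cost + completion_cost


# Same 250-token reply, but the second prompt is 10 times longer.
short_call = estimate_cost(prompt_tokens=120, completion_tokens=250)
long_call = estimate_cost(prompt_tokens=1200, completion_tokens=250)

print(f"short prompt: ${short_call:.4f}")
print(f"long prompt:  ${long_call:.4f}")
print(f"ratio: {long_call / short_call:.2f}x")
```

With these placeholder rates the longer prompt costs well under 10 times as much, because the completion part of the bill did not change.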

It is difficult to forecast, so I suggest isolating each program in its own Resource Group. This technology is not designed to be multi-tenant: with a shared deployment you lose track of who incurred which costs. If you want to know how much each customer has consumed, the only way is to go single-tenant.

Otherwise you have to create an ID per customer and link each token to that ID. And good luck with that.
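For what it's worth, the ID-per-customer approach can be approximated in application code, because each chat completion response carries a `usage` field with `prompt_tokens`, `completion_tokens` and `total_tokens`. Below is a minimal sketch, assuming each program passes its own identifier when it records a call. `UsageLedger`, `record` and the program IDs are hypothetical names for illustration, and the two responses are hand-written stand-ins so that the example runs without calling the API.

```python
from collections import defaultdict


class UsageLedger:
    """Accumulates token usage per program ID (hypothetical helper)."""

    def __init__(self):
        self._totals = defaultdict(
            lambda: {"prompt_tokens": 0, "completion_tokens": 0, "calls": 0}
        )

    def record(self, program_id, response):
        # `response` is the object returned by openai.ChatCompletion.create;
        # only its "usage" field is read here.
        usage = response["usage"]
        entry = self._totals[program_id]
        entry["prompt_tokens"] += usage["prompt_tokens"]
        entry["completion_tokens"] += usage["completion_tokens"]
        entry["calls"] += 1

    def totals(self):
        # Return plain dicts so callers cannot mutate the ledger by accident.
        return {pid: dict(entry) for pid, entry in self._totals.items()}


ledger = UsageLedger()

# Stand-in responses; in a real program these come from the API call.
fake_response_a = {"usage": {"prompt_tokens": 30, "completion_tokens": 200, "total_tokens": 230}}
fake_response_b = {"usage": {"prompt_tokens": 500, "completion_tokens": 50, "total_tokens": 550}}

ledger.record("program-a", fake_response_a)
ledger.record("program-a", fake_response_a)
ledger.record("program-b", fake_response_b)

for program_id, entry in ledger.totals().items():
    print(program_id, entry)
```

In practice the ledger would have to be persisted somewhere both programs can write to (a log or a database), and its totals are only an approximation of the bill: the Azure invoice remains the authoritative figure.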

Francesco Mantovani
  • Thanks! "You have to enclose them in different Resource Groups." that's indeed what I was doing to some extent, but there used to be a limit of 3 Resource Groups per location per Azure account, while I had >20 programs to track. Looks like the limit is 30 now, so more feasible (but now I have >30 programs and only 1 possible location...) – Franck Dernoncourt Jul 24 '23 at 17:03
  • @FranckDernoncourt , I know. But this is a new technology and as you can see they already changed the limit from 3 to 30. The locations are 2 as far as I can recall but Microsoft indeed wants to make more $$$ with GPT so expect this service to be extended. – Francesco Mantovani Jul 25 '23 at 08:11
  • There's not much you can do, and even C-level and board members need to understand that this is a new technology. Azure Support doesn't even have a category for GPT... just to give you an idea of how new this is. And good luck finding people with experience in GPT. It just came out in November and was massively updated in February. No one is an expert in this field – Francesco Mantovani Jul 25 '23 at 08:14
0

The Azure OpenAI Service pricing page is perfect for anyone who needs to quickly estimate the token cost of ChatGPT prompts for their project:

https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/

Ramprasad