
Below is my code. I am looking at the API usage in the OpenAI console and I am way under the limit, yet I have never been able to get a successful response. I am copying the code from their documentation, and I keep getting an HTTP 429.

const API_URL = "https://api.openai.com/v1/chat/completions";
const API_KEY = "YOUR_API_KEY";

const generate = async () => {
  try {
    // Send the chat completion request to the OpenAI API
    const response = await fetch(API_URL, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${API_KEY}`,
      },
      body: JSON.stringify({
        model: "gpt-3.5-turbo",
        messages: [{ role: "user", content: promptInput.value }],
      }),
    });

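    // Note: fetch resolves even for HTTP error statuses like 429, so a
    // rate-limited response reaches this point without throwing; in that
    // case data.choices is undefined and the line below fails instead.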
    const data = await response.json();
    resultText.innerText = data.choices[0].message.content;
  } catch (error) {
    console.error("Error:", error);
    resultText.innerText = "Error occurred while generating.";
  }
};
steve landiss

1 Answer


You can verify which limit you are hitting by looking at the response headers.

You can check the Network tab in your browser's dev tools, or log the headers to the console with something like console.log(response) or console.log(response.headers).
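Note that logging the Headers object directly may not show its entries in every console; one way to see them all (Headers is iterable, so this works on any fetch response):

// Spread the Headers object into a plain object so every entry,
// including the x-ratelimit-* headers, gets printed.
console.log(Object.fromEntries(response.headers.entries()));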

Then in your headers section you should see something like:

headers: {
    ...
    'openai-model': 'gpt-4',
    'x-ratelimit-limit-requests': '200',
    'x-ratelimit-limit-tokens': '40000',
    'x-ratelimit-remaining-requests': '199',
    'x-ratelimit-remaining-tokens': '39691', 
    'x-ratelimit-reset-requests': '300ms',
    'x-ratelimit-reset-tokens': '463ms',
    ...
  },

It's likely you have used up either your requests per minute or your tokens per minute. Specifically, the x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens headers will tell you whether you've reached your limit: one of them will be zero (or near zero).
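As a minimal sketch, reusing the response object from your generate function, you could read those two headers and warn before touching the body (the zero check mirrors the point above):

const remainingRequests = Number(response.headers.get("x-ratelimit-remaining-requests"));
const remainingTokens = Number(response.headers.get("x-ratelimit-remaining-tokens"));

// A 429 status or an exhausted quota both mean it's time to back off.
if (response.status === 429 || remainingRequests === 0 || remainingTokens === 0) {
  console.warn("Rate limited; wait for the reset window before retrying.");
}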

To better understand your options, check out OpenAI's documentation on rate limits; it has a good amount of helpful information.

You can also see your account specific rate limits within your account area (login required).

Given that each model has a different quota, you could consider spreading your workload over several similar models as a workaround. Some other options include throttling your own code, retrying with exponential backoff (sketched below), sending fewer tokens per request, adjusting your prompt to return fewer tokens, combining requests where possible, or even asking OpenAI for a rate limit increase.
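For the backoff option, here is a minimal sketch of a wrapper around fetch; fetchWithBackoff is a hypothetical helper, and the retry count and delays are illustrative rather than values from OpenAI's docs:

// Retry a fetch call with exponential backoff whenever the API
// responds with HTTP 429; returns the last response once retries run out.
const fetchWithBackoff = async (url, options, maxRetries = 5) => {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429 || attempt === maxRetries) return response;
    // Wait 1s, 2s, 4s, ... before the next attempt.
    const delayMs = 1000 * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
};

You would then call fetchWithBackoff(API_URL, options) in place of the plain fetch call in your generate function.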

ToddBFisher