I have been using langchain to extract data from a large text document. It had all been working fine until I encountered this problem: Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 10.0 seconds as it raised RateLimitError: Rate limit reached for default-text-embedding-ada-002 in organization xxx on tokens per min. Limit: 150000 / min. Current: 66543 / min. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.
I have plenty of credit on my account, and I am not sure why it says I am over the limit when my current usage is well below it. I do not even get prompted to ask a question yet; it happens as soon as I run the program.
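For context on what I have already tried: the limit in the error is a tokens-per-minute throughput cap, separate from account credit, so one workaround I experimented with was wrapping the failing call in exponential backoff instead of a fixed delay. This is only a minimal generic sketch (the `flaky` function is a stand-in for the embedding call, not real API code):

```python
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying with exponentially growing delays on failure."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * (2 ** attempt))

# Stand-in for a rate-limited API call: fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok after two retries
```

In my real script the `fn` would be the index-building call, but as noted below, even long waits did not help, so backoff alone may not be the answer.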
import os
import sys
import openai
from langchain.chains import ConversationalRetrievalChain, RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.indexes import VectorstoreIndexCreator
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.llms import OpenAI
from langchain.vectorstores import Chroma
from langchain.prompts import PromptTemplate
import constants
# Set the OPENAI API Key
os.environ["OPENAI_API_KEY"] = constants.APIKEY
# Configure whether to persist the index
PERSIST = False
query = None
# If command line arguments are provided, assign the first argument to 'query'
if len(sys.argv) > 1:
    query = sys.argv[1]
# Load index from persistence or create a new one
if PERSIST and os.path.exists("persist"):
    print("Reusing index...\n")
    vectorstore = Chroma(persist_directory="persist", embedding_function=OpenAIEmbeddings())
    index = VectorStoreIndexWrapper(vectorstore=vectorstore)
else:
    loader = TextLoader("DOS5.txt", encoding="utf-8")
    # loader = DirectoryLoader("DOS6.txt")
    if PERSIST:
        index = VectorstoreIndexCreator(vectorstore_kwargs={"persist_directory": "persist"}).from_loaders([loader])
    else:
        index = VectorstoreIndexCreator().from_loaders([loader])
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-3.5-turbo"),
    retriever=index.vectorstore.as_retriever(search_kwargs={"k": 1}),
)
chat_history = []
# Interaction loop with the AI model
while True:
    if not query:
        query = input("Prompt: ")
    if query in ['quit', 'q', 'exit']:
        sys.exit()
    # Generate an answer using the conversational chain
    result = chain({"question": query, "chat_history": chat_history})
    # Print the answer
    print("\n")
    print(result['answer'])
    print("\n")
    # Add the query and answer to chat history
    chat_history.append((query, result['answer']))
    # Reset query to None for the next iteration
    query = None
I have tried adding more credit, adding a time delay, and waiting many minutes or even hours, but none of these have worked. I wonder whether the problem is due to the size of the document; however, since it was working before, could it be something to do with the chat history?
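Since the error fires during index creation (before any question is asked), I suspect the embedding step is sending too many tokens at once. One mitigation I considered is embedding the chunks in small batches with a pause between batches, so the per-minute token count stays under the cap. This is a hedged sketch with a fake embedder standing in for the real one; the `embed_in_batches` helper, its parameters, and `fake_embed` are my own illustrative names, and in practice `embed_fn` would be something like `OpenAIEmbeddings().embed_documents`:

```python
import time

def embed_in_batches(texts, embed_fn, batch_size=50, pause=1.0):
    """Embed texts in small batches, sleeping between batches so the
    tokens-per-minute throughput stays below the provider's cap."""
    vectors = []
    for i in range(0, len(texts), batch_size):
        vectors.extend(embed_fn(texts[i:i + batch_size]))
        if i + batch_size < len(texts):  # no pause after the last batch
            time.sleep(pause)
    return vectors

# Fake embedder for illustration: one-dimensional "vector" per text.
fake_embed = lambda batch: [[float(len(t))] for t in batch]
chunks = ["alpha", "beta", "gamma", "delta"]
print(embed_in_batches(chunks, fake_embed, batch_size=2, pause=0.01))
# [[5.0], [4.0], [5.0], [5.0]]
```

I have not confirmed this fixes my case, since `VectorstoreIndexCreator` drives the embedding internally, but it illustrates the throttling idea I am asking about.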