I am experimenting with LangChain and its applications, but as a newbie I do not understand how embeddings and indexing work together here. I know what the two are, but I can't figure out how to use the index that I created and saved using persist_directory.
I successfully saved the object created by VectorstoreIndexCreator using the following code:
index = VectorstoreIndexCreator(vectorstore_kwargs={"persist_directory":"./custom_save_dir_path"}).from_loaders([loader])
but I cannot find a way to use the .pkl files created. How can I use these files in my chain to retrieve data?
Also, how does billing in OpenAI work? If I cannot reuse the saved embeddings or index, will all the data be re-embedded every time I run the code? As a beginner, I am still learning my way around, and any assistance would be greatly appreciated.
Here is the full code:
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
import os
os.environ["OPENAI_API_KEY"] = "sk-xxx"
# Load the documents
loader = CSVLoader(file_path='data/data.csv')
# Create an index with VectorstoreIndexCreator and persist it to disk
index = VectorstoreIndexCreator(vectorstore_kwargs={"persist_directory":"./custom_save_dir_path"}).from_loaders([loader])
# Create a question-answering chain using the index
chain = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=index.vectorstore.as_retriever(), input_key="question")
# Pass a query to the chain
while True:
    query = input("query: ")
    response = chain({"question": query})
    print(response['result'])
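To make the billing concern concrete, here is a minimal sketch of the behavior I am hoping for. It does not use LangChain or OpenAI at all: `fake_embed`, `load_or_build_index`, and the `index.json` file name are hypothetical stand-ins I made up for illustration, and the "embedding" is just a toy calculation. The idea is that the paid embedding step runs only on the first run, and later runs load the saved index from disk instead.

```python
import json
import os
import tempfile

# Counts how many times the (pretend) paid embedding call is made.
embed_calls = 0

def fake_embed(text):
    """Hypothetical stand-in for a paid embedding API call (not a real API)."""
    global embed_calls
    embed_calls += 1
    # Toy 2-dimensional "vector" so the example runs without any network access.
    return [float(len(text)), float(sum(ord(c) for c in text) % 97)]

def load_or_build_index(texts, path):
    """Return {text: vector}; embed only if no saved index exists at `path`."""
    if os.path.exists(path):
        # A saved index exists: load it and skip embedding entirely.
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    # No saved index yet: embed every text once and persist the result.
    index = {t: fake_embed(t) for t in texts}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")  # hypothetical file name
    rows = ["row one", "row two"]
    first = load_or_build_index(rows, path)   # builds and saves the index
    calls_after_first = embed_calls
    second = load_or_build_index(rows, path)  # loads the saved index
    calls_after_second = embed_calls

print(calls_after_first, calls_after_second)  # prints "2 2"
```

In other words, I would like the second run of my script to behave like the second call above: the same index comes back, with no additional embedding calls.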