I have been trying to query a PDF file in my local directory using an LLM. I downloaded the model I'm using (GPT4All-13B-snoozy.ggmlv3.q4_0.bin) to my local system, and I'm using LangChain together with Hugging Face's instructor-large model for embeddings. I was able to set up the service_context and build the index, but when I query it I keep getting this error about the prompt:
ValueError: Argument `prompt` is expected to be a string. Instead found <class 'llama_index.prompts.base.Prompt'>. If you want to run the LLM on multiple prompts, use `generate` instead.
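From what I can tell, this error is raised by LangChain's base LLM __call__, which only accepts a plain string, so llama_index seems to be handing it a Prompt object instead. Here is a minimal sketch of the difference as I understand it (the Prompt import path matches the class named in the error; the model path is the same one from my code below):

from langchain.llms import GPT4All
from llama_index.prompts.base import Prompt

llm = GPT4All(model=r"..\models\GPT4All-13B-snoozy.ggmlv3.q4_0.bin")
llm("Hello")          # fine: the prompt is a plain str
llm(Prompt("Hello"))  # raises the ValueError above: the prompt is not a str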
I'm just starting to learn how to use LLMs, so I hope the community can help me. Here is my code:
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index import PromptHelper, ServiceContext
from llama_index import LangchainEmbedding
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# load the PDF from the local directory
documents = SimpleDirectoryReader(r'C:\Users\avish.wagde\Documents\work_avish\LLM_trials\instructor_large').load_data()

model_id = 'hkunlp/instructor-large'
model_path = r"..\models\GPT4All-13B-snoozy.ggmlv3.q4_0.bin"

callbacks = [StreamingStdOutCallbackHandler()]
# verbose is required to pass to the callback manager
llm = GPT4All(model=model_path, callbacks=callbacks, verbose=True)

embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name=model_id))
# define prompt helper
max_input_size = 4096    # maximum input size
num_output = 256         # number of output tokens
max_chunk_overlap = 0.2  # maximum chunk overlap
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

# note: the LangChain LLM is passed directly as llm_predictor here
service_context = ServiceContext.from_defaults(chunk_size=1024, llm_predictor=llm, prompt_helper=prompt_helper, embed_model=embed_model)

index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
response = query_engine.query("What is Apple's financial situation?")  # this call raises the ValueError
print(response)
I have been going through the library's source code, as the error message suggests, but I couldn't find the problem.
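One thing I'm unsure about: I'm passing the raw LangChain GPT4All object straight into ServiceContext.from_defaults as llm_predictor. Should it be wrapped in llama_index's LLMPredictor first, so that llama_index renders the Prompt into a plain string before calling the model? Something like this sketch (assuming LLMPredictor is importable from the top-level llama_index package in my version):

from llama_index import LLMPredictor

# wrap the LangChain LLM so llama_index formats Prompt objects into plain
# strings before they reach GPT4All (my assumption, not verified)
llm_predictor = LLMPredictor(llm=llm)
service_context = ServiceContext.from_defaults(
    chunk_size=1024,
    llm_predictor=llm_predictor,
    prompt_helper=prompt_helper,
    embed_model=embed_model,
)

Is that the correct way to wire a LangChain LLM into llama_index, or is the problem somewhere else?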