I am loading a CSV file of ServiceNow incident details and storing it in Chroma DB with Hugging Face embeddings. When I try to retrieve information for a specific incident number, the retriever returns details for a different incident. This happens when my k value is 5 or below.
search_kwargs={"k": target_source_chunks}
If I increase k to 10, my incident number does appear in the results, but how do I improve retrieval so that the relevant incident is within the top 5 results?
I am using the Python code below.
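# Imports assume the older single-package langchain layout
from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
# Load the CSV; CSVLoader produces one document per row
loader_class, loader_args = (CSVLoader, {})
loader = loader_class(SourceFilePath, **loader_args)
results = loader.load()
# Split the rows into chunks and embed them into a persistent Chroma store
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
texts = text_splitter.split_documents(results)
embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')
db = Chroma.from_documents(texts, embeddings, persist_directory=my_vector_db, client_settings=CHROMA_SETTINGS)
# Retrieve the k most similar chunks for the question
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": target_source_chunks})
question = 'Please give me summary of number INC0000063 ?'
docs_rel = retriever.get_relevant_documents(question)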
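Embedding similarity does not guarantee an exact match on an identifier such as INC0000063, so one possible workaround is to retrieve with a larger k (for example 10) and then move chunks that contain the exact incident number to the front before keeping the top 5. The following is a minimal sketch of that idea, not a LangChain or Chroma feature. The helper names `extract_incident_number` and `rerank_by_incident`, the `INC` plus digits pattern, and the sample chunks are all assumptions made for illustration.

```python
import re

# Assumed incident number format: "INC" followed by digits (hypothetical pattern)
INCIDENT_RE = re.compile(r"\bINC\d+\b")


def extract_incident_number(question):
    """Return the first incident number found in the question, or None."""
    match = INCIDENT_RE.search(question)
    return match.group(0) if match else None


def rerank_by_incident(chunks, question, k=5):
    """Put chunks that mention the exact incident number first, then keep k.

    `chunks` is a list of strings already ordered by similarity.
    """
    incident = extract_incident_number(question)
    if incident is None:
        # No identifier in the question: keep the similarity order
        return chunks[:k]
    exact = [c for c in chunks if incident in INCIDENT_RE.findall(c)]
    rest = [c for c in chunks if incident not in INCIDENT_RE.findall(c)]
    return (exact + rest)[:k]


# Hypothetical chunk texts standing in for retrieved results
sample_chunks = [
    "number: INC0000060\nshort_description: Email down",
    "number: INC0000063\nshort_description: VPN failure",
    "number: INC0000064\nshort_description: Printer offline",
]
top = rerank_by_incident(sample_chunks, "Please give me summary of number INC0000063 ?", k=2)
```

With the code above, the chunk texts would come from `doc.page_content` for each document in `docs_rel`, retrieved with the larger k.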
Please suggest how I can improve the retrieval results.