Questions tagged [chromadb]

69 questions
0
votes
0 answers

Query bot on multiple JSON files on Langchain

I have around 30 GB of JSON data spread across multiple files and want to build a query bot on it. I have built the same for a text file, but I am not sure how it will work for JSON data. I have explored JSONLoader but don't know how to use it to convert JSON data…
Juned Ansari
0
votes
0 answers

Generating and using ChromaDB ids

I'm wondering how people deal with the ids in Chroma DB. I plan to store code-snippets (let's say single functions or classes) in the collection and need a unique id for each. These documents are going to be generated so the first problem is: how do…
samala7800
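
One possible way to approach the id question above, as a minimal sketch rather than an established Chroma convention: derive each id from the snippet's own content, so that regenerating the same snippet yields the same id. The helper name `make_snippet_id` and the sample file names are hypothetical and not part of any library.

```python
import hashlib


def make_snippet_id(source_path: str, snippet: str) -> str:
    """Return a deterministic id: the same path and text always give the same id."""
    digest = hashlib.sha256()
    digest.update(source_path.encode("utf-8"))
    # Separator byte so ("ab", "c") and ("a", "bc") do not hash identically.
    digest.update(b"\x00")
    digest.update(snippet.encode("utf-8"))
    return digest.hexdigest()


# Hypothetical generated snippets: (file the snippet came from, snippet text).
snippets = [
    ("utils.py", "def add(a, b):\n    return a + b"),
    ("utils.py", "def sub(a, b):\n    return a - b"),
]
ids = [make_snippet_id(path, text) for path, text in snippets]
```

Ids built this way could then be passed alongside the documents when adding to a collection (for example via the `ids` argument of `collection.add`). The trade-off is that editing a snippet changes its id, so an id based on a stable name (file plus function name) may suit cases where content changes often.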
0
votes
1 answer

Chroma document retrieval in langchain not working in Flask frontend

I am using langchain to create a chroma database to store pdf files through a Flask frontend. I am able to query the database and successfully retrieve data when the python file is run from the command line. I am however trying to use the…
Moibah
0
votes
1 answer

Chromadb + Langchain + SentenceTransformerEmbeddingFunction throwing 'SentenceTransformerEmbeddingFunction' object has no attribute 'embed_documents'

I have been trying to use Chromadb version 0.4.8 and Langchain version 0.0.276 with SentenceTransformerEmbeddingFunction, as shown in the snippet below: from langchain.vectorstores import Chroma from chromadb.utils import embedding_functions # other…
0
votes
0 answers

Chromadb + Langchain with SentenceTransformerEmbeddingFunction throwing sqlite3 >= 3.35.0 error, despite sqlite3 3.43.0 being available

I have been trying to use Chromadb version 0.4.8 and Langchain version 0.0.276 with SentenceTransformerEmbeddingFunction, as shown in the snippet below: from langchain.vectorstores import Chroma from chromadb.utils import embedding_functions # other…
Sanjay
0
votes
0 answers

Delete memory of openai queries

I have these lines of code to get an answer from OpenAI, which gets data from a query and a document. However, when responding, it uses data from previous queries, which I don't want, since the response should only be based on the data from my…
Lukas
0
votes
0 answers

Failed building wheel for chroma-hnswlib (#include doesn't exist) in Ubuntu

I am getting the error "Failed building wheel for chroma-hnswlib" on an Ubuntu server. In chronological order: /tmp/pip-build-env/overlay/lib/python3.10/site-packages/pybind11/include/pybind11/detail/../detail/common.h:226:10: Python.h no such file or…
0
votes
0 answers

chromadb query raises "RuntimeError: Cannot open file" error

Please help me to fix the issue, thanks so much! chromadb == 0.3.26 langchain == 0.0.212 a=self.collection.peek() query_results = self.collection.query( query_embeddings = [embeddingdata_for_query], n_results=5 ) peek()…
Harly Chen
0
votes
1 answer

How to check for duplicate documents in vectorstore efficiently?

How can I check for duplicate documents in my vectorstore when adding documents? Currently I am doing something like: vectorstore = Chroma( persist_directory=persist_dir, embedding_function=embeddings ) documents =…
information_interchange
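
A rough sketch of one way to approach the duplicate check asked about above, assuming that "duplicate" means identical text after whitespace normalization (it does not detect near-duplicates). The names `content_hash` and `filter_new_documents` are hypothetical helpers, not LangChain or Chroma functions.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash the text after collapsing runs of whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def filter_new_documents(texts, seen_hashes):
    """Return only texts whose hash is not yet in seen_hashes, recording new ones."""
    new_texts = []
    for text in texts:
        h = content_hash(text)
        if h not in seen_hashes:
            seen_hashes.add(h)
            new_texts.append(text)
    return new_texts


seen = set()  # would need to be persisted between runs in a real setup
first_batch = filter_new_documents(["alpha doc", "beta doc", "alpha  doc"], seen)
second_batch = filter_new_documents(["beta doc", "gamma doc"], seen)
```

Each lookup is a constant-time set membership test, so the check scales with the number of incoming documents rather than the size of the store. If the same hashes are used as document ids, the store itself can serve as the record of what has already been added.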
0
votes
1 answer

How to check number of documents in vectorstore in langchain?

from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents=final_docs, embedding=embeddings, persist_directory=persist_dir)

How can I check the number of documents or embeddings inside the vectorstore?
information_interchange
0
votes
0 answers

Chroma vector DB gives me the same wrong answer on every query, with every search type

I am loading a CSV file with ServiceNow incident details and storing it in Chroma DB with Hugging Face embeddings. I am trying to retrieve the information for a specific incident number, but it returns the details of a different incident number. This happens when my k…
iVikashJha
0
votes
1 answer

Chatbot using csv file

I am trying to create a chatbot using Azure Bot Service and Azure OpenAI. The data source is multiple CSV files. I am able to create embeddings using the langchain chroma extension. But while querying the embeddings I am not getting the correct…
0
votes
1 answer

How to JSON serialize Chromadb

I have been working on a small test project in an Azure ML notebook environment. I have been using a relatively small chromadb to perform some vector search. Now I need to perform this task in an Azure pipeline and would like to upload this chromadb into…
0
votes
1 answer

Can we make creating a collection in ChromaDB faster?

This is the code for creating a collection in ChromaDB: client = chromadb.Client() collection = client.create_collection( name="collection_name", metadata={"hnsw:space": "cosine"} ) and this is for adding data to the collection: collection.add( …
0
votes
1 answer

How to make my LLM-powered chatbot using ChromaDB faster?

I am building an LLM-powered chatbot, using ChromaDB to search for relevant documents and then an LLM to answer. Is there any way to get faster responses? Currently it takes around 15 seconds to answer. Around 8 seconds for ChromaDB to find relevant…