I have been working on a small test project in an Azure ML notebook environment, using a relatively small Chroma DB to perform vector search. I now need to run this task in an Azure pipeline and would like to upload the Chroma DB to Azure Blob Storage. It seems I cannot upload the Chroma DB directly to Blob Storage, so I am looking for an alternative. Does anyone know how this can be achieved? I have also been trying Azure Cognitive Search, but I am running into numerous other issues with the Python SDK.

import json
vectordb = Chroma(persist_directory=chroma_db_path,
                   embedding_function=embeddings)
db_data = vectordb.to_json()
json_string = json.dumps(db_data)

Error: AttributeError: 'Chroma' object has no attribute 'to_json'

Thank you

stackword_0

1 Answer

I haven't tried saving the database to Blob Storage myself, but when I persisted the database using persist(), Chroma created a SQLite database file (named chroma.sqlite3 in recent Chroma versions) in the directory specified by chroma_db_path.

If your objective is to persist the entire database, one possible solution would be to upload this file as-is to Blob Storage.
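
A minimal, untested sketch of that idea, assuming the azure-storage-blob package is installed and that you have a connection string and an existing container. Because the persist directory may hold index files next to the SQLite file, the sketch uploads every file in the directory rather than only the database file. The names plan_uploads, upload_chroma_dir and the "chroma_db" prefix are made up for illustration.

```python
from pathlib import Path


def plan_uploads(persist_dir, prefix="chroma_db"):
    """Map each file under persist_dir to a blob name that keeps the folder layout."""
    root = Path(persist_dir)
    return {
        f"{prefix}/{path.relative_to(root).as_posix()}": path
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def upload_chroma_dir(persist_dir, connection_string, container_name, prefix="chroma_db"):
    """Upload every file of the persisted Chroma directory to one container."""
    # Imported lazily so plan_uploads can be used without the Azure SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client(container_name)  # assumed to exist already
    for blob_name, path in plan_uploads(persist_dir, prefix).items():
        with open(path, "rb") as data:
            # overwrite=True replaces blobs left over from an earlier upload
            container.upload_blob(name=blob_name, data=data, overwrite=True)
```

You would call upload_chroma_dir(chroma_db_path, connection_string, container_name) after persist() has finished. In the pipeline step, the reverse would be to download the same blobs into a local folder and pass that folder as persist_directory when constructing Chroma.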

Gaurav Mantri