
I am creating two apps using LlamaIndex. The first creates indexes and stores them in Chroma DB; the second later loads them from that storage and queries them.

Here is my code to load and persist data to ChromaDB:

from pathlib import Path

import chromadb
from chromadb.config import Settings
from llama_index import GPTVectorStoreIndex, StorageContext, download_loader
from llama_index.vector_stores import ChromaVectorStore

chroma_client = chromadb.Client(Settings(
    chroma_db_impl="duckdb+parquet",
    persist_directory=".chroma/"  # Optional, defaults to .chromadb/ in the current directory
))
chroma_collection = chroma_client.get_or_create_collection("quickstart")

def chromaindex():
    # load the source documents
    UnstructuredReader = download_loader("UnstructuredReader")
    loader = UnstructuredReader()
    documents = loader.load_data(file=Path())

    # create chroma vector store
    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

    # persist the storage context
    index.storage_context.persist(vector_store_fname='demo')

Here is my code to later load the storage context and query:

import chromadb
from chromadb.config import Settings
from llama_index import StorageContext, load_index_from_storage
from llama_index.vector_stores import ChromaVectorStore

chroma_client = chromadb.Client(Settings(
    chroma_db_impl="duckdb+parquet",
    persist_directory=".chroma/"  # Optional, defaults to .chromadb/ in the current directory
))
chroma_collection = chroma_client.get_collection("quickstart")

def chroma_ans(question):
    # rebuild the storage context on top of the existing Chroma collection
    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
    sc = StorageContext.from_defaults(vector_store=vector_store)

    index2 = load_index_from_storage(sc)
    query_engine = index2.as_query_engine()
    # e.g. question = "What did the author do growing up?"
    response = query_engine.query(question)
    return response

When I run the second app to query, I get `ValueError: No index in storage context, check if you specified the right persist_dir.` I am not sure where I am making the mistake. All I want is for the first app to create the storage context and index and store them using Chroma DB, and for the second app to load them again to query.

My LlamaIndex version is 0.6.9.

user2966197
  • Do you specifically need `chromadb` for this? If not, you can save the index to disk and load it back directly, as described in the documentation. – Vivek May 25 '23 at 09:05

1 Answer


I had the same problem and solved it by updating chromadb to the latest 0.4 release.
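Since the code in this answer relies on `chromadb.PersistentClient`, which was introduced in chromadb 0.4, it may be worth confirming the installed version before trying it. Here is a minimal stdlib-only sketch; the helper names (`version_tuple`, `has_persistent_client`, `installed_chromadb_version`) are my own and not part of chromadb or LlamaIndex.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group().split("."))


def has_persistent_client(version):
    """True if the given chromadb version is 0.4 or newer (hypothetical helper)."""
    return version_tuple(version) >= (0, 4)


def installed_chromadb_version():
    """Return the installed chromadb version string, or None if it is not installed."""
    try:
        return metadata.version("chromadb")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_chromadb_version()
    if found is None:
        print("chromadb is not installed")
    elif has_persistent_client(found):
        print(f"chromadb {found}: PersistentClient should be available")
    else:
        print(f"chromadb {found}: upgrade to 0.4 or later to use PersistentClient")
```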

TO SAVE TO DISK

import chromadb
from chromadb.config import Settings
from llama_index import GPTVectorStoreIndex, StorageContext
from llama_index.vector_stores import ChromaVectorStore

# `data` (the loaded documents) and `service_context` are assumed to be defined earlier

chroma_client = chromadb.PersistentClient(path="./dist/vdb", settings=Settings(
    anonymized_telemetry=False
))

chroma_collection = chroma_client.get_or_create_collection("quickstart")

vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)  # <- here don't specify persist_dir

cur_index = GPTVectorStoreIndex.from_documents(data, storage_context=storage_context, service_context=service_context)

# Here you save to the path you want
cur_index.storage_context.persist(persist_dir="./dist/vdb/liama")

TO LOAD FROM DISK

import chromadb
from chromadb.config import Settings
from llama_index import GPTVectorStoreIndex, StorageContext
from llama_index.vector_stores import ChromaVectorStore

# `service_context` is assumed to be defined earlier

chroma_client = chromadb.PersistentClient(path="./dist/vdb", settings=Settings(
    anonymized_telemetry=False
))

chroma_collection = chroma_client.get_or_create_collection("quickstart")

vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store, persist_dir="./dist/vdb/liama")  # <- here you DO specify persist_dir

# Now you can load the index
cur_index = GPTVectorStoreIndex([], storage_context=storage_context, service_context=service_context)
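If loading still fails with `No index in storage context`, it can help to check what is actually in the persist directory before building the storage context. Below is a small stdlib-only sketch. The helper name `missing_index_files` is my own, and the file names are an assumption: they are the defaults I would expect `StorageContext.persist()` to write in llama_index 0.6.x, so adjust them if your version differs.

```python
from pathlib import Path

# Assumed default file names written by StorageContext.persist() in llama_index 0.6.x
EXPECTED_FILES = ("docstore.json", "index_store.json")


def missing_index_files(persist_dir):
    """Return the expected index files that are not present in persist_dir.

    Hypothetical diagnostic helper; an empty list means both files were found.
    """
    directory = Path(persist_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    # Same persist_dir as used in the save and load snippets above
    missing = missing_index_files("./dist/vdb/liama")
    if missing:
        print("Missing from persist_dir:", ", ".join(missing))
    else:
        print("Index files found; the storage context should be loadable")
```

If `index_store.json` is reported missing, the save step either did not run or wrote to a different directory than the one passed as `persist_dir` on load.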
Gust