I am building an app that uses OpenSearch as a vector store with LlamaIndex, following this example. Here is the code I have:

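    from os import getenv
    from pathlib import Path

    from llama_index import GPTVectorStoreIndex, download_loader
    from llama_index.vector_stores import OpensearchVectorClient, OpensearchVectorStore

    endpoint = getenv("OPENSEARCH_ENDPOINT", "http://localhost:9200")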
    idx = getenv("OPENSEARCH_INDEX", "gpt-index-demo")
    
    UnstructuredReader = download_loader("UnstructuredReader")

    loader = UnstructuredReader()
    documents = loader.load_data(file=Path(file_name))

    # OpensearchVectorClient stores text in this field by default
    text_field = "content"
    # OpensearchVectorClient stores embeddings in this field by default
    embedding_field = "embedding"
    # OpensearchVectorClient encapsulates logic for a
    # single opensearch index with vector search enabled
    client = OpensearchVectorClient(endpoint, idx, 1536, embedding_field=embedding_field, text_field=text_field)
    # initialize vector store
    vector_store = OpensearchVectorStore(client)
    # initialize an index using our sample data and the client we just created
    index = GPTVectorStoreIndex.from_documents(documents=documents)

My confusion is this: when we create the OpensearchVectorClient, we do not pass it any documents, and when we initialize the index, we do not pass vector_store. Looking at the definition of GPTVectorStoreIndex.from_documents(), I also do not see an eligible parameter for it. So how do the documents get into the OpenSearch index?

user2966197

1 Answer

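GPTVectorStoreIndex.from_documents() accepts a storage context as one of its parameters. Create one with StorageContext.from_defaults(vector_store=vector_store), passing your OpensearchVectorStore, and then pass it to from_documents() as storage_context. The index writes the document embeddings to OpenSearch through that vector store. Without a storage context, from_documents() falls back to the default in-memory vector store, so nothing reaches OpenSearch.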
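
Here is a minimal sketch of that wiring, reusing the client setup from the question. The helper name build_opensearch_index and its dim parameter are hypothetical, introduced only for illustration. The import paths are an assumption based on the llama_index releases that still exported GPTVectorStoreIndex; they have moved in later versions, so check them against the version you have installed.

```python
def build_opensearch_index(documents, endpoint, idx, dim=1536):
    """Index `documents` into OpenSearch via a LlamaIndex storage context.

    Hypothetical helper for illustration; not part of llama_index.
    """
    # Imported inside the function so the sketch can be defined without
    # llama_index installed. These import paths are an assumption and
    # differ between llama_index releases.
    from llama_index import GPTVectorStoreIndex, StorageContext
    from llama_index.vector_stores import (
        OpensearchVectorClient,
        OpensearchVectorStore,
    )

    # Same client setup as in the question: one OpenSearch index with
    # vector search enabled, storing text and embeddings in these fields.
    client = OpensearchVectorClient(
        endpoint, idx, dim, embedding_field="embedding", text_field="content"
    )
    vector_store = OpensearchVectorStore(client)

    # The storage context is what carries the vector store into the index.
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    # Passing storage_context makes from_documents() write the embeddings
    # to OpenSearch instead of the default in-memory store.
    return GPTVectorStoreIndex.from_documents(
        documents, storage_context=storage_context
    )
```

With the variables from the question, the call would be index = build_opensearch_index(documents, endpoint, idx). It needs a reachable OpenSearch endpoint and an embedding model configured for LlamaIndex.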