
Microsoft recently released Semantic Kernel on Azure. It is a layer around the Azure OpenAI API, similar to LangChain, but available in both C# and Python. It ships with a number of examples, and I am trying to run Example38 in my project. Specifically, this code:

public static async Task RunAsync()
{
    using (Log.VerboseCall())
    {
        string apiKey = "...xxxxxxxxxxxxxxxxxxx...";  // I got this from Pinecone
        string pineconeEnvironment = "us-west1-gcp-free"; // I got this from Pinecone

        string openAiKey = "...xxxxxxxxxxxxxxxxxxxxx..."; // I got this from OpenAI

        PineconeMemoryStore memoryStore = new(pineconeEnvironment, apiKey);
        IKernel kernel = Kernel.Builder
            .WithOpenAITextCompletionService("text-davinci-003", openAiKey)
            .WithOpenAITextEmbeddingGenerationService("text-embedding-ada-002", openAiKey)
            .WithMemoryStorage(memoryStore)
            .Build();


        Console.WriteLine("== Printing Collections in DB ==");

        IAsyncEnumerable<string> collections = memoryStore.GetCollectionsAsync();

        await foreach (string collection in collections)
        {
            Console.WriteLine(collection);
        }

        Console.WriteLine("== Adding Memories ==");

        Dictionary<string, object> metadata = new()
        {
            { "type", "text" },
            { "tags", new List<string>() { "memory", "cats" } }
        };

        string additionalMetadata = System.Text.Json.JsonSerializer.Serialize(metadata);

        try
        {
            // !!! This line throws exception - see below. !!!
            string key1 = await kernel.Memory.SaveInformationAsync(MemoryCollectionName, "british short hair", "cat1", null, additionalMetadata);
            string key2 = await kernel.Memory.SaveInformationAsync(MemoryCollectionName, "orange tabby", "cat2", null, additionalMetadata);
            string key3 = await kernel.Memory.SaveInformationAsync(MemoryCollectionName, "norwegian forest cat", "cat3", null, additionalMetadata);

            Console.WriteLine("== Retrieving Memories Through the Kernel ==");
            MemoryQueryResult? lookup = await kernel.Memory.GetAsync(MemoryCollectionName, "cat1");
            Console.WriteLine(lookup != null ? lookup.Metadata.Text : "ERROR: memory not found");

            Console.WriteLine("== Retrieving Memories Directly From the Store ==");
            var memory1 = await memoryStore.GetAsync(MemoryCollectionName, key1);
            var memory2 = await memoryStore.GetAsync(MemoryCollectionName, key2);
            var memory3 = await memoryStore.GetAsync(MemoryCollectionName, key3);

            Console.WriteLine(memory1 != null ? memory1.Metadata.Text : "ERROR: memory not found");
            Console.WriteLine(memory2 != null ? memory2.Metadata.Text : "ERROR: memory not found");
            Console.WriteLine(memory3 != null ? memory3.Metadata.Text : "ERROR: memory not found");

            Console.WriteLine("== Similarity Searching Memories: My favorite color is orange ==");
            IAsyncEnumerable<MemoryQueryResult> searchResults = kernel.Memory.SearchAsync(MemoryCollectionName, "My favorite color is orange", 1, 0.8);

            await foreach (MemoryQueryResult item in searchResults)
            {
                Console.WriteLine(item.Metadata.Text + " : " + item.Relevance);
            }
        }
        catch (Exception ex)
        {
            Log.Verbose(ex);
        }
    }
}

I get the following exception at the indicated line:

Index creation is not supported within memory store. It should be created manually or using CreateIndexAsync. Ensure index state is Ready.

Jan Schultke
Leon

2 Answers


In the current SK memory design, vector index creation is assumed to be a quick operation, taking only milliseconds to a few seconds (similar to Azure Search and Qdrant). However, when Pinecone was integrated, it was discovered that index creation takes significantly longer, sometimes requiring minutes to complete. This necessitates polling the API to check if the index is ready, leading to unexpected timeouts in applications designed to create indexes on-the-fly (e.g., example 38).

As a workaround, the Pinecone integration in SK currently requires app developers to create indexes manually, either through the portal or by using Microsoft.SemanticKernel.Connectors.Memory.Pinecone.PineconeClient.CreateIndexAsync(). This avoids the need to queue the operation and wait for an extended period while polling the service.
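The "create the index up front, then wait until it is Ready" pattern described above can be sketched as a small polling helper. This is only an illustration under stated assumptions: `IIndexStatusSource`, `FakeIndexStatusSource`, and `IndexReadiness` are hypothetical names invented here, not part of Semantic Kernel or the Pinecone connector, and the fake stands in for whatever call your client exposes to read the index state.

```csharp
#nullable enable
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction over "read the index state" so the sketch runs
// without a Pinecone account. In a real app it would wrap the vector DB client.
public interface IIndexStatusSource
{
    // Returns the state name (e.g. "Initializing", "Ready"),
    // or null if the index is not visible yet.
    Task<string?> GetStateAsync(string indexName, CancellationToken ct);
}

// Fake source (assumption for the demo): reports "Initializing" for a fixed
// number of calls, then "Ready".
public sealed class FakeIndexStatusSource : IIndexStatusSource
{
    private int _remaining;
    public int Calls { get; private set; }

    public FakeIndexStatusSource(int callsBeforeReady) => _remaining = callsBeforeReady;

    public Task<string?> GetStateAsync(string indexName, CancellationToken ct)
    {
        Calls++;
        string? state = _remaining-- > 0 ? "Initializing" : "Ready";
        return Task.FromResult(state);
    }
}

public static class IndexReadiness
{
    // Polls the source until it reports "Ready" or the timeout would be exceeded.
    // Returns true if the index became ready, false on timeout.
    public static async Task<bool> WaitUntilReadyAsync(
        IIndexStatusSource source,
        string indexName,
        TimeSpan pollInterval,
        TimeSpan timeout,
        CancellationToken ct = default)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (true)
        {
            string? state = await source.GetStateAsync(indexName, ct);
            if (state == "Ready")
            {
                return true;
            }

            // Give up if another wait would take us past the timeout.
            if (elapsed.Elapsed + pollInterval > timeout)
            {
                return false;
            }

            await Task.Delay(pollInterval, ct);
        }
    }
}
```

In an application, a helper like this would run once during initialization, after the index has been created manually or via CreateIndexAsync and before the first SaveInformationAsync call. The poll interval and timeout are left to the caller because, as noted above, index creation can take minutes.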

A potential improvement could be adding a comment in example 38 to clarify this behavior or providing an option to override the default behavior and allow waiting. This would be a valuable enhancement for those interested in implementing it.

Devis L.
  • By this time I had already figured out that I need to create the index on my own, so I am doing it in code with CreateIndexAsync. However, since it is part of the initialization code in my app, I first check whether the index exists in the list of indexes that I get from ListIndexesAsync. If I do create the index (that is, if it does not exist yet), then I poll it until its status is Ready. However, the first call to check the status of the index throws an exception too... – Leon Jun 16 '23 at 16:47
  • The same buggy behavior happens with Qdrant as well... Something is really fishy with this design. Anyway, I have not managed to get past this line: string key1 = await kernel.Memory.SaveInformationAsync... – Leon Jun 16 '23 at 16:52
  • The problem with Qdrant is odd. I've used it for several months, creating hundreds of collections, and have never seen exceptions. I've used it mostly locally, though, via Docker. Is it the same setup or a different one? – Devis L. Jun 16 '23 at 22:40
  • Not the same setup - I am talking here about using it through Azure's Semantic Kernel. – Leon Jun 17 '23 at 19:58

I am also testing the Pinecone integration with SK. In this line:

string key1 = await kernel.Memory.SaveInformationAsync(MemoryCollectionName, "british short hair", "cat1", null, additionalMetadata);

I tried passing the name of the index as the first parameter, and it worked for me.

Where can I find the examples you refer to, such as example 38?

Jeremy Caney
Jose Polo
  • I will try your suggestion shortly. In the example, MemoryCollectionName is different from the index name. I also tried to create a new collection while creating the index, but that does not work either and throws an exception... – Leon Jun 16 '23 at 16:54
  • I tried that and still get the same exception, even though I create the index and then use the index name in SaveInformationAsync. MS definitely needs to get its act together. I also print the collection names, and apparently the collection that is created implicitly by creating an index has the same name as the index. That is bad design and a source of confusion. – Leon Jun 16 '23 at 17:06
  • FYI the examples are here: https://github.com/microsoft/semantic-kernel/tree/main/samples/dotnet/kernel-syntax-examples – Devis L. Jun 16 '23 at 22:42