
I am currently working on implementing vector similarity search (VSS) queries over vector embeddings in Redis. While I understand how to perform queries in a simple scenario with a single set of vectors, I am struggling to adapt this approach to more complex requirements. Here's an overview of what I'm trying to achieve:

I will have a large number of contexts, each containing approximately 50 sub-contexts. Each sub-context will consist of a few hundred embeddings. My goal is to perform queries within a specific context and have the ability to apply filters that include or exclude certain sub-contexts.

Most of the examples I've come across involve setting up a hashset of embeddings and passing it to a Redisearch query, which searches over all the vectors. However, I'm unsure how to proceed. I have a few ideas about which direction to take, but I don't know whether they are possible or optimal. These are the two ideas I can think of:

Multiple Hashsets in a Single VSS Query:

Is it possible to provide multiple hashsets to a single VSS query in Redisearch? Ideally, I would like to retrieve the desired sub-context hashsets from Redis and include them in a single query. Is this feasible?

Applying Filters to Hashset Queries:

Is there a way to apply filters to a query on a hashset in Redisearch? If I have all the embeddings in a single hashset, indexed by sub-context ID, is there a mechanism to filter the query results based on the sub-contexts?
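
For illustration, the kind of filtered query meant here could be sketched as follows. This assumes an index whose schema declares `context` and `subctx` as TAG fields and `embedding` as a FLOAT32 VECTOR field; the index name `idx:embeddings`, those field names, and the helper functions are all made up for the example. The code only builds the FT.SEARCH arguments, using the hybrid query form (a filter expression followed by `=>[KNN ...]`), which needs DIALECT 2.

```python
import struct


def pack_vector(values):
    """Serialize floats as raw FLOAT32 bytes (little-endian), 4 bytes per value."""
    return struct.pack(f"<{len(values)}f", *values)


def build_knn_query(context, include=(), exclude=(), k=10):
    """Build a hybrid query: TAG pre-filter, then KNN over the matching documents.

    Assumes simple alphanumeric IDs; TAG values containing punctuation
    (hyphens, dots, spaces) would need escaping first.
    """
    parts = ["@context:{" + context + "}"]
    if include:
        # Match documents whose sub-context is any of the listed IDs.
        parts.append("@subctx:{" + "|".join(include) + "}")
    if exclude:
        # A leading '-' negates the clause: drop documents in these sub-contexts.
        parts.append("-@subctx:{" + "|".join(exclude) + "}")
    prefilter = " ".join(parts)
    return f"({prefilter})=>[KNN {k} @embedding $vec AS score]"


def build_search_args(index, query, vector):
    """Assemble the raw FT.SEARCH argument list (hypothetical index name)."""
    return [
        "FT.SEARCH", index, query,
        "PARAMS", "2", "vec", pack_vector(vector),
        "SORTBY", "score",
        "DIALECT", "2",
    ]


query = build_knn_query("ctx1", include=["s1", "s2"], exclude=["s9"], k=5)
args = build_search_args("idx:embeddings", query, [0.1, 0.2, 0.3])
```

With redis-py, a list like `args` could be sent using `redis.Redis().execute_command(*args)`.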

Please let me know if any further clarification or details are needed, thanks!

SteroidMan

1 Answer


I would treat the "context" as a prefix on the keys for your hashes. Say you have ctx1, ctx2 and ctx3. I would name the hash keys something like something:ctx:1:xxx, where xxx is the actual primary key of the hash and the number after ctx: is the context. Then I would create one index per context, using the prefix in the index creation, so you would have indices covering something:ctx:1, something:ctx:2 and something:ctx:3, each of which you could search individually. If there is a hierarchy of contexts, say something:ctx:1:2:xxx, I would decide based on usage whether to create individual indices for every sub-context (something:ctx:1:1, something:ctx:1:2, ... N) or just one for the parent/encompassing context. Read about ephemeral indices for some ideas: https://redis.com/blog/the-case-for-ephemeral-search/
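
A minimal sketch of the one-index-per-context layout, written as raw FT.CREATE arguments. The index name `idx:ctx:<n>`, the field names `subctx` and `embedding`, the HNSW/COSINE settings and the dimension are assumptions for illustration, not something taken from your setup.

```python
def hash_key(context_id, primary_key, base="something:ctx"):
    """Key for one embedding hash, e.g. something:ctx:1:xxx."""
    return f"{base}:{context_id}:{primary_key}"


def create_index_args(context_id, dim, base="something:ctx"):
    """Raw FT.CREATE arguments for an index limited to one context's keys."""
    # Trailing ':' so that the prefix for context 1 does not also match context 10.
    prefix = f"{base}:{context_id}:"
    return [
        "FT.CREATE", f"idx:ctx:{context_id}",   # hypothetical index name
        "ON", "HASH",
        "PREFIX", "1", prefix,
        "SCHEMA",
        "subctx", "TAG",                        # assumed field holding the sub-context ID
        "embedding", "VECTOR", "HNSW", "6",     # 6 = number of attribute tokens that follow
        "TYPE", "FLOAT32",
        "DIM", str(dim),
        "DISTANCE_METRIC", "COSINE",
    ]


# One index per context; each can then be searched on its own.
index_commands = [create_index_args(ctx, dim=768) for ctx in (1, 2, 3)]
```

Each list could be sent with redis-py via `redis.Redis().execute_command(*args)`. Keeping the sub-context as a TAG field inside the per-context index is one way to avoid creating an index per sub-context; whether that beats separate indices depends on how the data is queried.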

BSB