
For our use case we have to index about 100K documents, each containing one text field, which usually consists of a single word (e.g. a nickname), and several numeric fields. We also need to run prefix queries on the text field efficiently. For instance, we need to get all documents whose text field starts with "fan".

We have gone through the documentation, and it looks like using wildcard queries for prefix search could lead to severe performance issues, since a query like "fan*" could potentially be expanded into thousands of terms (we expect the dictionary to contain more than 10K items).

As an alternative and hopefully more efficient approach, we were thinking of using the RediSearch index only for the tag and numeric fields, and building our own index for the text field (using sets).

The idea would be to first retrieve from our own index all the document IDs matching the prefix query on the text field, then pass these IDs as arguments for the INKEYS parameter of the FT.SEARCH command.
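To make the idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `PrefixIndex` is a hypothetical in-memory stand-in for the set-based index (in practice it would be Redis sets), and the index name `idx:users`, the field `score`, and the `doc:` keys are made up. The only RediSearch-specific part is the `INKEYS {count} {key} ...` argument layout of FT.SEARCH.

```python
from collections import defaultdict


class PrefixIndex:
    """Toy in-memory stand-in for a set-based prefix index (hypothetical)."""

    def __init__(self):
        # Maps each lowercase prefix to the set of document keys having it.
        self._sets = defaultdict(set)

    def add(self, doc_key, text):
        text = text.lower()
        # Register the document under every prefix of its text value.
        for n in range(1, len(text) + 1):
            self._sets[text[:n]].add(doc_key)

    def lookup(self, prefix):
        # Sorted only to make the result deterministic.
        return sorted(self._sets.get(prefix.lower(), set()))


def build_search_args(index_name, query, keys):
    """Build the FT.SEARCH argument list with an INKEYS filter."""
    return ["FT.SEARCH", index_name, query, "INKEYS", str(len(keys)), *keys]


idx = PrefixIndex()
idx.add("doc:1", "fantasy")
idx.add("doc:2", "Fanatic")
idx.add("doc:3", "falcon")

# Step 1: resolve the prefix in our own index.
keys = idx.lookup("fan")

# Step 2: restrict the RediSearch query to those keys.
# "idx:users" and "@score" are assumed names, not part of our real schema.
args = build_search_args("idx:users", "@score:[10 100]", keys)
```

With a Redis client, the resulting list would be sent as a single command (for example via redis-py's `execute_command(*args)`), so the number of arguments grows linearly with the number of matching keys.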

However, we're wondering whether using INKEYS with hundreds or thousands of values would basically lead to the same performance issues.

Also, how exactly does INKEYS work? Is it efficient?

pAkY88

1 Answer


You can try using MAXPREFIXEXPANSIONS to improve the performance of your prefix queries. If the field holds a single term or only a few terms, another option is to declare it as TAG instead of TEXT when creating the index; TAG fields also perform better.
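As a rough sketch of both suggestions, the commands could look like the argument lists below. The index name `idx:users`, the `doc:` key prefix, the field names `nickname` and `score`, and the limit of 1000 are assumptions chosen for illustration, not values taken from your setup.

```python
# Hypothetical names: "idx:users", "doc:", "nickname", "score".

# Cap how many terms a prefix such as "fan*" may expand to.
# 1000 is an arbitrary illustrative value.
config_args = ["FT.CONFIG", "SET", "MAXPREFIXEXPANSIONS", "1000"]

# Declare the single-word field as TAG rather than TEXT.
create_args = [
    "FT.CREATE", "idx:users",
    "ON", "HASH",
    "PREFIX", "1", "doc:",
    "SCHEMA",
    "nickname", "TAG",
    "score", "NUMERIC",
]


def tag_prefix_query(field, prefix):
    """Build a prefix query against a TAG field, e.g. @nickname:{fan*}."""
    return "@" + field + ":{" + prefix + "*}"


search_args = ["FT.SEARCH", "idx:users", tag_prefix_query("nickname", "fan")]
```

Each list can be sent with any Redis client that accepts raw commands, for example redis-py's `execute_command(*args)`.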

Also, I would love to hear more about your use case; please reach out to me at adriano.amaral@redis.com

  • thank you! Unfortunately that option would make the search inaccurate as it would exclude many valid results. We thought about using tags instead of text, but if we're not mistaken using prefix search on tags would lead to the same performance issues as we would have hundreds or thousands of tags being used as a filter. – pAkY88 Mar 08 '23 at 08:34