I'm noticing that searches like *something consume huge amounts of CPU. I'm using Whoosh 2.4.1. I suppose this is because I don't have indexes covering this search case: something* works fine, but *something doesn't.

How do you deal with these queries? Is there a special way to declare your schema that makes this kind of query possible?

Thanks!

Giuc

1 Answer

That's a fairly fundamental problem: prefixes are usually easy to find (as when searching for foo*), but postfixes are not (as with *foo).

Prefix + wildcard searches are optimized: a fast prefix search runs first, and the slow wildcard match is then applied only to the results of that first step.
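To make the two-step idea concrete, here is a minimal pure-Python sketch over a sorted list of terms. It is not Whoosh's actual implementation; the helper names `prefix_range` and `expand_wildcard` are made up for illustration, and the standard-library `fnmatch` stands in for the wildcard matcher.

```python
import bisect
import fnmatch


def prefix_range(sorted_terms, prefix):
    """Return all terms starting with `prefix` from a sorted list.

    Binary search finds the first candidate; terms sharing a prefix
    are contiguous in sorted order, so we stop at the first mismatch.
    """
    start = bisect.bisect_left(sorted_terms, prefix)
    end = start
    while end < len(sorted_terms) and sorted_terms[end].startswith(prefix):
        end += 1
    return sorted_terms[start:end]


def expand_wildcard(sorted_terms, pattern):
    """Hypothetical helper: narrow by literal prefix, then wildcard-match."""
    # The literal prefix is everything before the first wildcard character.
    cut = len(pattern)
    for ch in "*?[":
        pos = pattern.find(ch)
        if pos != -1:
            cut = min(cut, pos)
    # With an empty prefix (e.g. "*foo") this returns every term,
    # so the wildcard match below has to scan the whole term list.
    candidates = prefix_range(sorted_terms, pattern[:cut])
    return [t for t in candidates if fnmatch.fnmatchcase(t, pattern)]


terms = sorted(["bar", "foo", "foobar", "food", "seafood", "tofoo"])
narrowed = expand_wildcard(terms, "foo*d")   # only scans terms starting with "foo"
full_scan = expand_wildcard(terms, "*foo")   # scans all terms
```

For a pattern like foo*d, the candidate set is just the terms beginning with foo. For *foo the literal prefix is empty, so every term is a candidate, which is where the CPU time goes.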

You can't apply that optimization to wildcard + postfix searches. But there is a trick:

If you really need this often, you could index a reversed copy of the string (and search it with the reversed query string), so the postfix search becomes a prefix search:

Something like:

add_document(title=title, title_rev=title[::-1])  # title_rev must also be a field in the schema
...
# then query = u"*foo"[::-1] (which is u"oof*"), and search in the title_rev field.
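The reversal trick can be sketched independently of Whoosh with a sorted list of reversed terms. This only illustrates the idea; `build_reversed_terms` and `suffix_search` are hypothetical names, and a real index would store the reversed values in a separate field such as title_rev above.

```python
import bisect


def build_reversed_terms(terms):
    """Store each term reversed, sorted so prefix lookups can use binary search."""
    return sorted(term[::-1] for term in terms)


def suffix_search(reversed_terms, suffix):
    """Find terms ending with `suffix` by prefix-searching the reversed terms."""
    rev_prefix = suffix[::-1]  # "foo" -> "oof"
    start = bisect.bisect_left(reversed_terms, rev_prefix)
    matches = []
    for rev in reversed_terms[start:]:
        if not rev.startswith(rev_prefix):
            break  # matching terms are contiguous, so we can stop early
        matches.append(rev[::-1])  # un-reverse to recover the original term
    return sorted(matches)


titles = ["bar", "foo", "foobar", "seafood", "tofoo"]
reversed_terms = build_reversed_terms(titles)
hits = suffix_search(reversed_terms, "foo")
```

The cost is extra index space for the reversed field, in exchange for postfix lookups that touch only the matching range of terms.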
Thomas Waldmann