I am using Whoosh to index and search through my documents. I developed a multi-field search, but I want to specify some "MUST" fields.

What I want is: when I search for a book with a query q1, it searches the title and summary, but I also want to specify filters like author = 'name of author' and category = "book's category".

The results must take the two 'MUST' fields into account and search on the other two.

Thank you for your help

noaai

2 Answers


The Searcher's search() method has a filter parameter:

https://whoosh.readthedocs.io/en/latest/searching.html#filtering-results

For filtering you can use a query like

'author:"Name of Author" AND category:Genre'

To build filtering queries automatically, let's assume we have sets authors and categories:

from whoosh.query import And, Or, Term
from whoosh.qparser import MultifieldParser

if categories:

    # Filter the search by selected categories:
    if len(categories) > 1:

        # Must be: Category1 OR Category2 OR Category3...
        cat_q = Or([Term('category', x) for x in categories])

    else:
        # If there's just 1 category in the set:
        cat_q = Term('category', next(iter(categories)))

    print('Query to filter categories:', cat_q)

if authors:
    # Filter the search by authors:
    if len(authors) > 1:

        # Must be: Author1 OR Author2 OR Author3...
        aut_q = Or([Term('author', x) for x in authors])

    else:
        # If there's just 1 author in the set:
        aut_q = Term('author', next(iter(authors)))

    print('Query to filter authors:', aut_q)

    # Now combine the two filters:

    if categories:

        # Both fields are used for filtering:
        final_filter = And([cat_q, aut_q])

    else:
        # Only authors:
        final_filter = aut_q

elif categories:
    # Only categories:
    final_filter = cat_q

else:
    # None:
    final_filter = None

print('final_filter:', final_filter)

# Now parse the user query for 2 fields
# (ix is the open index, s is a searcher on it, q1 is the user's query string):

parser = MultifieldParser(["title", "summary"], ix.schema)
query = parser.parse(q1)

if final_filter is None:
    results = s.search(query)

else:
    # Using the filter:
    results = s.search(query, filter=final_filter)

chang zhao

You can use Whoosh's MultifieldParser for this scenario:

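from whoosh import qparser
from whoosh.qparser import MultifieldParser

fields = ["title", "summary", "author", "category"]

# idx is the open index, q1 is the user's query string,
# and limit is the maximum number of hits to return.
query = MultifieldParser(fields, schema=idx.schema, group=qparser.OrGroup).parse(q1)
with idx.searcher() as searcher:
    results = searcher.search(query, limit=limit)
    ...  # process the results here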

The code above uses OrGroup, which searches all the listed fields and combines the query terms with the OR operator. You can customize the grouping to suit your needs; see the Whoosh query parser documentation for more on operators such as AND and NOT.

Dharman
Kekayan