I’m using Whoosh with Haystack. Haystack does not abstract the keyword extraction in Whoosh, so I’m using Whoosh directly for this feature.

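from haystack.forms import SearchForm

@property
def keywords(self):
    # Reach through Haystack to the underlying Whoosh backend.
    whoosh_backend = SearchForm().searchqueryset.query.backend
    if not whoosh_backend.setup_complete:
        whoosh_backend.setup()
    # Ask Whoosh for the key terms of this object's text in the 'text' field.
    with whoosh_backend.index.searcher() as searcher:
        return dict(searcher.key_terms_from_text('text', self.text, normalize=False))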

The result for this article:

{'attribut': 0.5051433470901747, 'donor': 0.5002458466718036, 'chariti': 1.0, 'factor': 0.5061597699646493, 'impact': 0.7091847877188343}

Unfortunately, the keyword extraction in Whoosh appears to return stemmed keywords. Is there a way to get unstemmed versions of them, for example the most frequent or the shortest surface form of each stem? (The spelling corrector somehow manages to return unstemmed words.) Thanks!
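To illustrate the kind of mapping I have in mind, here is a rough sketch and not a Whoosh feature: it tokenizes the original text, groups the surface words by their stem, and picks either the most frequent or the shortest word per stem. The names `toy_stem`, `surface_forms` and `unstem` are made up for this example, and `toy_stem` is only a stand-in so the snippet runs without Whoosh. I assume that the real stemmer (I believe `whoosh.lang.porter.stem`) could be passed in instead, and that the regex tokenization is close enough to what the field's analyzer does.

```python
import re
from collections import Counter, defaultdict


def toy_stem(word):
    # Hypothetical stand-in stemmer for illustration only; NOT Whoosh's Porter stemmer.
    if word.endswith("ies"):
        return word[:-3] + "i"
    if word.endswith("y"):
        return word[:-1] + "i"
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def surface_forms(text, stem):
    # Map each stem to a Counter of the lowercased words in the text that produce it.
    forms = defaultdict(Counter)
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        forms[stem(word)][word] += 1
    return forms


def unstem(keywords, text, stem, prefer="frequent"):
    # Replace each stemmed keyword by one surface form found in the text.
    forms = surface_forms(text, stem)
    result = {}
    for key, score in keywords.items():
        candidates = forms.get(key)
        if not candidates:
            # No surface form found: keep the stem unchanged.
            result[key] = score
        elif prefer == "shortest":
            result[min(candidates, key=lambda w: (len(w), w))] = score
        else:
            # Most frequent form; ties go to the shorter, then alphabetically first word.
            best = min(candidates, key=lambda w: (-candidates[w], len(w), w))
            result[best] = score
    return result


sample_text = "Charities and donors: a charity helps donors. Charities matter."
sample_keywords = {"chariti": 1.0, "donor": 0.5}
most_frequent = unstem(sample_keywords, sample_text, toy_stem)
shortest = unstem(sample_keywords, sample_text, toy_stem, prefer="shortest")
```

This only works if the stemmer passed in produces exactly the stems that the index stores, which is why I would prefer something built into Whoosh if it exists.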

Dawn Drescher
