def scorer(self, searcher, fieldname, text, qf=1):
    """Returns an instance of :class:`whoosh.scoring.Scorer` configured
    for the given searcher, fieldname, and term text.
    """

    raise NotImplementedError(self.__class__.__name__)

I do not understand the arguments of the scorer function. Where do they come from? The same goes for the function below. Also, if I want to get the term frequencies across the whole collection, not the weight in the current document, how can I do that?

def _score(self, weight, length):
    # Override this method with the actual scoring function
    raise NotImplementedError(self.__class__.__name__)
hellolee

1 Answer

I think what you need is whoosh.reading.TermInfo, which holds the global (collection-wide) information for a term. It is updated whenever new documents are indexed.

Since you want the term frequency across the whole collection, I think TermInfo.weight() will do it. Some sample code:

from whoosh.fields import Schema, TEXT
from whoosh.analysis import StemmingAnalyzer
from whoosh.filedb.filestore import FileStorage
from whoosh.qparser import QueryParser
from whoosh import scoring

schema = Schema(body=TEXT(analyzer=StemmingAnalyzer(), stored=True))

# Open an existing index stored in the "index" directory
storage = FileStorage("index")
ix = storage.open_index()

def user_weighting_func(searcher, fieldname, text, matcher):
    # TermInfo.weight() is the term's total frequency across the whole index
    return float(searcher.reader().term_info(fieldname, text).weight())

with ix.searcher(weighting=scoring.FunctionWeighting(user_weighting_func)) as searcher:
    qp = QueryParser("body", schema=schema)
    q = qp.parse("hello")
    results = searcher.search(q)
    for hit in results:
        print(hit.score, hit['body'])

In this code, for a single-term query like the one above, hit.score will be the global frequency of the query term rather than a per-document weight.

Meng Zhao