As I understand it, PyLucene now also offers BM25 similarity. I am using PyLucene 4.10.1, but I can't find any example of how to use BM25 instead of TF-IDF. Please advise.

Dreams

1 Answer

Use the setSimilarity method of IndexSearcher to set the retrieval model.

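import lucene

from java.nio.file import Paths
from org.apache.lucene.store import SimpleFSDirectory
from org.apache.lucene.index import DirectoryReader
from org.apache.lucene.search import IndexSearcher
from org.apache.lucene.search.similarities import BM25Similarity

# Placeholder: path to an existing Lucene index directory.
INDEX_DIR = "index"

# Start the JVM before using any Lucene class.
lucene.initVM(vmargs=['-Djava.awt.headless=true'])

# Opening the directory via Paths.get matches Lucene 5 and later;
# on 4.x, SimpleFSDirectory takes a java.io.File instead.
directory = SimpleFSDirectory(Paths.get(INDEX_DIR))
searcher = IndexSearcher(DirectoryReader.open(directory))

# Score with BM25 (default parameters k1=1.2, b=0.75) instead of the searcher's default similarity.
searcher.setSimilarity(BM25Similarity())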
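
BM25Similarity also has a two-argument constructor taking k1 and b, so it may help to see what those two parameters control. Below is a minimal pure-Python sketch of a textbook BM25 term score, not Lucene's actual implementation (Lucene additionally stores document lengths in a lossy, quantized form). The function name `bm25_term_score` and all of the input numbers are made up for illustration.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, num_docs, doc_freq,
                    k1=1.2, b=0.75):
    """Textbook BM25 score of a single query term in a single document.

    freq        -- occurrences of the term in the document
    doc_len     -- length of the document in tokens
    avg_doc_len -- average document length in the collection
    num_docs    -- number of documents in the collection
    doc_freq    -- number of documents containing the term
    k1          -- term-frequency saturation (higher = slower saturation)
    b           -- strength of length normalization (0 = none, 1 = full)
    """
    # Inverse document frequency; the "1 +" keeps it from going negative.
    idf = math.log(1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: documents longer than average are penalized.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Saturating term-frequency component, bounded above by (k1 + 1).
    tf_part = (freq * (k1 + 1.0)) / (freq + k1 * length_norm)
    return idf * tf_part


# Illustrative (made-up) statistics: same term frequency, different lengths.
short_doc = bm25_term_score(freq=2, doc_len=50, avg_doc_len=100,
                            num_docs=1000, doc_freq=10)
long_doc = bm25_term_score(freq=2, doc_len=400, avg_doc_len=100,
                           num_docs=1000, doc_freq=10)
print(short_doc > long_doc)
```

With b=0 the document length no longer affects the score, and a larger k1 lets repeated occurrences of a term keep adding to the score for longer before it levels off.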
Salias