I have a fairly complex Sphinx index.

Recently, most of my searches on one important word have been returning false positives (records whose text does not contain the word at all).

To see what was going on, I ran SHOW META to check whether a synonym or some other issue with the term was causing the false results.

However, SHOW META listed only one keyword, the one I entered.

total   100000
total_found 1254244
time    6.856
keyword[0]  book
docs[0] 1254244
hits[0] 3037375
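
For reference, the figures above can be compared with a few lines of Python. This is only a minimal sketch, assuming the SHOW META rows are pasted in as plain text; `parse_show_meta` is a made-up helper name, not part of Sphinx. Here `docs[0]` equals `total_found`, which at least suggests the index attributes "book" itself to every matched record rather than matching through some other expanded term.

```python
def parse_show_meta(text):
    """Turn whitespace-separated SHOW META rows into a dict of name -> value."""
    meta = {}
    for line in text.strip().splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace only
        if len(parts) == 2:
            meta[parts[0]] = parts[1].strip()
    return meta


# Rows copied from the SHOW META output above.
SAMPLE = """
total   100000
total_found 1254244
time    6.856
keyword[0]  book
docs[0] 1254244
hits[0] 3037375
"""

meta = parse_show_meta(SAMPLE)
total_found = int(meta["total_found"])
docs_for_keyword = int(meta["docs[0]"])

# True when the per-keyword document count accounts for every match.
keyword_explains_all = docs_for_keyword == total_found
print(meta["keyword[0]"], docs_for_keyword, total_found, keyword_explains_all)
```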

Yet the word actually appears in only a small fraction of the 1.25M+ records found.

I'm wondering whether there is some extension of, or alternative to, SHOW META in SphinxQL that gives more information, or where a good place to start looking for the cause of such an issue would be (I would expect the meta output to indicate it, but it does not).

I checked my cfg and the word is nowhere to be found (not mapped or referenced).

I checked the stopwords and exceptions files as well: same result.
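
The kind of check described here can be sketched as a small script. It is a rough illustration only: `find_term` is a hypothetical helper, and the file contents below are made-up stand-ins rather than my real cfg, stopwords, or exceptions. The match is case-insensitive and has no word boundaries on purpose, so entries that merely contain the term are flagged too.

```python
import re


def find_term(term, sources):
    """Return (source name, line number, line) for every line mentioning term.

    Case-insensitive and without word boundaries, so 'book' also flags
    lines containing 'ebook' or 'Books'.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    hits = []
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, lineno, line.strip()))
    return hits


# Hypothetical contents, NOT the real files from this question.
sources = {
    "sphinx.conf": "min_infix_len = 1\nhtml_strip = 1\n",
    "stopwords.txt": "the\nand\nof\n",
    "exceptions.txt": "AT & T => AT&T\ne-Book => ebook\n",
}

for name, lineno, line in find_term("book", sources):
    print(f"{name}:{lineno}: {line}")
```

For the real files, the dictionary values could be filled with `pathlib.Path(path).read_text()` for each path.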

The cfg settings are pretty basic:

    exceptions = /etc/sphinxsearch/lemmatizer/exceptions.txt
    stopwords = /etc/sphinxsearch/lemmatizer/stopwords.txt
    stopword_step = 0

    index_sp = 1
    min_word_len = 1
    min_infix_len = 1
    min_stemming_len = 1

    #index_field_lengths = 1

    html_strip = 1

    enable_star = 1

So I'm not sure where to even start looking for the issue, and was hoping there might be some other diagnostic tools more robust than SHOW META.
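
In the meantime, one client-side way to quantify the false positives is to pull a sample of the matched records and classify how the keyword occurs in each. The sketch below assumes the sample is available as `(id, text)` pairs; the records shown are invented for illustration, and `classify_match` and `summarize` are hypothetical helpers, not Sphinx functions.

```python
import re


def classify_match(keyword, text):
    """Classify how keyword occurs in text: 'whole_word', 'substring' or 'absent'."""
    kw = re.escape(keyword)
    if re.search(rf"\b{kw}\b", text, re.IGNORECASE):
        return "whole_word"
    if re.search(kw, text, re.IGNORECASE):
        return "substring"  # e.g. an infix match inside a longer word
    return "absent"


def summarize(keyword, records):
    """Count each kind of match and collect the ids of true false positives."""
    counts = {"whole_word": 0, "substring": 0, "absent": 0}
    absent_ids = []
    for doc_id, text in records:
        kind = classify_match(keyword, text)
        counts[kind] += 1
        if kind == "absent":
            absent_ids.append(doc_id)
    return counts, absent_ids


# Hypothetical sample of matched records, for illustration only.
records = [
    (1, "A book about search engines"),
    (2, "Facebook and bookkeeping news"),
    (3, "Nothing relevant here at all"),
]

counts, absent_ids = summarize("book", records)
print(counts, absent_ids)
```

A large "substring" count would point at infix matching, while a large "absent" count would point at something that happens before or during indexing. Note that this compares against the stored text, which may differ from what was indexed when html_strip is on.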

user3649739
  • The main thing that jumps out is enabled infix, although without expand_keywords, it shouldn't be automatically matching part words – barryhunter Mar 02 '22 at 17:54
  • @barryhunter I checked the false-positive documents to see if they contained a partial word, and that wasn't the issue either. – user3649739 Mar 03 '22 at 09:23
  • @barryhunter I turned morphology off entirely, as I don't like its unexpected behavior, and I manually index the required stems. I suppose if this is the case I can turn infix off? It is quite the mystery now without any other tool. I checked my entire config for any word even containing the word in question (e.g. in case I did a regex_replace w/o \b) and found nothing. I don't use wordforms anymore, as I found the config gave me more control, and all that is left is exceptions/stopwords, which don't contain it either. – user3649739 Mar 03 '22 at 09:32
