I am using Django 1.5 with Haystack 2.1.0.

While using auto_query on some of my models, I found the following behavior.

test_search = "charles ken"

SearchQuerySet().models(Foo, FooSome, FooGone).auto_query(test_search) 

The above query gives multiple results.

test_search = "charles k"

SearchQuerySet().models(Foo, FooSome, FooGone).auto_query(test_search)

The above query gives no results. What am I doing wrong?

Edit:

The field in question uses the edge_ngram field type:

    <fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      </analyzer>
    </fieldType>
Abhijit Bashetti
Akash Deshpande

1 Answer

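You need to change minGramSize to 1 in the index analyzer. With minGramSize="2", no gram of length 1 is ever indexed, so the query term "k" has nothing to match: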

    <fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      </analyzer>
    </fieldType>

However, I would recommend keeping minGramSize at 2 or 3, since that avoids generating too many index terms.
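
To make the trade-off concrete, here is a rough pure-Python sketch of what the index analyzer does. It is a simplification, not Solr itself: it skips WordDelimiterFilterFactory, assumes every query term is required (AND), and uses a hypothetical document value "Charles Kennedy". The helper names edge_ngrams, index_terms and matches are made up for illustration.

```python
def edge_ngrams(token, min_gram, max_gram=15):
    """Front edge n-grams of one token (roughly EdgeNGramFilterFactory, side="front")."""
    # Tokens shorter than min_gram produce no grams at all.
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def index_terms(text, min_gram):
    """Index-side analysis: whitespace split, lowercase, then edge n-grams."""
    terms = set()
    for token in text.lower().split():
        terms |= edge_ngrams(token, min_gram)
    return terms


def matches(query, text, min_gram):
    """Query side has no n-gram filter, so each query token must be an indexed term.

    Assumption: all query terms are required (AND).
    """
    indexed = index_terms(text, min_gram)
    return all(token in indexed for token in query.lower().split())


doc = "Charles Kennedy"  # hypothetical indexed value

print(matches("charles ken", doc, min_gram=2))  # True
print(matches("charles k", doc, min_gram=2))    # False: "k" was never indexed
print(matches("charles k", doc, min_gram=1))    # True
print(len(index_terms(doc, 2)), len(index_terms(doc, 1)))  # 12 14
```

Each indexed word gains one extra term when minGramSize drops from 2 to 1, and single-character terms match a very large share of documents, which is why a larger minimum is usually preferred.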

Abhijit Bashetti
  • What if I added an EdgeNGramFilter on the query side? I tried that, and it worked perfectly for the scenario in question. The only drawback I can see is that every query will be broken into parts before it is searched. – Akash Deshpande Jun 11 '15 at 13:31
  • You can add that, but I don't think it's really required at the query end, because it will give you more results, including ones you may not expect. You are correct about the drawback: every query word will be broken up and, as I said, you will get more results that are not relevant to you. – Abhijit Bashetti Jun 11 '15 at 13:36
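
The extra, less relevant results mentioned in the comments can be illustrated with a small pure-Python sketch. It is a simplification and makes several assumptions: the grams produced from one query word are treated as alternatives (any one of them may match), query words that yield no grams are dropped, minGramSize is 2 on both sides, and the three document values are hypothetical. The helper names are made up for illustration.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Front edge n-grams of one token; tokens shorter than min_gram yield nothing."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def analyze(text, min_gram=2):
    """Whitespace split, lowercase, then edge n-grams for every token."""
    grams = set()
    for token in text.lower().split():
        grams |= edge_ngrams(token, min_gram)
    return grams


def matches_with_query_ngrams(query, text, min_gram=2):
    """Match when every query word shares at least one gram with the indexed text.

    Assumption: grams from the same query word act as alternatives (OR),
    and a word that produces no grams (e.g. "k" with min_gram=2) is ignored.
    """
    indexed = analyze(text, min_gram)
    for token in query.lower().split():
        grams = edge_ngrams(token, min_gram)
        if grams and not (grams & indexed):
            return False
    return True


docs = ["Charles Kennedy", "Chris Brown", "Anna Kent"]  # hypothetical values
hits = [d for d in docs if matches_with_query_ngrams("charles k", d)]
print(hits)  # ['Charles Kennedy', 'Chris Brown']
```

Under these assumptions "charles k" now finds the expected document, but "Chris Brown" also matches, purely because it shares the short gram "ch" with "charles".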