Django-Haystack autoquery gives strange results with solr backend

Question

I am using Django 1.5 with Haystack 2.1.0.

While using auto_query on one of my models, I found the following behavior.

test_search = "charles ken"

SearchQuerySet().models(Foo, FooSome, FooGone).auto_query(test_search)

The above query gives multiple results.

test_search = "charles k"

SearchQuerySet().models(Foo, FooSome, FooGone).auto_query(test_search)

The above query gives no results. What am I doing wrong?

Edit:

The field in question uses the edge_ngram field type:

<fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      </analyzer>
    </fieldType>

Does this involve schema.xml? If so, what field type is used for the field you are searching? — Abhijit Bashetti, Jun 10 '15 at 14:23
@AbhijitBashetti I have edited the question to include the field types. — Akash Deshpande, Jun 10 '15 at 14:52
OK. Which field uses which type? It would help if you shared the fieldType details, such as the analyzer, tokenizer, and filters it consists of. — Abhijit Bashetti, Jun 10 '15 at 14:55

score 1 · Answer 1 · answered Jun 11 '15 at 05:59

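You need to change minGramSize to 1 in the index analyzer. With minGramSize="2" the index contains no single-character grams, so the query term "k" has nothing to match, while "ken" matches a three-character gram.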

<fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      </analyzer>
    </fieldType>

However, I would recommend keeping minGramSize at 2 or 3, as that avoids generating too many index terms.
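
To see why the second query comes back empty, here is a rough Python simulation of the two analyzers. It is only a sketch, not Solr's implementation: it skips WordDelimiterFilterFactory, assumes the query terms are combined with AND, and uses a made-up document ("Charles Kennedy"). All function names are hypothetical.

```python
def edge_ngrams(token, min_gram, max_gram):
    """Front edge n-grams of one token (prefixes of length min_gram..max_gram)."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def index_terms(text, min_gram, max_gram=15):
    """Simplified index analyzer: whitespace split, lowercase, edge n-grams."""
    terms = set()
    for token in text.lower().split():
        terms |= edge_ngrams(token, min_gram, max_gram)
    return terms


def query_terms(text):
    """Simplified query analyzer: whitespace split and lowercase, no n-grams."""
    return text.lower().split()


def matches(query, document, min_gram):
    """True if every query term is among the indexed terms (AND is assumed)."""
    indexed = index_terms(document, min_gram)
    return all(term in indexed for term in query_terms(query))


doc = "Charles Kennedy"  # hypothetical indexed value

print(matches("charles ken", doc, min_gram=2))  # True
print(matches("charles k", doc, min_gram=2))    # False
print(matches("charles k", doc, min_gram=1))    # True
```

With minGramSize=2 the shortest term stored for "kennedy" is "ke", so the bare "k" never matches. Lowering it to 1 adds "k" (and "c") to the index, which is also why the index grows.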

What if I added an EdgeNGramFilter on the query side? I tried that, and it worked perfectly for the scenario in question. The only drawback I can see is that every query will be broken into n-grams before it is searched. — Akash Deshpande, Jun 11 '15 at 13:31
You can add that, but I don't think it's really required on the query side, because it will give you more results than you expect. You are correct about the drawback: every query word will be broken up and, as I said, you will get more results that are not relevant to you. — Abhijit Bashetti, Jun 11 '15 at 13:36
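
The trade-off discussed in these comments can be sketched the same way. This is a simplified model with several assumptions that are not from the thread: words shorter than minGramSize produce no grams and are dropped, a query word matches if any one of its grams is indexed, and the documents are invented. How Solr actually combines the grams depends on the query parser and version.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Front edge n-grams of one token; empty if the token is too short."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def analyze(text):
    """One gram set per word; words that yield no grams are dropped (assumed)."""
    gram_sets = [edge_ngrams(word) for word in text.lower().split()]
    return [grams for grams in gram_sets if grams]


def matches(query, document):
    """Query-side n-grams: a word matches if ANY of its grams is indexed."""
    indexed = set().union(*analyze(document))
    return all(grams & indexed for grams in analyze(query))


# Hypothetical documents
print(matches("charles k", "Charles Kennedy"))  # True: "k" is dropped, "charles" matches
print(matches("charles k", "Chris Brown"))      # True: the gram "ch" also matches "chris"
print(matches("charles k", "Bob Brown"))        # False
```

Under these assumptions the original problem goes away, but only because "k" is ignored and "charles" is loosened to "anything starting with ch", which is the kind of extra, irrelevant result the comment above warns about.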
