Solr 6.4.2, filter documents with startswith string

Question

How can I filter documents by a field that starts with a given string? Right now I get every document where the field contains any word that starts with the string. Ideally, documents whose field starts with the string would come first, followed by the rest, ordered by how closely they match the filter. Thanks.
For example, I currently get:

company_name:(max*)
result : ['Min & Max', 'Maximum speed', 'Mirana max parrot']

But I want:

company_name:(max*)
result : ['Maximum speed', 'Min & Max', 'Mirana max parrot']

This is my current config for the text field:

    <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" />
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" />
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
    </fieldType>

score 0 · Answer 1 · answered Feb 13 '17 at 15:44

You're going to have to use a string field (implemented as StrField) or a TextField with a KeywordTokenizer as the tokenizer class.

The reason is that the wildcard is matched against individual tokens, so when the string is split into multiple tokens, any one of them can match the wildcard. A string field keeps the whole value as a single token, and so does the KeywordTokenizer, but the KeywordTokenizer also lets you add filters to process the value, such as lowercasing it before the token is stored.
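
As a rough sketch of the KeywordTokenizer variant, something like the following could go in the schema. The names `text_prefix` and `company_name_prefix` are made up for illustration, and the `copyField` assumes the original `company_name` field stays in place for regular full-text search:

```xml
<!-- Hypothetical field type: the whole value becomes one lowercased token -->
<fieldType name="text_prefix" class="solr.TextField">
  <analyzer>
    <!-- KeywordTokenizer emits the entire input as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase so that max* also matches "Maximum speed" -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field name; filled from company_name at index time -->
<field name="company_name_prefix" type="text_prefix" indexed="true" stored="false"/>
<copyField source="company_name" dest="company_name_prefix"/>
```

After reindexing, `company_name_prefix:max*` should match only 'Maximum speed' from the examples in the question. To get the ordering the question asks for, one option (an assumption, not something tested here) is to query both fields and boost the prefix field, e.g. `company_name_prefix:max*^10 OR company_name:max*`, so that values starting with the string rank ahead of values that merely contain a matching word.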

score 0 · Answer 2 · answered Feb 14 '17 at 03:34

If you are using EdgeNGrams, you don't need the * in the query; just give your prefix. Also, the EdgeNGram filter should only be in the index analyzer, not in the query analyzer. At the moment the query is n-grammed as well, so you are effectively matching on the first 3 characters regardless of the rest.

I suggest you fix those two things and try again (reload the core; there is no need to reindex, since the indexing pipeline did not change).
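
For illustration, this is roughly what the query analyzer from the question would look like with only the EdgeNGramFilterFactory line removed. Everything else is copied unchanged from the question; whether the remaining stemming filters suit prefix matching is not something checked here:

```xml
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.EnglishPossessiveFilterFactory"/>
  <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
  <filter class="solr.EnglishMinimalStemFilterFactory"/>
  <!-- EdgeNGramFilterFactory removed: n-grams are produced at index time only -->
  <filter class="solr.PorterStemFilterFactory"/>
</analyzer>
```

The query then becomes `company_name:(max)` with no wildcard, since the indexed edge n-grams already cover the prefixes.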
