
My code is as below:

SolrQuery query = new SolrQuery();
query.setQuery(q.trim());
try {
    QueryResponse res = getSolrServer().query(query);
    return res.getResults();
} catch (SolrServerException sse) {
    log.error(sse);
}

The problem is that a query of 3 or more characters returns results, e.g. the query string "che" returns results, but the query string "ch" returns nothing. Is there a way I can override Solr's 3-character minimum query length?

Is the XML below causing the problem? If so, can I override it programmatically from Java?

<analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="50" />
    <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>

I appreciate the help in advance.

Thanks and Regards,

Vaibhav

manuell
vaibhav
  • Solr matches against what has been indexed. You probably have a match for che but not for ch. Can you provide more insight into what is being indexed and how? – Jayendra Aug 23 '12 at 10:02
  • che is just an example. The problem is that any string of 3 or more characters is searchable, but a string shorter than 3 characters returns no results. – vaibhav Aug 23 '12 at 10:05
  • What's your configuration for the field? There is no such limit unless you are using LengthFilter in the analyzer chain. – Jayendra Aug 23 '12 at 10:10
  • Can you provide me with one such example? I appreciate your help on this. – vaibhav Aug 23 '12 at 11:23

1 Answer


NGramTokenizerFactory:
Default behavior (minGramSize="1", maxGramSize="2"). Note that this tokenizer operates over the whole field. It does not break the field at whitespace. As a result, the space character is included in the grams.

<analyzer>
  <tokenizer class="solr.NGramTokenizerFactory"/>
</analyzer>

In: "hey man"

Out: "h", "e", "y", " ", "m", "a", "n", "he", "ey", "y ", " m", "ma", "an"

So with your configuration, minGramSize="3" maxGramSize="50", grams shorter than 3 characters are never generated and therefore never reach the index.

Because a two-character term such as "ch" is not in the index, a two-character query can never match. You would need to change minGramSize to 2 and reindex to make such queries searchable.
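
To make the point above concrete, here is a minimal sketch in plain Java of how n-grams are produced for a given minGramSize and maxGramSize. It is an illustration only, not Lucene's actual implementation; the class name NGramSketch and the sample value "cheese" are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified stand-in for what an n-gram tokenizer emits.
class NGramSketch {

    // Returns every substring whose length is between minGram and maxGram,
    // shortest grams first. The text is treated as one unit, spaces included.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int size = minGram; size <= maxGram && size <= text.length(); size++) {
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Sizes 1..2 reproduce the "hey man" example above.
        System.out.println(ngrams("hey man", 1, 2));

        // With minGramSize=3, the two-character gram "ch" is never produced.
        List<String> indexed = ngrams("cheese", 3, 50);
        System.out.println(indexed.contains("che")); // true
        System.out.println(indexed.contains("ch"));  // false
    }
}
```

Since "ch" is never written to the index, nothing on the query side (including the SolrJ code in the question) can make it match.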

Jayendra
  • But can this be overridden in the Java code posted in the question? – vaibhav Aug 23 '12 at 10:49
  • No. Since the terms are not in the index, they can never be searched. You would need to change minGramSize to 2 to make them searchable. – Jayendra Aug 23 '12 at 10:50
  • Can we switch analyzers depending on the language? For languages like Chinese and Japanese we want to provide search on a single character. – vaibhav Aug 23 '12 at 10:52
  • You can define a separate field for each language and search the field that matches the search language. – Jayendra Aug 23 '12 at 10:54
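
The per-language field suggestion in the last comment could be sketched on the client side roughly as follows. The field names text_zh, text_ja and text_en are hypothetical; each would need its own field type and analyzer defined in the schema, which this sketch does not cover.

```java
import java.util.Map;

// Illustrative only: picks a search field based on the search language.
class LanguageFieldSketch {

    // Hypothetical field names, assumed to exist in the schema with
    // language-appropriate analyzers (e.g. a smaller minGramSize for CJK).
    static final Map<String, String> FIELD_BY_LANGUAGE = Map.of(
            "zh", "text_zh",
            "ja", "text_ja");

    // Fallback field for languages without a dedicated entry.
    static final String DEFAULT_FIELD = "text_en";

    static String fieldFor(String languageCode) {
        return FIELD_BY_LANGUAGE.getOrDefault(languageCode, DEFAULT_FIELD);
    }

    // Builds a fielded query string such as "text_zh:term".
    // Note: the term is not escaped here.
    static String buildQuery(String languageCode, String term) {
        return fieldFor(languageCode) + ":" + term.trim();
    }
}
```

The resulting string could then be passed to query.setQuery(...) as in the question. Real user input would also need Solr's special query characters escaped first; SolrJ's ClientUtils.escapeQueryChars is one way to do that.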