Short version:
Does anyone know if something changed with EdgeNGramFilterFactory in Solr 5? It used to work fine on Solr 4, but I just upgraded to Solr 5 and the cores whose fields use this filter refuse to load.
Long story:
This configuration used to work in Solr 4.10 (schema.xml):
<field name="NAME" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="PP" type="text_prefix" indexed="true" stored="false" required="false" multiValued="false"/>
<copyField source="NAME" dest="PP"/>
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
As far as I can tell from the documentation, this configuration is correct, although the documentation does not clearly say whether it applies to Solr 4 or Solr 5.
However, when I try to add a collection using this configuration, it fails with the following message:
<lst name="failure">
<str>
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://localhost:8983/solr: Error CREATEing SolrCore 'test_collection': Unable to create core [test_collection] Caused by: Unknown parameters: {side=front}</str>
</lst>
I removed the "unknown" side=front parameter, started from scratch, and it worked, meaning no more errors.
So, while this used to work on Solr 4 without any additional change, it no longer works on Solr 5. Did something change? Did I miss any documentation regarding this filter? Is there an extra library I need to load to make this work?
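To be explicit about the change, this is a sketch of the index analyzer as it looks after dropping the attribute. It is reconstructed from the schema above with nothing else altered; whether it still produces front-edge grams on Solr 5 is exactly what I am unsure about.

```xml
<!-- Index-time analyzer from the schema above; the only change is that
     side="front" has been dropped from the EdgeNGram filter. -->
<analyzer type="index">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
</analyzer>
```

The query analyzer and the field definitions stay exactly as posted.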
And finally, if this is the intended behavior (bug, feature, whatever), is there any workaround to get this "side-substring" indexing functionality without having to generate the values myself when I add docs to Solr?
Update: with the "hacked" schema (i.e. without side=front), I indexed the documents and changed the PP field to be stored. When I searched, it looked like the entire value had been indexed. For example, for NAME:ELEPHANT, I found PP:ELEPHANT
...
PP something like 'E', 'EL', 'ELE'. Also, yes, later on, I did the search 'PP:ELE' and found 'ELEPHANT' - but I had no real explanation on why.
– dcg Mar 03 '15 at 09:21