I attempted to upgrade from Hibernate Search 5.8.0.CR1 to 5.8.2.Final and from ElasticSearch 2.4.2 to 5.6.4.
When I run my application I'm getting the following error:
Status: 400 Bad Request
Error message: {"root_cause":[{"type":"illegal_argument_exception",
"reason":"Fielddata is disabled on text fields by default.
Set fielddata=true on [title] in order to load fielddata in memory by uninverting the inverted index.
Note that this can however use significant memory. Alternatively use a keyword field instead."}]
I read about Fielddata here: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/fielddata.html#_fielddata_is_disabled_on_literal_text_literal_fields_by_default However, I'm not sure how to address this issue, especially from Hibernate Search.
My title field definition looks like this:
@Field(name = "title", analyzer = @Analyzer(definition = "my_collation_analyzer"))
@Field(name = "title_polish", analyzer = @Analyzer(definition = "polish"))
protected String title;
I'm using the following analyzer definition:
@AnalyzerDef(name = "my_collation_analyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(name = "polish_collation", factory = ElasticsearchTokenFilterFactory.class, params = {
            @org.hibernate.search.annotations.Parameter(name = "type", value = "'icu_collation'"),
            @org.hibernate.search.annotations.Parameter(name = "language", value = "'pl'")
        })
    })
(The polish analyzer comes from the analysis-stempel plugin.)
The Elasticsearch notes on Fielddata recommend changing the type of the field from text to keyword, or setting fielddata=true, but I'm not sure how to do it using Hibernate Search annotations because there are no such properties in the @Field annotation.
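As far as I understand the Elasticsearch notes, the two options correspond to raw mapping fragments roughly like the ones built below. This is only a sketch in plain Java that assembles the JSON as strings; the class and method names are hypothetical, and it is not how Hibernate Search itself generates the mapping.

```java
public class FielddataMappingSketch {

    // Option 1: keep the field as "text" and enable fielddata.
    // The Elasticsearch error message warns that this can use significant memory.
    static String textWithFielddata(String field) {
        return "{\"properties\":{\"" + field + "\":{\"type\":\"text\",\"fielddata\":true}}}";
    }

    // Option 2: map the field as "keyword" instead of "text".
    static String keyword(String field) {
        return "{\"properties\":{\"" + field + "\":{\"type\":\"keyword\"}}}";
    }

    public static void main(String[] args) {
        // Print both candidate mappings for the "title" field from the error message.
        System.out.println(textWithFielddata("title"));
        System.out.println(keyword("title"));
    }
}
```

What I can't figure out is which Hibernate Search annotation, if any, makes the generated mapping look like either of these.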
Update:
Thank you very much for the help on this. I changed my code to this:
@NormalizerDef(name = "my_collation_normalizer",
    filters = {
        @TokenFilterDef(name = "polish_collation_normalization", factory = ElasticsearchTokenFilterFactory.class, params = {
            @org.hibernate.search.annotations.Parameter(name = "type", value = "'icu_collation'"),
            @org.hibernate.search.annotations.Parameter(name = "language", value = "'pl'")
        })
    })
...
@Field(name = "title_for_search", analyzer = @Analyzer(definition = "polish"))
@Field(name = "title_for_sort", normalizer = @Normalizer(definition = "my_collation_normalizer"))
@SortableField(forField = "title_for_sort")
protected String title;
Is it ok? As I understand it, there should be no tokenization in a normalizer, but I'm not sure what else to use instead of @TokenFilterDef and factory = ElasticsearchTokenFilterFactory.class (?).
Unfortunately I'm also getting the following error:
Error message: {"root_cause":
[{"type":"illegal_argument_exception",
"reason":"Custom normalizer [my_collation_normalizer] may not use filter
[polish_collation_normalization]"}]
I need collation for sorting, as described in my previous question here: ElasticSearch - define custom letter order for sorting
Update 2:
I tested ElasticSearch version 5.6.5 and I think it allows icu_collation in normalizers (my annotations were accepted).