I need to perform highlighting for multiple words into the same field but for each one using a specific formatter (prefix and postfix).
Let's say that I have the description
field and for a document it has the value: Einstein always excelled at math and physics from a young age
. How to highlight math
with a pair of a specific prefix and postfix AND ALSO physics
with a different prefix-postfix pair? So, in the end I would like to obtain:
Einstein always excelled at <em class="hl-red">math</em> and <em class="hl-green">physics</em> from a young age
The reason is that in the frontend I have different CSS classes with background-color: red;
for hl-red
and background-color: green
for hl-green
for example.
However, I was managed to highlight multiple words into the same field but with the same prefix-postfix pair all over the places, which is not what I want actually. In addition, I tried to add multiple HtmlFormatter entries in solrconfig.xml:
<highlighting>
..............
<formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
<lst name="defaults">
<str name="hl.simple.pre"><![CDATA[<em>]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
</lst>
<lst name="hl-red">
<str name="hl.simple.pre"><![CDATA[<em class="hl-red">]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
</lst>
<lst name="hl-green">
<str name="hl.simple.pre"><![CDATA[<em class="hl-green">]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
</lst>
</formatter>
..............
</highlighting>
but I got HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Unknown formatter: hl-green
. Also, I didn't find a way to specify an array of prefixes in Solr Admin UI nor in spring-data-solr, just a simple query like this:
SimpleHighlightQuery query = new SimpleHighlightQuery(Objects.requireNonNull(criteria));
HighlightOptions highlightOptions = new HighlightOptions()
.addFields(fields)
.setSimplePrefix(prefix)
.setSimplePostfix(postfix);
query.setHighlightOptions(highlightOptions);
query.setPageRequest(pageable);
return solrTemplate.queryForHighlightPage(MY_CORE, query, MyModel.class);
My assumption is that it is a limitation of the Solr itself.
I was thinking about to write a custom fragmentsBuilder but I do not know exactly if it is the case nor how to do that. For another workaround I was thinking to execute for each word a highlight query, then to store the result, then to execute for the second word another highlight query, store the result and so on. But I don't think it is a good and elegant solution because I will have problems when executing the second query if the second word is: "em" or "class" or "red"/"green" (nested undesired highlighting will occur).
I am using spring-data-solr into a Spring Boot application and Solr 6.6.5 as a (http) service.
Does anyone know how to solve this? Please give me an advice! Any idea will be much appreciated!