
I need to highlight multiple words in the same field, each with its own formatter (prefix and postfix).

Say I have a description field and, for one document, its value is: Einstein always excelled at math and physics from a young age. How can I highlight math with one prefix/postfix pair and physics with a different pair? In the end I would like to obtain:

Einstein always excelled at <em class="hl-red">math</em> and <em class="hl-green">physics</em> from a young age

The reason is that the frontend has a different CSS class for each highlight, for example background-color: red; for hl-red and background-color: green; for hl-green.
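To pin down the target output, here is a minimal sketch in plain Java of the transformation I am after. It is independent of Solr and is not a Solr API: the class name `TargetHighlight`, the method `highlight`, and the term-to-class map are all hypothetical names made up for illustration, and it assumes simple whole-word, case-insensitive matching on text that needs no HTML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TargetHighlight {

    // Wraps every whole-word occurrence of a term in <em class="...">,
    // choosing the CSS class per term. Keys of the map must be lowercase.
    // All terms are matched in ONE pass over the original text, so the
    // inserted markup is never scanned again.
    static String highlight(String text, Map<String, String> cssClassByTerm) {
        if (cssClassByTerm.isEmpty()) {
            return text;
        }
        String alternation = cssClassByTerm.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile("\\b(" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE);

        Matcher matcher = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String term = matcher.group(1);
            String cssClass = cssClassByTerm.get(term.toLowerCase(Locale.ROOT));
            String wrapped = "<em class=\"" + cssClass + "\">" + term + "</em>";
            matcher.appendReplacement(out, Matcher.quoteReplacement(wrapped));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> cssClassByTerm = new LinkedHashMap<>();
        cssClassByTerm.put("math", "hl-red");
        cssClassByTerm.put("physics", "hl-green");
        System.out.println(highlight(
                "Einstein always excelled at math and physics from a young age",
                cssClassByTerm));
    }
}
```

This only describes the result I want; it ignores analyzers, stemming and fragmenting, which is why I would prefer Solr to do the highlighting itself.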

So far I have managed to highlight multiple words in the same field, but only with the same prefix/postfix pair everywhere, which is not what I want. I also tried adding multiple prefix/postfix entries to the HtmlFormatter in solrconfig.xml:

    <highlighting>
      ..............
      <formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
        <lst name="defaults">
          <str name="hl.simple.pre"><![CDATA[<em>]]></str>
          <str name="hl.simple.post"><![CDATA[</em>]]></str>
        </lst>
        <lst name="hl-red">
          <str name="hl.simple.pre"><![CDATA[<em class="hl-red">]]></str>
          <str name="hl.simple.post"><![CDATA[</em>]]></str>
        </lst>
        <lst name="hl-green">
          <str name="hl.simple.pre"><![CDATA[<em class="hl-green">]]></str>
          <str name="hl.simple.post"><![CDATA[</em>]]></str>
        </lst>
      </formatter>
      ..............
    </highlighting>

but I got:

    HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Unknown formatter: hl-green

Also, I didn't find a way to specify an array of prefixes, either in the Solr Admin UI or in spring-data-solr. All I have is a simple query like this:

    SimpleHighlightQuery query = new SimpleHighlightQuery(Objects.requireNonNull(criteria));
    // One prefix/postfix pair applies to every highlighted term
    HighlightOptions highlightOptions = new HighlightOptions()
            .addFields(fields)
            .setSimplePrefix(prefix)
            .setSimplePostfix(postfix);
    query.setHighlightOptions(highlightOptions);
    query.setPageRequest(pageable);
    return solrTemplate.queryForHighlightPage(MY_CORE, query, MyModel.class);

My assumption is that this is a limitation of Solr itself.

I was thinking about writing a custom fragmentsBuilder, but I do not know whether that is the right approach or how to do it. As another workaround, I was thinking of running one highlight query per word: run the query for the first word, store the result, then run another highlight query for the second word, store the result, and so on. But I don't think that is a good or elegant solution, because the second query will cause problems if the second word is "em", "class", or "red"/"green" (unwanted nested highlighting will occur).

I am using spring-data-solr in a Spring Boot application, with Solr 6.6.5 running as an HTTP service.

Does anyone know how to solve this? Any advice or ideas would be much appreciated!
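To illustrate the nesting concern, here is a minimal sketch in plain Java. It does not call Solr at all: a naive string replacement (the hypothetical helper `highlightPass`) stands in for one highlight pass, and it assumes the second pass runs over the already-highlighted text from the first. Solr's real highlighter works on analyzed tokens, so the exact behavior there may differ.

```java
public class NestedHighlightDemo {

    // Naive stand-in for one highlight pass: wraps every literal
    // occurrence of `term` in <em class="cssClass">...</em>.
    static String highlightPass(String text, String term, String cssClass) {
        return text.replace(term, "<em class=\"" + cssClass + "\">" + term + "</em>");
    }

    public static void main(String[] args) {
        String doc = "Einstein always excelled at math";

        // First pass: highlight "math" in red.
        String first = highlightPass(doc, "math", "hl-red");

        // Second pass over the stored result: the search word "red"
        // now also matches inside the class name "hl-red".
        String second = highlightPass(first, "red", "hl-green");

        System.out.println(second);
        // prints: Einstein always excelled at <em class="hl-<em class="hl-green">red</em>">math</em>
    }
}
```

The markup from the first pass gets corrupted by the second one, which is the kind of nested highlighting I want to avoid.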

Paul Stoia