
I have a Solr 6.6.0 instance running and have indexed some documents (PDF and HTML). Previously I had Solr 4, and searching with highlighted results worked fine. Unfortunately this (default) behaviour seems to have disappeared in v6. The setup is the default one from the original Solr tutorial. I played around with a lot of GET parameters but cannot get highlighted content. I would appreciate any hint or tip to get this running. Am I missing some config changes or parameters?

E.g.

http://serv1:8983/solr/gettingstarted/select?wt=json&indent=true&q=betreten&hl=true&hl.method=unified

gives

{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":152,
    "params":{
      "q":"betreten",
      "hl":"true",
      "indent":"true",
      "hl.method":"unified",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"maxScore":0.822483,"docs":[
      {
        "id":"/var/docs/2017/08/22/2319/page-1.html",
        "stream_size":[3820],
        "x_parsed_by":["org.apache.tika.parser.DefaultParser",
          "org.apache.tika.parser.html.HtmlParser"],
        "stream_content_type":["text/html"],
        "dc_title":["/var/docs/2017/08/22/2319/page-1.html (22.08.2017 23:19)"],
        "ocr_system":["tesseract 3.04.01"],
        "content_encoding":["UTF-8"],
        "content_type_hint":["text/html; charset=utf-8"],
        "resourcename":["/var/docs/2017/08/22/2319/page-1.html"],
        "title":["/var/docs/2017/08/22/2319/page-1.html (22.08.2017 23:19)"],
        "content_type":["application/xhtml+xml; charset=UTF-8"],
        "ocr_capabilities":["ocr_page ocr_carea ocr_par ocr_line ocrx_word"],
        "_version_":1576604407523442688}]
  },
  "highlighting":{
    "/var/docs/2017/08/22/2319/page-1.html":{
      "_text_":[]}}}

Thank you!

Aviator

1 Answer


The highlighter generally analyzes stored text on the fly in order to highlight it, so the field being highlighted needs to be stored.

Please check in your schema whether _text_ is stored. If you are using a managed schema, _text_ may not be stored. Look for the following _text_ definition in managed-schema or schema.xml:

<field name="_text_" type="text_general" multiValued="true" indexed="true" stored="false"/>

stored="false" means that the contents of _text_ are not stored. If you set stored="true", the contents of _text_ will be stored and will be available for highlighting.
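If you want to check a schema file without reading through it by hand, something like the following might help. It is only a rough sketch in Python: the function name field_is_stored and the SAMPLE schema are made up for illustration, and it only looks at the stored attribute on the field element itself (if the attribute is missing, Solr falls back to the field type's setting, which this sketch does not resolve).

```python
import xml.etree.ElementTree as ET


def field_is_stored(schema_xml, field_name):
    """Return True/False for the field's own stored attribute.

    Returns None if the field is not defined or has no explicit
    stored attribute (hypothetical helper, not part of Solr).
    """
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        if field.get("name") == field_name:
            stored = field.get("stored")
            return None if stored is None else stored.lower() == "true"
    return None


# Made-up schema fragment containing the _text_ definition quoted above.
SAMPLE = """<schema name="example" version="1.6">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="_text_" type="text_general" multiValued="true" indexed="true" stored="false"/>
</schema>"""

print(field_is_stored(SAMPLE, "_text_"))
```

To check the schema your collection really uses, paste in the managed-schema content shown under Files in the Solr Admin UI rather than a copy from the local configsets directory.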

Note: After changing schema.xml or managed-schema,

  • you need to restart the Solr instance for the changes to take effect
  • the data needs to be reloaded (reindexed)
Shubhangi
  • I did this in all managed_schema files found but same result. The problem of course sits in front of the machine but I cannot figure out how to fix me – Aviator Aug 24 '17 at 17:09
  • @Aviator, did you restart Solr and reindex after changing the managed-schema file? (I have edited the answer, please check it.) – Shubhangi Aug 24 '17 at 17:23
  • Sorry Shubhangi, no joy after restarting. I changed the value in example/files/conf/managed-schema, server/solr/configsets/data_driven_schema_configs/conf/managed-schema and server/solr/configsets/basic_configs/conf/managed-schema, restarted and reindexed. – Aviator Aug 24 '17 at 18:21
  • Ohh.. Do you have access to Solr Admin UI ? Did you check files to see if changes are really done for collection? – Shubhangi Aug 24 '17 at 18:28
  • Whoops! It says stored="false", but why?!? – Aviator Aug 24 '17 at 18:30
  • You may copy managed-schema file content from Solr Admin UI files to local file managed-schema, then modify stored="true" and run /server/scripts/cloud-scripts/zkcli.sh -zkhost : -cmd putfile //configs//managed-schema Example: /usr/lib/ambari-infra/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost 192.168.1.17:2181 -cmd putfile /infra-solr/configs/TestCollection/managed-schema ~/config_files/managed-schema – Shubhangi Aug 24 '17 at 18:41
  • The above zkcli.sh command will modify managed-schema, after which you need to restart Solr and reindex. – Shubhangi Aug 24 '17 at 18:42
  • @Aviator , I am glad that issue is resolved. Please accept answer if it helped you. – Shubhangi Aug 25 '17 at 05:12