How to index special characters in Solr

Question

I have a list of special characters that need to be indexed. How can I include these characters in my Solr search? What configuration is needed in Solr's schema.xml file?

List of Characters:

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

Quick help would be appreciated. Thanks!

score 0 · Answer 1 · answered Jul 10 '18 at 08:02

You don't have to do anything special to index or query those characters, but you do have to decide how they should be handled.

If you use a tokenizer that splits on word boundaries, it may treat these special characters as separators between two tokens, and in that case the characters themselves are not indexed.

If you use a tokenizer that doesn't do anything special with those characters, they'll be available just the same as any other character. You'll need to escape them if your library doesn't do that for you - but that depends on how you're querying Solr.
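
As a rough illustration of doing that escaping yourself, here is a minimal Python sketch. The function name `escape_solr_term` and the constant `SOLR_SPECIAL` are made up for this example, and the character set is my assumption of what the standard query parser treats as syntax, so check it against the parser you actually use.

```python
# Characters assumed to have syntactic meaning in Solr's standard query
# parser. This set is an assumption for illustration; verify it against
# the query parser your deployment is configured with.
SOLR_SPECIAL = set('\\+-&|!(){}[]^"~*?:/')


def escape_solr_term(text: str) -> str:
    """Prefix every query-syntax character in `text` with a backslash."""
    return "".join("\\" + ch if ch in SOLR_SPECIAL else ch for ch in text)


if __name__ == "__main__":
    # The escaped value is what you would place inside the q parameter.
    print(escape_solr_term("C++ (beta)"))
```

Many client libraries ship a helper like this, so it is worth checking whether yours already does before writing your own.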

A string field does not tokenize its input, so any value keeps its special characters in one single token without being split further.
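
To make the difference concrete, here is a small Python sketch that contrasts the two behaviours. It does not call Solr: the regex is only a crude stand-in for a word-boundary tokenizer (Solr's real tokenizers follow more detailed rules), and both function names are hypothetical.

```python
import re
from typing import List


def word_boundary_tokens(text: str) -> List[str]:
    # Crude approximation of a word-boundary tokenizer: keep runs of
    # letters, digits and underscores, and drop everything in between.
    return re.findall(r"\w+", text)


def single_token(text: str) -> List[str]:
    # Approximation of an untokenized string field: the whole value is
    # kept as one token, special characters included.
    return [text]


if __name__ == "__main__":
    value = "C++ & C#"
    print(word_boundary_tokens(value))  # special characters are lost
    print(single_token(value))          # special characters survive
```

With the first approach a search for the special characters has nothing to match against, which is why the choice of field type and tokenizer matters more than any escaping setting.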

The requirement is "I should not escape any of those characters at the time of querying." The escaping should happen on the configuration side. Is that possible? — sanjeeda, Jul 19 '18 at 09:14
No. Characters have special meaning depending on what you want them to mean. The _user_ does not have to consider this, as you'll do the proper escaping in your middle layer when sending the query to Solr. — MatsLindh, Jul 19 '18 at 12:06
