I am using a Solr query to fetch emails, and the query returns the keyword with garbled characters. The keyword I get is ( ±êææ³ñ¶¶¿¼¶ ), but the expected keyword is ( ąęććłńśśżźś ).
-
That seems like an encoding problem? Are you sure your data is in UTF-8? What is the configuration for the field? – MatsLindh Jun 18 '20 at 08:01
-
How can I check the UTF-8 configuration? – Emily Jun 18 '20 at 08:50
-
Start by checking what data is being returned in the Solr Admin's query interface - that will tell you if the problem is with how your application is processing data _after_ being returned from Solr, or if it's borked already when putting it into Solr. If it's borked in Solr as well, it might be because of how it's being submitted to Solr, or it might be because of how Solr is processing it. – MatsLindh Jun 18 '20 at 09:05
-
Yes, I have checked the response from the Solr query. The problem is already in the response, and it occurs for Polish characters only. – Emily Jun 18 '20 at 10:42
-
If the characters are fubar in the admin interface as well, then they're being indexed in the wrong encoding (i.e. submitted as something that's not UTF-8) or they're being manipulated by an update processor you have defined for your field. The first alternative is the most likely one. Exactly why the encoding is wrong depends on how you're indexing data into Solr and how you're handling your encoding in that source. – MatsLindh Jun 18 '20 at 12:19
-
Try posting a new document to Solr straight from the command line and see how it is stored: `curl -X POST --data '[{"id":"some-id","somefield_txt":"text in polish"}]' 'http://localhost:8983/solr/your-core/update?commit=true'` – Hector Correa Jun 22 '20 at 18:45
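The same test post can be sketched from Java with the request encoding stated explicitly, so the bytes sent to Solr are known to be UTF-8. This is only an illustration: the class and method names are hypothetical, the URL, core name and field name are the placeholders from the curl command above, and it assumes Java 11 or later for `java.net.http`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SolrUtf8Post {
    // Builds a JSON update request whose body is encoded as UTF-8
    // and whose Content-Type header says so.
    static HttpRequest buildUpdateRequest(String updateUrl, String jsonDocs) {
        return HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "application/json; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(jsonDocs, StandardCharsets.UTF_8))
                .build();
    }

    // Sends the request and returns the HTTP status code (needs a running Solr).
    static int post(HttpRequest request) throws Exception {
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Polish letters written as Unicode escapes so the source file encoding does not matter.
        String json = "[{\"id\":\"some-id\",\"somefield_txt\":\"\u0105\u0119\u0107\u0142\u0144\u015B\u017C\u017A\"}]";
        HttpRequest request =
                buildUpdateRequest("http://localhost:8983/solr/your-core/update?commit=true", json);
        System.out.println(request.method() + " " + request.uri());
        // Call post(request) against a running Solr instance to index the document.
    }
}
```

If a document posted this way comes back correctly in the Solr Admin query interface, the original indexing path is the likely place where the wrong encoding is applied.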
1 Answer
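The problem is with encoding. Wherever you extract the text, decode it with the charset it was actually written in. You can declare the encoding in the metadata, or you can convert the text to the encoding you need. Note that `getBytes()` without an argument uses the platform default charset, so pass the charset the text was wrongly decoded with; the garbled characters in the question are consistent with ISO-8859-1. For example:
new String(targetString.getBytes(StandardCharsets.ISO_8859_1), "ISO-8859-2")
or
new String(targetString.getBytes(StandardCharsets.ISO_8859_1), "UTF-8")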
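As a minimal sketch of that conversion, assuming the garbled text in the question is ISO-8859-2 data that was decoded as ISO-8859-1 (which is what the characters shown suggest), the original bytes can be recovered and decoded again with the right charset. The class and method names are hypothetical, and the strings are written as Unicode escapes so the example does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Assumption: the input is ISO-8859-2 text that was mis-decoded as ISO-8859-1.
    static String repair(String garbled) {
        // ISO-8859-1 maps each char U+0000..U+00FF to the byte with the same value,
        // so this step recovers the original bytes without loss.
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        // Decode the recovered bytes with the charset they were written in.
        return new String(raw, Charset.forName("ISO-8859-2"));
    }

    public static void main(String[] args) {
        // The garbled keyword from the question: ±êææ³ñ¶¶¿¼¶
        String garbled = "\u00B1\u00EA\u00E6\u00E6\u00B3\u00F1\u00B6\u00B6\u00BF\u00BC\u00B6";
        // The required keyword from the question: ąęććłńśśżźś
        String expected = "\u0105\u0119\u0107\u0107\u0142\u0144\u015B\u015B\u017C\u017A\u015B";
        System.out.println(repair(garbled).equals(expected)); // prints "true"
    }
}
```

This only repairs text after the fact. If the data is already wrong inside Solr, the cleaner fix is to decode the source with the correct charset before indexing and then reindex.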

Victor Marcus