I rarely post but read Stack Overflow daily. Today an issue has come up that has me a bit stumped, and I'm hoping to get some help.

Main problem: For Spanish-speaking users of our application, the JSON response returned from a POST call includes strange square characters.

Example response:

{
   ...,
    "designation": "Ãblabla Blabllabla",
   ...,
    "objects": [ ... ]

}

Initially my thought was that there was some encoding problem. I tried setting the encoding both with the @Produces annotation at the method level and in application.properties, based on this post: Spring Boot encoding / special characters

In the pom.xml file, the encoding is set to UTF-8:

        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>

Neither of the things I've tried has had any effect, and I'm starting to suspect that I'm barking up the wrong tree here. I'm not sure what other information to add at this point; please let me know if anything is missing and I'll try to add more detail.

kensai01
  • Your example contains no "strange square characters". Is it supposed to? – access violation Aug 16 '22 at 02:16
  • https://stackoverflow.com/a/36688383 – Rana_S Aug 16 '22 at 02:25
  • @accessviolation yes it does. You'll see them if you click Edit. One of Stack Overflow's charming quirks. – Dawood ibn Kareem Aug 16 '22 at 03:30
  • @accessviolation I see what you mean. I'm not sure how to stop Stack Overflow from filtering it out for display. As Dawood ibn Kareem suggested, when I hit Edit I do see the square character following the Ã character. – kensai01 Aug 16 '22 at 12:25
  • A possible explanation is that the characters are fine, but the device / browser / console / whatever on which you are trying to display them doesn't have a glyph in its display font that matches the character. The "strange square character" could be the glyph it substitutes. – Stephen C Aug 16 '22 at 12:41
  • The A with the tilde above it should actually be this character: Á. However, it seems Postman is interpreting it as Ã followed by a square (missing-glyph substitute). – kensai01 Aug 16 '22 at 12:50

1 Answer

I was able to figure out the issue. Had to read up on encoding and charsets a bit to get the idea but this site shows what was happening.

https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%C3%81&mode=char

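The character Á is encoded in UTF-8 as the two bytes 0xC3 0x81. Those bytes were being interpreted as Latin-1 (ISO-8859-1) characters instead: 0xC3 is Ã and 0x81 is a non-printable control character, so the value came across in Postman as Ã followed by the 'unknown glyph' square.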
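
To see the mechanism in isolation, here is a minimal sketch that reproduces it. The class and method names (MojibakeDemo, misdecode) are made up for illustration and are not from the application in question; it simply encodes Á as UTF-8 and then deliberately decodes those bytes as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original application.
public class MojibakeDemo {

    // Encodes the text as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = misdecode("\u00C1"); // U+00C1 is Á

        // Print each resulting code point instead of the glyph,
        // so the output does not depend on the console font.
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
        // prints:
        // U+00C3
        // U+0081
    }
}
```

U+00C3 is Ã, and U+0081 is a control character with no visible glyph, which is presumably what shows up as the square.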

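The fix was to turn the garbled string back into its original bytes using the ISO-8859-1 charset and then decode those bytes as UTF-8. Passing UTF-8 explicitly avoids relying on the platform default charset.

// requires: import java.nio.charset.StandardCharsets;
new String(designation.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);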
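
For anyone wanting to try the repair outside the application, here is a small self-contained sketch of the same round trip. The names MojibakeRepair and repair are hypothetical, and it assumes the input really is UTF-8 text that was mis-decoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original application.
public class MojibakeRepair {

    // Assumes `garbled` is UTF-8 text that was mis-decoded as ISO-8859-1.
    // ISO-8859-1 maps each char U+0000..U+00FF to the byte of the same value,
    // so getBytes recovers the original UTF-8 bytes exactly.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u00C3\u0081" is Ã followed by the invisible control character.
        String fixed = repair("\u00C3\u0081blabla");
        System.out.println(fixed.equals("\u00C1blabla")); // prints: true
    }
}
```

This should only be applied to values known to be mis-decoded. Running it on a string that is already correct would corrupt it, so fixing the charset where the data is first read is probably the cleaner long-term option.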
kensai01