Recently I had a strange issue with the Cp500 (EBCDIC) encoding during a transformation from bytes to String and then back from String to bytes.
The issue is that one specific byte, 0x25 (LINE FEED, LF), comes back from this round trip as 0x15 (NEW LINE, NEL).
Here is the code that demonstrates this:
byte[] b25 = { 0x25 };
byte[] b4E = { 0x4E };
System.out.printf("\n0x25 in hex : <0x%02X>", b25[0]);
System.out.printf("\n0x4E in hex : <0x%02X>", b4E[0]);
String stringB25 = new String(b25, "Cp500");
String stringB4E = new String(b4E, "Cp500");
System.out.printf("\nOther way, 0x25 in hex : <0x%02X>", stringB25.getBytes("Cp500")[0]);
System.out.printf("\nOther way, 0x4E in hex : <0x%02X>", stringB4E.getBytes("Cp500")[0]);
Output:
0x25 in hex : <0x25>
0x4E in hex : <0x4E>
Other way, 0x25 in hex : <0x15>
Other way, 0x4E in hex : <0x4E>
To understand this behavior, I looked at the IBM500.java class and saw that both the 0x15 and 0x25 bytes map to the "\n" character.
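The mapping can also be checked without reading the charset source. Below is a minimal sketch that decodes each byte on its own and prints the resulting code point; the class name Cp500NewlineCheck and the helper decodeOne are made up for this illustration, and it assumes the JRE ships the Cp500 charset.

```java
import java.nio.charset.Charset;

public class Cp500NewlineCheck {
    // Hypothetical helper: decodes a single byte with the given charset
    // and returns the first UTF-16 code unit of the resulting String.
    static int decodeOne(int value, Charset cs) {
        return new String(new byte[] { (byte) value }, cs).charAt(0);
    }

    public static void main(String[] args) {
        // Assumes Cp500 is available in this JRE.
        Charset cp500 = Charset.forName("Cp500");
        System.out.printf("0x15 decodes to U+%04X%n", decodeOne(0x15, cp500));
        System.out.printf("0x25 decodes to U+%04X%n", decodeOne(0x25, cp500));
    }
}
```

If both lines print the same code point, the decoder is collapsing the two bytes into one character, and the encoder can only pick one of them on the way back.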
What's the reason behind that?
Ultimately, is there a way to keep the input bytes unchanged when decoding them to a String and encoding that String back to bytes?
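To see how far the problem extends, here is a rough sketch that runs the same decode and encode round trip over all 256 byte values and lists the ones that change. The class name RoundTripScan and the method lossyBytes are hypothetical names for this illustration, and it again assumes Cp500 is available in the JRE.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class RoundTripScan {
    // Hypothetical helper: returns every byte value (0..255) that comes back
    // different after being decoded to a String and encoded again with cs.
    static List<Integer> lossyBytes(Charset cs) {
        List<Integer> lossy = new ArrayList<>();
        for (int v = 0; v < 256; v++) {
            byte[] in = { (byte) v };
            byte[] out = new String(in, cs).getBytes(cs);
            if (out.length != 1 || out[0] != in[0]) {
                lossy.add(v);
            }
        }
        return lossy;
    }

    public static void main(String[] args) {
        // Assumes Cp500 is available in this JRE.
        for (int v : lossyBytes(Charset.forName("Cp500"))) {
            System.out.printf("0x%02X does not round-trip%n", v);
        }
    }
}
```

As a sanity check, passing StandardCharsets.ISO_8859_1 to lossyBytes should return an empty list, since that charset maps each of the 256 byte values to its own character.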