0

I have tried to convert bytes from CP1252 to UTF-8, but without success.

For example, I have the byte[] 0xB5 (CP1252) and I want to convert it to the byte[] 0xC3, 0xA0 (UTF-8).

In other words, I want this: µ --> à.

Here is my code, which does not work:

public void convert(){
  try {
      byte[] cp1252 = new byte[]{(byte) 0xB5};
      byte[] utf8= new String(cp1252, "CP-1252").getBytes("UTF-8");
      // the utf8 array contains 0xC2, 0xB5, not 0xC3, 0xA0 as I expected
  } catch (Exception ex) {
      System.out.println(ex.getMessage());
  }
}
Sơn Đàm
  • 1
    any help from this http://stackoverflow.com/questions/12045581/encoding-cp-1252-as-utf-8 – Garry May 11 '15 at 08:22
  • 1
    You want to convert the `MICRO SIGN` into a `LATIN SMALL LETTER A WITH GRAVE`? This is not a normal `CP-1252` to `UTF-8` conversion. – SubOptimal May 11 '15 at 08:32
  • 1
    he probably wants to revert faulty encoding back to its correct state – specializt May 11 '15 at 08:38
  • 2
    @specializt Maybe. The requestor is the only one who could clarify this point. And `revert broken encoding` is very different from `convert from one codepage to another`. – SubOptimal May 11 '15 at 08:58
  • Yes, reverting a broken encoding is what I wanted to solve back then, and it has been solved. Just wanted to confirm with you guys. – Sơn Đàm Dec 20 '20 at 09:59

1 Answer

2

You should use "Cp1252" as the code page name instead of "CP-1252".

public void convert(){
    try {
        byte[] cp1252 = new byte[]{(byte) 0xB5};
        byte[] utf8= new String(cp1252, "Cp1252").getBytes("UTF-8");
    } catch (Exception ex) {
        System.out.println(ex.getMessage());
    }
}

Java supported encodings
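Whether a given name is accepted depends on the JVM, so it can help to ask the runtime directly. The following is a minimal sketch, not part of the original answer; the class name `CharsetNameCheck` and the helper `canonicalName` are made up for illustration. It uses `Charset.isSupported` and `Charset.forName` to show which canonical charset a label resolves to, or `null` if the label is unknown.

```java
import java.nio.charset.Charset;

public class CharsetNameCheck {
    // Hypothetical helper: returns the canonical charset name for a label,
    // or null if this JVM does not recognize the label.
    static String canonicalName(String label) {
        try {
            return Charset.isSupported(label) ? Charset.forName(label).name() : null;
        } catch (IllegalArgumentException ex) {
            // Thrown when the label is not a syntactically legal charset name.
            return null;
        }
    }

    public static void main(String[] args) {
        // Print how each candidate label is resolved on the current JVM.
        for (String label : new String[]{"Cp1252", "windows-1252", "CP-1252"}) {
            System.out.println(label + " -> " + canonicalName(label));
        }
    }
}
```

Running this on the target platform shows directly whether "CP-1252" is a known alias there, without relying on an exception at conversion time.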

As pointed out in the comments, the conversion you are asking for is not a code page 1252 conversion: in Cp1252, 0xB5 is µ (MICRO SIGN), not à, so the code above will not give you the result you seek.
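To make the byte values concrete, here is a small sketch (the class name `Cp1252Bytes` and its helpers are illustrative, not from the question). It decodes single Cp1252 bytes and re-encodes them as UTF-8: 0xB5 yields C2 B5, while the Cp1252 byte that actually produces C3 A0 is 0xE0.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp1252Bytes {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Formats bytes as space-separated upper-case hex, e.g. "C2 B5".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the input as Cp1252 and re-encodes the text as UTF-8.
    static byte[] cp1252ToUtf8(byte[] input) {
        return new String(input, CP1252).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xB5 is U+00B5 (MICRO SIGN) in Cp1252; its UTF-8 form is C2 B5.
        System.out.println(hex(cp1252ToUtf8(new byte[]{(byte) 0xB5}))); // prints "C2 B5"
        // 0xE0 is U+00E0 (a with grave) in Cp1252; its UTF-8 form is C3 A0.
        System.out.println(hex(cp1252ToUtf8(new byte[]{(byte) 0xE0}))); // prints "C3 A0"
    }
}
```

So the conversion in the question behaves correctly for Cp1252 input; getting C3 A0 out of 0xB5 would require the source bytes to be in some other encoding.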

If you run the following code, you will see that none of the encodings available in Java performs the conversion you want:

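    // Requires: import java.nio.charset.Charset;
    //           import java.nio.charset.StandardCharsets;
    byte[] u = new byte[]{(byte) 0xC3, (byte) 0xA0}; // "à" encoded as UTF-8
    String text = new String(u, StandardCharsets.UTF_8);

    String encoding = "";
    for (Charset cs : Charset.availableCharsets().values()) {
        if (!cs.canEncode()) {
            continue; // decode-only charsets would throw on getBytes
        }
        byte[] cp = text.getBytes(cs);
        // Look for a charset that encodes "à" as the single byte 0xB5
        if (cp.length == 1 && cp[0] == (byte) 0xB5) {
            encoding = cs.name();
            break;
        }
    }
    // Prints the name of the matching charset, or an empty line if none matches
    System.out.println(encoding);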
Dalija Prasnikar