I am having a problem handling strings from a WebVTT / SMPTE-TT file that contain Latin-1 Supplement characters (the 2-byte range 0080 to 00FF). In C I store them as unsigned chars, and when I try to print the string I get the hex value of such characters. For example, feelíng is printed as faxing.

I pass the same string to the Java layer via C++, where I use NewStringUTF to convert it into a Java String. But I get this error: JNI DETECTED ERROR IN APPLICATION: input is not valid Modified UTF-8: illegal continuation byte 0x6e. This error occurs specifically on Lollipop. On previous versions the character was printed as junk on the screen. This has already been reported as an Android bug, but the report says the error occurs with 4-byte Unicode characters. Can somebody please give any suggestions? I am really stuck with this problem...

Thennarasu

1 Answer

As the name suggests, NewStringUTF doesn't use Latin1; it expects Modified UTF-8.
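The error message itself fits this explanation. In Latin1, í is the single byte 0xED, and in UTF-8 a byte of the form 1110xxxx announces a three-byte sequence, so the decoder expects a continuation byte (10xxxxxx) next and instead finds 'n' (0x6e). Below is a minimal sketch of that structural check. The function names are made up for illustration, and it only looks at the lead/continuation pattern; it is not a full Modified UTF-8 validator and not what the runtime actually runs.

```cpp
#include <cassert>
#include <cstddef>

// Number of continuation bytes a UTF-8 lead byte announces,
// or -1 if the byte cannot start a sequence.
int continuation_count(unsigned char lead) {
    if (lead < 0x80) return 0;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 1;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 2;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 3;  // 11110xxx
    return -1;                            // stray continuation or invalid byte
}

// Continuation bytes have the form 10xxxxxx.
bool is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Index of the first byte that breaks the lead/continuation pattern
// (or the position where a continuation byte is missing), -1 if none.
long first_bad_byte(const unsigned char* s, std::size_t len) {
    assert(s != nullptr || len == 0);
    std::size_t i = 0;
    while (i < len) {
        int n = continuation_count(s[i]);
        if (n < 0) return static_cast<long>(i);
        for (int k = 1; k <= n; ++k) {
            if (i + k >= len || !is_continuation(s[i + k]))
                return static_cast<long>(i + k);
        }
        i += n + 1;
    }
    return -1;
}
```

Run on the Latin1 bytes of feelíng, the check stops at the 'n' that follows 0xED, which is the byte named in the error message.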

You have the following options:

  • convert strings from Latin1 to UTF-8 in your C++ code

  • exchange Latin1-encoded byte[]s instead of Strings and decode them on the Java side

  • convert your string to an array of jchars manually and use NewString:

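    // Untested sketch; needs <jni.h>. Assumes env is a valid JNIEnv* (C++ API),
    // my_string holds Latin1 bytes, and len (a jsize) is its length in bytes.
    jchar* tmp = new jchar[len];
    for (jsize i = 0; i < len; i++) {
        // Latin1 bytes map 1:1 onto U+0000..U+00FF; the cast avoids sign extension.
        tmp[i] = (unsigned char) my_string[i];
    }
    jstring result = env->NewString(tmp, len);  // NewString copies the buffer
    delete[] tmp;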
    
  • And finally, a solution that will only work on Android, and probably not on all versions: there's an Android-only JNI function NewStringLatin1, which does exactly what you want.
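For the first option, here is a minimal sketch of a Latin1 to UTF-8 conversion. The name latin1_to_utf8 is hypothetical, and the sketch assumes the input really is Latin1 with no embedded NUL bytes. Under that assumption the output is also valid Modified UTF-8, since the two encodings only differ for U+0000 and for characters outside the BMP, so its c_str() can be handed to NewStringUTF.

```cpp
#include <cassert>
#include <string>

// Convert ISO-8859-1 (Latin1) bytes to UTF-8.
// Each Latin1 byte is the code point U+0000..U+00FF of the same value.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is encoded unchanged.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF needs two bytes: 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    assert(out.size() <= in.size() * 2);
    return out;
}
```

With this, the byte 0xED for í becomes the pair 0xC3 0xAD, and the rest of the string is left as it was.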

Karol S