
I need to create a file containing all UTF-8 characters using Java, or build a string of all UTF-8 characters at runtime. I tried to build the string with the code below.

    char[] data = new char[65536];
    for (int index = 0; index < 65536; index++) {
        data[index] = (char) index;
    }

But when I print these characters as a string on the console, I see many '?' symbols. I am not sure whether this is the right way to generate the UTF-8 character set. I have read that a UTF-8 character can take 1 to 4 bytes, but a Java char is 2 bytes, so I think I am doing something wrong here. I went through many links but could not find an appropriate answer. Can someone help with this?
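
For reference, here is a minimal sketch of one way to enumerate every Unicode code point rather than every 16-bit `char` value. It assumes the goal is "all Unicode scalar values" (U+0000 to U+10FFFF, skipping the surrogate range U+D800 to U+DFFF, which cannot be encoded in UTF-8 on its own). The class and method names (`AllCodePoints`, `allScalarValues`) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class AllCodePoints {

    /** Builds a string holding every Unicode scalar value (surrogates skipped). */
    static String allScalarValues() {
        StringBuilder sb = new StringBuilder();
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            // U+D800..U+DFFF are surrogate code points, not characters; skip them.
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                continue;
            }
            // appendCodePoint emits one char for the BMP and a surrogate pair above it.
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String all = allScalarValues();
        // Encoding to UTF-8 is a separate step from building the string.
        byte[] utf8 = all.getBytes(StandardCharsets.UTF_8);
        System.out.println("code points:  " + all.codePointCount(0, all.length()));
        System.out.println("UTF-16 chars: " + all.length());
        System.out.println("UTF-8 bytes:  " + utf8.length);
    }
}
```

The loop runs over `int` code points because a single `char` only covers U+0000 to U+FFFF; code points above that are stored in a `String` as two `char` values. Many of the generated code points are unassigned or non-printing, so they will not display meaningfully in any console.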

user2702700
  • There are no such things as "UTF-8 characters". UTF-8 is an encoding for Unicode, which contains well over 100,000 characters. See https://en.wikipedia.org/wiki/Unicode#Architecture_and_terminology – Usagi Miyamoto Aug 09 '17 at 04:28
  • As for the `?` characters: your console and/or its font is not capable of showing all Unicode characters, and it displays the missing ones as `?`. – Usagi Miyamoto Aug 09 '17 at 04:30
  • Java uses Unicode internally (a `char` is a UTF-16 code unit). Strings are converted to UTF-8 when you write them to a stream that uses the UTF-8 encoding. – vikingsteve Oct 27 '17 at 08:32
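
To illustrate the points in the comments above, here is a small sketch of writing text as UTF-8 explicitly instead of relying on the platform default encoding. The class name `Utf8Output`, the helper `writeUtf8`, the sample string, and the temp-file name are all assumptions chosen for the example. Whether the console line shows real glyphs or `?` still depends on the terminal's encoding and font.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8Output {

    /** Writes the text to the file as UTF-8, independent of the platform default. */
    static void writeUtf8(Path file, String text) throws IOException {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample with 1-, 2-, 3- and 4-byte UTF-8 sequences: A, e-acute, euro sign, U+1F600.
        String sample = "A \u00e9 \u20ac " + new String(Character.toChars(0x1F600));

        Path out = Files.createTempFile("utf8-sample", ".txt");
        writeUtf8(out, sample);
        System.out.println(Files.size(out) + " bytes written to " + out);

        // Wrap System.out so the bytes sent to the console are UTF-8 as well.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(sample);
    }
}
```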
