
I have written a method to return a string containing Chinese characters.

public String printChineseMenu() throws UnsupportedEncodingException {
   StringBuffer buffer = new StringBuffer();
   buffer.append(chineseStringFromDb);                 // placeholder for the Chinese string returned from the DB; the characters appear in SQL
   System.out.println(buffer);                         // they appear as question marks
   PrintStream out = new PrintStream(System.out, true, "UTF-8");
   out.println(buffer);                                // Chinese characters appear

   return (buffer.toString());
}

Is there a better type than StringBuffer for storing and returning a string of Chinese characters?

Duncan Jones
bouncingHippo
  • Please don't use StringBuffer, it was replaced by StringBuilder ten years ago. – Peter Lawrey May 13 '14 at 14:09
  • It's always nice to include code that compiles in your question. What you've written is broken in a few ways. Try to produce a concise example that actually demonstrates the problem: see http://stackoverflow.com/help/mcve – Duncan Jones May 13 '14 at 14:10
  • @Peter does StringBuilder have special encoding to retain Chinese characters? – bouncingHippo May 13 '14 at 14:10
  • @bouncingHippo: No, it doesn't need to. – Jon Skeet May 13 '14 at 14:10
  • @ginz Please don't edit code in someone's question. [Your edit](http://stackoverflow.com/review/suggested-edits/4800993) should have been rejected. – Duncan Jones May 13 '14 at 14:11
  • @bouncingHippo StringBuilder and StringBuffer and String all use UTF-16 encoding which supports all Chinese characters. – Peter Lawrey May 13 '14 at 14:13
  • @Duncan this edit fixed the code in a way that doesn't change its meaning. Why is it incorrect? – Dmitry Ginzburg May 13 '14 at 14:14
  • @ginz See [When is it appropriate to edit someone else's code?](http://meta.stackexchange.com/questions/101583/when-is-it-appropriate-to-edit-someone-elses-code). The general consensus here is that we leave the author's code alone, except for indentation. – Duncan Jones May 13 '14 at 14:17
  • @Duncan the author's question was about storing the character, not about the correctness of the code above. So, this code might (should) be corrected. – Dmitry Ginzburg May 13 '14 at 14:18

2 Answers

The problem here isn't StringBuffer - it's simply the encoding used by System.out. You'd find the exact same behaviour when printing the string directly, without using a StringBuffer.

StringBuffer (and its more modern, non-thread-safe equivalent, StringBuilder, which you should use instead) don't care about encoding themselves - they just use sequences of UTF-16 code units. They will correctly preserve all Unicode data. The same is true for String.

Your method should almost certainly just return a String - but if you don't need to do any "building" with the string (appending other pieces) then there's no point in using either StringBuffer or StringBuilder. If you do need to build up the result string from multiple strings, you should be fine to use either of them, and just return the result of toString() as you are already doing (although the brackets around the return value are irrelevant; return isn't a method).
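As a minimal sketch of that advice, the method below builds its result with a StringBuilder and returns a plain String. The class name, the method name, and the escaped literals (stand-ins for values that would really come from the database) are assumptions made for illustration.

```java
public class MenuBuilder {
    // Joins menu items into one String, one item per line.
    // The items are hypothetical stand-ins for values read from the database.
    static String buildMenu(String... items) {
        StringBuilder builder = new StringBuilder();
        for (String item : items) {
            builder.append(item).append('\n');
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Two illustrative items written as Unicode escapes, so the source file's
        // own encoding cannot affect the test.
        String menu = buildMenu("\u8336", "\u5496\u5561");
        // Check the content in code rather than trusting what the console shows.
        System.out.println(menu.equals("\u8336\n\u5496\u5561\n")); // prints "true"
    }
}
```

The comparison is done on the String itself, so it holds regardless of how the console renders the characters.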

Consoles can often be misleading when it comes to string data. When in doubt, print out the sequence of UTF-16 code units one at a time, and then work out what that means. That way there's no danger of encodings or unprintable characters becoming an issue.
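One way to do that is sketched below: each UTF-16 code unit is printed as a four-digit hex value, so the output is pure ASCII and cannot be mangled by the console. The class and method names are illustrative, and the escaped literal stands in for the string read from the database.

```java
public class CodeUnitDump {
    // Formats each UTF-16 code unit of the text as a 4-digit hex value,
    // separated by spaces.
    static String dump(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("%04x", (int) text.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative stand-in for the string returned from the database.
        String menu = "\u4e2d\u6587";
        System.out.println(dump(menu)); // prints "4e2d 6587"
    }
}
```

If the hex values match the expected code points, the data is intact and any question marks are purely a display or output-encoding issue.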

Jon Skeet

Your best option is to return a String: it is immutable, and it holds a whole sequence of characters rather than a single one.

When you print text, you need to write the data using the encoding that whatever reads it expects. For example, if you redirect the output to a file and your reader expects UTF-8, then UTF-8 is what you should write.

The problem with System.out used alone is that it doesn't write chars; it writes bytes, and it assumes an encoding that might not be what you need.
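The effect of the encoding can be seen with a small sketch that captures the bytes a PrintStream produces for the same string under two charsets. The class name, the helper name, and the sample text are assumptions for illustration; US-ASCII is used only as an example of an encoding that cannot represent Chinese characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class EncodedOutput {
    // Returns the bytes a PrintStream writes for the text under the named charset.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Illustrative stand-in for the string returned from the database.
        String menu = "\u4e2d\u6587";
        // UTF-8 needs three bytes for each of these two characters.
        System.out.println(encode(menu, "UTF-8").length);    // prints "6"
        // US-ASCII cannot represent them, so each one is written as '?'.
        System.out.println(encode(menu, "US-ASCII").length); // prints "2"
    }
}
```

The String is identical in both calls; only the bytes differ, which is why question marks on the console do not mean the data in the StringBuffer or String was lost.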

Peter Lawrey