
Given a text file containing non-Unicode text, I am able to detect its charset as code page 1251 (Windows-1251). Now I would like to convert it to Unicode.

byte[] bytes1251 = Encoding.GetEncoding(1251).GetBytes(File.ReadAllText("sampleNU.txt"));
String str = Encoding.UTF8.GetString(bytes1251);

This doesn't work.

Is this the right way to convert non-Unicode text to Unicode?

After trying the suggested approach on an RTF file, I get the dialog below when I try to open the output RTF file. Please let me know what to do, because selecting Unicode in the dialog doesn't make the file readable or give the expected text.

[Screenshot: the dialog shown when opening the output RTF file]

John

1 Answer

// load as charset 1251
string text = File.ReadAllText("sampleNU.txt", Encoding.GetEncoding(1251));

// save as Unicode
File.WriteAllText("sampleU.txt", text, Encoding.Unicode);
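For context on why the attempt in the question fails: `File.ReadAllText(path)` without an encoding argument looks for a byte order mark and otherwise assumes UTF-8, so the 1251 bytes are already mis-decoded before `GetBytes` is called. A .NET `string` is UTF-16 internally, so the only decisions to make are which encoding to decode the file with and which to write it with. The following is a minimal sketch of the same conversion at the byte level, not a definitive implementation. The file names `demo1251.txt`, `demoU.txt`, and `demoU-noBom.txt` are made up for illustration, and the `RegisterProvider` call assumes .NET Core or .NET 5+ (on .NET Framework, code page 1251 is available without it).

```csharp
using System;
using System.IO;
using System.Text;

static class Cp1251ToUnicode
{
    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // such as 1251 must be registered before GetEncoding(1251) works.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1251 = Encoding.GetEncoding(1251);

        // Create a small 1251-encoded sample file (hypothetical name).
        // The escapes spell a short Cyrillic word.
        string sample = "\u041F\u0440\u0438\u0432\u0435\u0442";
        File.WriteAllBytes("demo1251.txt", cp1251.GetBytes(sample));

        // Decode the raw bytes with the encoding the file was written in.
        byte[] raw = File.ReadAllBytes("demo1251.txt");
        string text = cp1251.GetString(raw);

        // Write as UTF-16 LE; WriteAllText emits the byte order mark.
        File.WriteAllText("demoU.txt", text, Encoding.Unicode);

        // Alternative: convert bytes directly. Encoding.Convert adds no
        // byte order mark, so the output here starts with the text itself.
        byte[] utf16 = Encoding.Convert(cp1251, Encoding.Unicode, raw);
        File.WriteAllBytes("demoU-noBom.txt", utf16);

        Console.WriteLine(text == sample); // True
    }
}
```

Whether the output should carry a byte order mark depends on the program that will read it; `Encoding.UTF8` can be used in place of `Encoding.Unicode` if UTF-8 output is wanted.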
Alexander Petrov
  • Thank you for your prompt response! When I try your code with RTF file, I get the attached dialog. Is there anything else that I need to do? – John Aug 13 '16 at 13:57
  • @Holly - rtf ([rich text format](https://en.wikipedia.org/wiki/Rich_Text_Format)) is not txt (plain text). What do you really want to do? – Alexander Petrov Aug 13 '16 at 14:07
  • @Holly: You should also say what's creating that dialog - we don't know what application is trying to open the file. – Jon Skeet Aug 13 '16 at 14:08
  • @AlexanderPetrov Sorry, I mentioned a txt file earlier. I need to convert a non-Unicode RTF file to a Unicode RTF file. – John Aug 13 '16 at 14:11
  • @JonSkeet Thank you Jon! MS Word creates that dialog when I open the output rtf file. – John Aug 13 '16 at 14:11
  • @Holly: Okay, so if you specify that it's Unicode in the dialog box, does it then do the right thing? That would at least suggest the text has been converted properly. I don't know what rules Word uses to determine whether or not to show the dialog... – Jon Skeet Aug 13 '16 at 14:13
  • @JonSkeet Thank you Jon! The bottom section of the dialog shows a preview of the text. That's what the output looks like. Not sure how to go about it. – John Aug 13 '16 at 15:38
  • @Holly: Well that looks like valid RTF to me. It's not clear what you'd expect it to be. – Jon Skeet Aug 13 '16 at 23:22
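
A note on the RTF case discussed in the comments above, offered as a hedged sketch rather than a confirmed fix. An RTF file is normally plain ASCII: characters from code page 1251 are usually stored as hex escapes such as `\'cf` under an `\ansicpg1251` header, and Unicode characters are written as `\uN` followed by a fallback character. Re-saving the whole file as UTF-16 leaves those escapes unchanged and probably stops Word from recognizing the file as RTF, which would explain the dialog and the raw markup in the preview. One possible approach, assuming single-byte escapes only, is to rewrite each hex escape as a `\uN?` escape and keep the file itself in ASCII. The helper name `ConvertHexEscapes` is hypothetical, and the regex deliberately ignores edge cases such as an escaped backslash before the apostrophe, double-byte code pages, and per-font charsets.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

static class RtfHexToUnicode
{
    // Hypothetical helper: replaces \'xx escapes with \uN? escapes,
    // decoding each byte with the given single-byte code page.
    static string ConvertHexEscapes(string rtf, Encoding codePage)
    {
        return Regex.Replace(rtf, @"\\'([0-9a-fA-F]{2})", m =>
        {
            byte b = Convert.ToByte(m.Groups[1].Value, 16);
            char c = codePage.GetString(new[] { b })[0];

            // Leave ASCII-range escapes untouched.
            if (c < 0x80) return m.Value;

            // RTF expects a signed 16-bit decimal, then a fallback character.
            short n = unchecked((short)c);
            return "\\u" + n.ToString(CultureInfo.InvariantCulture) + "?";
        });
    }

    static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where code page 1251 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1251 = Encoding.GetEncoding(1251);

        string rtf = @"{\rtf1\ansi\ansicpg1251 \'cf\'f0\'e8}";
        Console.WriteLine(ConvertHexEscapes(rtf, cp1251));
        // prints {\rtf1\ansi\ansicpg1251 \u1055?\u1088?\u1080?}
    }
}
```

For real documents, a dedicated RTF parser is likely a safer choice than a regular expression; the sketch only illustrates the difference between the file's byte encoding and the character escapes inside the RTF markup.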