-2

When I read the file in Python, it comes out full of odd special characters that I don't understand. How do I get Python to read the file the same way Notepad++ does?

The text file came from exporting an email from Lotus Notes 9 as Unicode text.

yoshiserry
  • Can you share some of the text that looks fine in Notepad++ but not otherwise? – g4ur4v Sep 16 '14 at 21:19
  • No, but I can tell you it ends up looking like a different language after I read it in using Python and write it to a file: 㤀㨀㐀㄀㨀㄀  ⬀㄀   ഀ ! Nul RecievedL: from Spool/local/ – yoshiserry Sep 16 '14 at 21:25

1 Answer

0

"Unicode" mode on Windows generally means UTF-16LE with a byte order mark (BOM). If you're on Python 2.x, open the file with `codecs.open(filename, encoding='utf-16')`, as described in the Unicode HOWTO section on reading Unicode data. If you're on 3.x, you can just use `open(filename, encoding='utf-16')`. In both cases the `utf-16` codec uses the BOM to work out the byte order and leaves it out of the decoded text.
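
As a rough sketch of the 3.x case, something like the following should work. The helper name `read_utf16`, the file name `export.txt` and the sample text are made up for illustration, and the sketch assumes the export really is UTF-16 with a BOM. The last lines show one likely explanation for the garbled comment above: UTF-16LE bytes paired up in the wrong order turn plain ASCII into CJK-looking characters.

```python
import codecs
import os
import tempfile


def read_utf16(path):
    """Return the contents of a UTF-16 text file that starts with a BOM."""
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    with open(path, encoding="utf-16") as f:
        return f.read()


# Build a small stand-in for the exported file (hypothetical content),
# laid out the way Windows "Unicode" text usually is: BOM + UTF-16LE.
sample = "Received: from Spool/local/\n9:41:1\n"
path = os.path.join(tempfile.mkdtemp(), "export.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + sample.encode("utf-16-le"))

text = read_utf16(path)
print(text)

# Decoding the same bytes with the wrong byte order gives mojibake.
garbled = "9:41:1".encode("utf-16-le").decode("utf-16-be")
print(garbled)
```

If the file turns out to have no BOM, naming the byte order explicitly (`encoding='utf-16-le'`) is the usual fallback.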

Writing it out again will depend on what encoding you're trying to write to.
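
For example, if the target is UTF-8, a minimal conversion might look like this on 3.x. The function name `convert_utf16_to_utf8` and the file names are hypothetical, and UTF-8 is only an assumption about what the downstream tool wants.

```python
import codecs
import os
import tempfile


def convert_utf16_to_utf8(src_path, dst_path):
    """Re-encode a UTF-16 (with BOM) text file as UTF-8 without a BOM."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src_path, encoding="utf-16", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Hypothetical input file standing in for the Lotus Notes export.
workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "export.txt")
dst_file = os.path.join(workdir, "export-utf8.txt")
with open(src_file, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + "Subject: hello\r\n".encode("utf-16-le"))

convert_utf16_to_utf8(src_file, dst_file)
with open(dst_file, "rb") as f:
    print(f.read())
```

Some Windows tools expect a BOM on UTF-8 files as well; in that case `encoding='utf-8-sig'` on the output side writes one.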

Peter DeGlopper
  • Thanks, Peter. What are text files normally encoded with? – yoshiserry Sep 16 '14 at 21:55
  • Totally depends on where they come from. `utf-8` is fairly common, since it's a superset of ASCII. But if you're looking to write the contents back out, you have to know what your tools are expecting. – Peter DeGlopper Sep 16 '14 at 21:57