12

In the console, when I try to output Russian characters, I get ???????????????

Does anyone know why?

I tried writing to a file, and the result is the same.

For example:

f=open('tets.txt','w')
f.write('some russian text')
f.close()

The file then contains: ?????????????????????????/

or

p="some russian text"
print p
?????????????

In addition, Notepad doesn't let me save a file with Russian letters. I get this message:

This file contains characters in Unicode format which will be lost if you save this file as an ANSI encoded text file. To keep the Unicode information, click Cancel below and then select one of the Unicode options from the Encoding drop down list. Continue?

How can I configure my system so that I don't have these problems?

Alex Kulinkovich
Pol
  • 1
    This question's title is rather poorly chosen! – Carl Smotricz Jul 07 '10 at 21:03
  • Is it really `?` or rather `�`? – Gumbo Jul 07 '10 at 21:04
  • @Gumbo: the `?` is used when the target isn't able to *store* the given character because it's outside the charset range, e.g. databases and output (file/stdout/etc.) writers. The `�` is used when the target is able to *display* the given character but doesn't because it's outside the range of the charset it is instructed to use, e.g. web browsers. All in all, it makes sense that `?` is used here. – BalusC Jul 08 '10 at 13:06
  • 2
    @Carl - and I was going to suggest that the poster just make them tragic with undercurrents of brooding and mysterious. – Martin Beckett Jul 08 '10 at 14:37
  • @BalusC: it should rather throw an exception. @user375373: Notepad gives you the correct hints (both for using Notepad and for programming): choose a Unicode encoding such as UTF-16 (also called "Unicode" by Microsoft) or UTF-8. – Philipp Jul 08 '10 at 14:46
  • 1
    @Philipp: I wholeheartedly agree with that, but unfortunately the truth is different in many places in many languages. Those "unknown" characters will simply be trashed or replaced. The target doesn't know "better". – BalusC Jul 08 '10 at 14:54
  • It seems that this user-somenumbers doesn't know how to accept answers or ignores it. – hgulyan Jul 08 '10 at 15:54
  • If one of these answers helped you solve your problem, please click on the check mark next to it so that the author gets proper credit. – bta Jul 08 '10 at 23:17
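
The `?` substitution described in the comments above can be reproduced in a few lines. This is only a sketch of one way such question marks can arise, not a diagnosis of the asker's setup: encoding Cyrillic text into a charset without Cyrillic letters, using the `replace` error handler, yields one `?` per character. The variable names are illustrative.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

text = "привет"  # a unicode string (unicode_literals makes this true on Python 2 too)

# ASCII has no Cyrillic letters; with errors="replace" each character
# that cannot be represented is replaced by "?".
lossy = text.encode("ascii", "replace")
print(lossy)

# A Unicode encoding such as UTF-8 can represent every character,
# so decoding the bytes gives the original text back.
safe = text.encode("utf-8")
print(safe.decode("utf-8") == text)
```

Without the `replace` handler the same `encode("ascii")` call raises `UnicodeEncodeError` instead, which is the behaviour Philipp argues for above.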

5 Answers

19

Here is a worked-out example, please read the comments:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
# The above encoding declaration is required and the file must be saved as UTF-8

from __future__ import with_statement   # Not required in Python 2.6 any more

import codecs

p = u"абвгдежзийкл"  # note the 'u' prefix

print p   # probably won't work on Windows due to a complex issue

with codecs.open("tets.txt", "w", "utf-16") as stream:   # or utf-8
    stream.write(p + u"\n")

# Now you should have a file called "tets.txt" that can be opened with Notepad or any other editor
Philipp
  • I get the error: SyntaxError: Non-ASCII character '\xff' in file 'my python file', but no encoding declared – Pol Jul 08 '10 at 15:18
  • I did declare the encoding, that's what the second line is for. And there is no character '\xff' (which is `ÿ`) in the file. Are you sure that you did everything correctly, and do all characters show up correctly in Notepad? – Philipp Jul 08 '10 at 16:14
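
One possible explanation for the `'\xff'` in that error, which the thread does not confirm: Notepad's "Unicode" option saves UTF-16 with a byte order mark, and the little-endian mark is the two bytes `FF FE`, so the first byte of such a file is `\xff`. The sketch below checks a byte string for the common marks; `sniff_bom` is a hypothetical helper, and it deliberately ignores UTF-32.

```python
import codecs


def sniff_bom(raw):
    """Return the encoding whose byte order mark starts `raw`, or None.

    Hypothetical helper for illustration; UTF-32 marks are not checked.
    """
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in boms:
        if raw.startswith(bom):
            return name
    return None


# Simulate a source file saved as little-endian UTF-16 with a BOM.
utf16_source = codecs.BOM_UTF16_LE + u"p = u'\u0430\u0431\u0432'\n".encode("utf-16-le")

print(sniff_bom(utf16_source))
print(sniff_bom(b"p = 'abc'\n"))
```

To apply it to a real file, read the first few bytes in binary mode and pass them to `sniff_bom`. If the result is a UTF-16 variant, re-saving the script as UTF-8 would be the thing to try, since the encoding declaration in the answer says `utf-8`.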
9

Try opening the file with the codecs module. You need to

import codecs

and then

writefile = codecs.open('write.txt', 'w', 'utf-8')
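
A slightly fuller sketch of the same idea, assuming UTF-8 is acceptable for the file: write a unicode string through `codecs.open`, then read it back to confirm nothing was lost. The temporary directory and the sample text are choices made for this example only.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import codecs
import os
import tempfile

text = "абвгд\n"  # unicode string to store
path = os.path.join(tempfile.mkdtemp(), "write.txt")

# Write: the codec turns the unicode string into UTF-8 bytes.
writefile = codecs.open(path, "w", "utf-8")
try:
    writefile.write(text)
finally:
    writefile.close()

# Read back: the same codec turns the bytes into unicode again.
readfile = codecs.open(path, "r", "utf-8")
try:
    read_back = readfile.read()
finally:
    readfile.close()

print(read_back == text)
```

The important part is that the string passed to `write` is a unicode string, not a byte string that has already been encoded.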
Hagge
  • 314
  • 1
  • 5
2

You need to define file encoding if it contains non-ASCII chars.

http://www.python.org/dev/peps/pep-0263/
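
As a rough illustration of what that declaration looks like to the interpreter, the sketch below searches the first two lines of a source file for an encoding declaration, using the pattern quoted in the Python language reference. `declared_encoding` is a hypothetical helper, and it is simplified: the real tokenizer applies additional rules.

```python
import re

# Pattern for encoding declarations as given in the Python language reference.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(source_lines):
    """Return the encoding named in the first two lines, or None.

    Simplified sketch: only comment lines are inspected.
    """
    for line in source_lines[:2]:
        if line.lstrip().startswith("#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1)
    return None


with_declaration = ["#!/usr/bin/env python2", "# -*- coding: utf-8 -*-", "p = 1"]
without_declaration = ["p = 1", "print(p)"]

print(declared_encoding(with_declaration))
print(declared_encoding(without_declaration))
```

The declaration only tells Python how to decode the file. The editor still has to save the file in that same encoding.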

petraszd
  • It doesn't help! And even when I try to save Russian text in Notepad, it tells me that I can't save because I will lose my data. – Pol Jul 08 '10 at 14:28
  • Follow the advice given by Notepad and choose one of the Unicode encodings. – Philipp Jul 08 '10 at 14:54
  • Please, just do not code in Notepad. It is a really, REALLY bad idea. Try vim or emacs. If those two are too scary, try Notepad++ or Scite or something more sane than Notepad. – petraszd Jul 08 '10 at 15:19
  • @petraszd: We are only using Notepad here because of its Unicode support. – Philipp Jul 08 '10 at 16:31
1

What console are you using? Chances are, your console doesn't support that language. Make sure that your console supports Unicode (and that your app is sending Unicode strings).
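
One way to check this from Python is to look at the encoding the interpreter associates with standard output and test whether the text fits into it. This is a sketch under assumptions: `can_encode` is a hypothetical helper, and the code pages named in the comments (cp866 as a Russian console code page, cp437 as a US one) are examples, not a statement about the asker's machine.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import sys


def can_encode(text, encoding):
    """Return True if every character of `text` fits in `encoding`."""
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return False
    return True


sample = "абв"

# sys.stdout.encoding can be None (for example when output is piped on Python 2).
console_encoding = getattr(sys.stdout, "encoding", None) or "ascii"
print("stdout encoding:", console_encoding)
print("sample fits console encoding:", can_encode(sample, console_encoding))

# Example code pages: cp866 contains Cyrillic letters, cp437 does not.
print("fits cp866:", can_encode(sample, "cp866"))
print("fits cp437:", can_encode(sample, "cp437"))
```

If the sample does not fit the console encoding, printing it will either fail or produce replacement characters, depending on the error handling in effect.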

Update:

To address the update to your question regarding problems with Windows' Notepad: Click File->Save As, and then choose "Unicode" from the "Encoding" drop-down list.

bta
  • 1
    Which consoles did you try? What OS are you using? Can you successfully output Russian characters to your console using a programming language other than Python? – bta Jul 07 '10 at 21:20
  • I did not notice when it happened, but it started after I installed some extensions such as PYMSSQL and ODBC. Could they be the cause? – Pol Jul 07 '10 at 21:55
  • If the behavior changed after installing an extension, un-install the extension and see if the old behavior returns. It would not be unheard-of for an extension to introduce unexpected problems. – bta Jul 07 '10 at 22:35
  • @user375373: it would be really helpful if you answered bta's question. – Philipp Jul 08 '10 at 15:04
0

Are you typing in the console too, or only seeing the results in the console? This looks like a PEP 263 problem, as petraszd said.

print p.decode('your-system-encoding')

should work in the console (I don't know which encoding your system uses for Russian).

If you are using a .py file, you need to place # -*- coding: UTF-8 -*- at the top of the file (replacing UTF-8 with your Russian encoding). I think there is no need for the .decode in print if your OS is configured with the right encoding (at least I don't need it, but I don't know how it works with Russian).
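
To make the role of `.decode` concrete, here is a small sketch. It assumes cp1251 as the system encoding, which is common on Russian Windows installations but is only an assumption here, and it shows that decoding with the wrong codec raises no error and silently produces different letters.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

text = "абв"
raw = text.encode("cp1251")  # bytes as a cp1251 system would store them

# Decoding with the codec that produced the bytes restores the text.
decoded = raw.decode("cp1251")

# Decoding with another Cyrillic codec succeeds but yields other letters.
wrong = raw.decode("cp866")

print(decoded == text)
print(wrong == text)
```

This is why 'your-system-encoding' has to name the encoding the bytes are actually in, not just any encoding that supports Russian.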

laurent
  • When I'm typing, everything is OK. When I add # -*- coding: UTF-8 -*- I get the error: SyntaxError: Non-ASCII character '\xff' in file – Pol Jul 08 '10 at 15:14