The function returns places within a given radius using the Google Places API. To be exact, I use this library to handle the task.

The problem is that Cyrillic characters are displayed like this:

ÐО Сбербанк РоÑÑии, КиевÑкое отделение â„–14

I tried these suggestions. I also tried this:

pname = place.name
uni = unicode(place.name)

And this:

convertedname = pname.encode(encoding='UTF-8', errors='strict')

Nothing helped. What else can I try?

Elena
  • What is your wanted output? – Vincent Beltman Nov 27 '14 at 10:28
  • Ah, I should have mentioned that. Like this: Інтеграл Банк, Український індустріал, etc. The function should be able to return Russian, Ukrainian and English names – Elena Nov 27 '14 at 10:35
  • What are the actual bytes that you are trying to print? – tripleee Nov 27 '14 at 10:38
  • Like this: list(bytearray("надра")) gives [208, 189, 208, 176, 208, 180, 209, 128, 208, 176]. And why are there more bytes in the output than letters in the string? – Elena Nov 27 '14 at 10:57
  • That's not a very good resource you are linking to in the question, by the way. There's a reason the only answer has a downvote. This is not a very uncommon question; you should easily find dozens of duplicates with better answers. – tripleee Nov 27 '14 at 11:26
  • Also, obligatory reading: http://nedbatchelder.com/text/unipain.html – tripleee Nov 27 '14 at 11:27
  • Totally agree that the link is not one to be trusted; I actually tried a lot of other suggestions. Thanks for the link, I will read it for sure – Elena Nov 27 '14 at 12:43

2 Answers

list(bytearray("надра"))

[208, 189, 208, 176, 208, 180, 209, 128, 208, 176]

That's UTF-8. If your output terminal is set up for UTF-8, you should basically need no encoding or decoding at all. But the proper way to read that string is to use string.decode('utf-8') to turn it into a proper Unicode string, then encode it before output to whatever encoding your terminal supports (looks vaguely like ... code page 1250 or iso-8859-2?).
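The byte list above can be checked directly. The following is a minimal sketch in Python 3 syntax (the question uses Python 2, where the equivalent call is `some_str.decode('utf-8')`); the variable names are illustrative and not taken from the original code. It shows why the list has more entries than the word has letters: each of these Cyrillic letters occupies two bytes in UTF-8.

```python
# The byte values quoted in the question, as a bytes object.
raw = bytes(bytearray([208, 189, 208, 176, 208, 180, 209, 128, 208, 176]))

# Decoding interprets the bytes as UTF-8 and yields a Unicode string.
text = raw.decode("utf-8")

byte_count = len(raw)    # number of bytes in the encoded form
char_count = len(text)   # number of characters after decoding

print(byte_count, char_count)
```

The decoded string is the original word, and the byte count is twice the character count because every letter here lies outside the ASCII range.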

https://tripleee.github.io/8bit#0xd0 shows 208 (0xD0) being mapped to Đ in six different encodings, so I conjecture that you are using one of those. The rest is speculation on my part.
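To illustrate the conjecture, here is a small sketch (Python 3 syntax, illustrative names) of how this kind of garbling can arise when UTF-8 bytes are interpreted as a single-byte encoding. cp1250 is used only because it is one of the candidates guessed above; the encoding actually involved in the question is not known.

```python
original = "надра"

# Encode correctly as UTF-8, then decode with the wrong (single-byte) codec.
raw = original.encode("utf-8")
garbled = raw.decode("cp1250")

# Byte 0xD0 becomes U+0110 (LATIN CAPITAL LETTER D WITH STROKE) in cp1250.
first_char = garbled[0]

# For this word the mistake is reversible: undo the wrong decode, redo it as UTF-8.
repaired = garbled.encode("cp1250").decode("utf-8")

print(first_char == "\u0110", len(garbled), repaired == original)
```

Each byte turns into one unrelated character, so the garbled string is ten characters long instead of five.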

So, basically,

pname = place.name.decode('utf-8')

Apparently, you also need to encode it to some suitable output encoding for your console, or set up your console to properly support UTF-8. If indeed your terminal is presently set up for cp1250, it does not support Cyrillic output at all.
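Whether a given output encoding can represent Cyrillic at all can be tested before printing. This is a rough sketch in Python 3 syntax; `can_encode` is a hypothetical helper written for illustration, not part of any library, and the list of encodings is only an example.

```python
def can_encode(text, encoding):
    """Return True if every character of text can be encoded with encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


name = "надра"

# cp1250 is a Central European code page, cp1251 is the Cyrillic one.
results = {enc: can_encode(name, enc) for enc in ("cp1250", "cp1251", "utf-8")}

print(results)
```

If the console encoding fails this check, either the console has to be switched to an encoding that passes, or the text cannot be displayed faithfully there.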

tripleee

My terminal and browser encoding is UTF-8, and the problem occurred when displaying the text in the browser. It was solved after I uncommented these lines in my webapp2 .py file:

path = os.path.join(os.path.dirname(__file__), 'index.html')
self.response.out.write(template.render(path, template_values))

These lines handle the template rendering. Your answer helped me arrive at the solution. Thanks!

Elena