I apologize in advance, as I am not sure how to ask this. I am trying to use a Twitter API from Python. Here is the snippet of code giving me trouble:

trends = twitter.Api.GetTrendsCurrent(api)
print str(trends)

This returns:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-5: ordinal not in range(128)

When I attempt to .encode, the interpreter tells me I cannot encode a Trend object. How do I get around this?

Charlie
  • Are you using Python 2 or 3? How 'bout just `print trends`? – Matt Ball Oct 16 '15 at 00:14
  • When I try `print trends` I get the same kind of error as above: UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-5: ordinal not in range(128). This is with Python 2.7. – Charlie Oct 16 '15 at 00:15
  • Can you change your encoding from UTF-8 to Unicode? I suspect that you have a non-standard character in there. – Prune Oct 16 '15 at 00:32
  • @Prune do you mean using .encode? – Charlie Oct 16 '15 at 00:36
  • That's one way. There are also compiler directives to specify Unicode for the entire run, so you don't have to encode every character that doesn't fit the ASCII model. – Prune Oct 16 '15 at 00:42
  • .encode does not work as the data type is Trend, not string, but I will try the compiler way and let you know. Edit: Any idea how to do that in PyCharm? I am new! – Charlie Oct 16 '15 at 00:43

1 Answer

Simple answer:

Use repr, not str. It should work in most cases (unless the API itself is broken and that is where the error is being thrown from).

Long answer:

When you convert a Unicode string to a byte str (or vice versa) in Python 2 without naming an encoding, the conversion uses the ascii codec. This works as long as the text is pure ASCII and fails as soon as it is not, which is why edge cases like this one are such a pain. One of the big reasons for the break in backwards compatibility in Python 3 was to change this behavior.
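A minimal sketch of that conversion failing, written so it also runs on Python 3, where the encode step has to be spelled out. The sample text is a hypothetical trend name, not real API output, and `try_ascii` is a helper invented for this illustration; in Python 2, calling str() on a Unicode value performs the same ASCII encode behind the scenes.

```python
# Hypothetical trend name containing non-ASCII characters (not real API output).
trend_name = u"#caf\u00e9_\u6771\u4eac"


def try_ascii(text):
    """Encode text as ASCII; return the bytes, or a short message on failure."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as exc:
        return "failed: %s" % exc.reason


print(try_ascii(u"#python"))   # pure ASCII text encodes without trouble
print(try_ascii(trend_name))   # non-ASCII text triggers UnicodeEncodeError
```

The second call reports the same "ordinal not in range(128)" reason that appears in the traceback from the question.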

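Use latin1 for testing. It may not be the correct encoding, but decoding bytes as latin1 never fails, and encoding to it with an error handler such as 'replace' will not raise either (a bare .encode('latin1') still fails for characters above U+00FF). That gives you a jumping-off point for debugging this properly, so you can at least print something.

trends = twitter.Api.GetTrendsCurrent(api)
print type(trends)  # check what kind of object came back
print unicode(trends)  # may still fail if the console encoding is ASCII
print unicode(trends).encode('latin1', 'replace')  # unencodable characters become '?'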
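Latin-1 behaves differently when decoding and when encoding, which is easy to check in isolation. The sketch below uses made-up sample strings rather than Twitter data, and the variable names are only for illustration.

```python
# Latin-1 maps every byte 0x00-0xFF to a code point, so decoding never fails.
all_bytes = bytes(bytearray(range(256)))
decoded = all_bytes.decode("latin1")

# Encoding is stricter: only code points up to U+00FF fit.
accented = u"caf\u00e9"   # e-acute is U+00E9, inside Latin-1
cjk = u"\u6771\u4eac"     # two CJK characters, far outside Latin-1

accented_bytes = accented.encode("latin1")

try:
    cjk.encode("latin1")
    cjk_ok = True
except UnicodeEncodeError:
    cjk_ok = False

# With an error handler the encode succeeds, at the cost of losing characters.
cjk_replaced = cjk.encode("latin1", "replace")

print(len(decoded), accented_bytes, cjk_ok, cjk_replaced)
```

So latin1 is a safe choice for turning unknown bytes into text, while the encode direction needs an error handler to be safe.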

Or, better yet, force the encoder to ignore or replace anything it cannot represent:

trends = twitter.Api.GetTrendsCurrent(api)
print type(trends)
print unicode(trends)
print unicode(trends).encode('utf8', 'xmlcharrefreplace')
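To see what the different error handlers actually do, here is a small comparison against the ascii codec, where the handlers are guaranteed to kick in. The sample text is hypothetical. With utf8 as the target, the handler rarely has anything to do, since UTF-8 can represent essentially every Unicode character.

```python
text = u"caf\u00e9 \u6771\u4eac"  # hypothetical sample text, not API output

# Encode the same text to ASCII with each standard error handler.
handlers = ("ignore", "replace", "xmlcharrefreplace", "backslashreplace")
results = {name: text.encode("ascii", name) for name in handlers}

for name in handlers:
    print(name, results[name])

# UTF-8 round-trips the text unchanged, so no handler is needed there.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.decode("utf-8") == text)
```

'ignore' drops the offending characters, 'replace' substitutes '?', and the other two keep the information in an escaped form, which is usually more useful while debugging.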

Chances are, since you are dealing with a web-based API, you are dealing with UTF-8 data anyway; it is pretty much the default encoding across the board on the web.

eestrada
  • ThePentium's answer above worked. My IDE was not using the correct encoding format. Thank you for your tips though! – Charlie Oct 18 '15 at 02:13