Unreadable Unicode output in Python 2

Question

I tried to use this code from the NLP UPC research group to retrieve synonyms for entered words. When I ran the test method

def test():
    "tests some functions"
    a=wn.get_words(True)
    print  'length of a: ', len(a)
    print 'a[0]: ', a[0].tostring().decode('utf-8')

the output is printed in an unreadable encoding:

length of a:  16043
a[0]:  �����

In the same code, encoding to UTF-8 is already handled by

def _encode(data):
    return data.encode('utf8')

and the IDE I am using (NetBeans 7.2.1) is configured to support UTF-8 encoding.

How can I solve this problem?

Use `repr(a[0].tostring())` instead of `a[0].tostring().decode('utf-8')` and see what gets returned. — Blender, Jan 04 '13 at 12:28
Thank you for your suggestion, but I still have the same problem. The output is now: Traceback (most recent call last): File "AWN.py", line 402, in test print 'a[0]: ', repr(a[0].tostring()) AttributeError: 'unicode' object has no attribute 'tostring' — Abrial, Jan 04 '13 at 16:30

score 1 · Answer 1 · answered Jan 04 '13 at 12:28

If your setup is already configured to handle UTF-8, you do not need to decode your string to a Unicode object. If you do decode it, Python re-encodes the Unicode object when printing, using the encoding it detected for sys.stdout, which is not necessarily UTF-8.

Try not decoding:

print 'a[0]: ', a[0].tostring()
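
To illustrate the point about sys.stdout, here is a minimal sketch of what printing a Unicode object roughly amounts to. The byte string `raw` is a hypothetical stand-in for the data in the question, and the `'replace'` error handler is used only so the sketch never raises; a plain `print` in Python 2 would raise UnicodeEncodeError instead. The code is written to run on both Python 2 and Python 3.

```python
from __future__ import print_function
import sys

# Hypothetical UTF-8 byte string standing in for the data in the question.
raw = b'\xd9\x83\xd8\xaa\xd8\xa7\xd8\xa8'

# Decoding turns the bytes into a Unicode object.
text = raw.decode('utf-8')

# Printing a Unicode object encodes it again with the encoding detected
# for sys.stdout; when none is detected, Python 2 falls back to ASCII.
target = sys.stdout.encoding or 'ascii'
shown = text.encode(target, 'replace')

# If the target is not UTF-8, the round trip does not restore the bytes.
ascii_view = text.encode('ascii', 'replace')
utf8_view = text.encode('utf-8')

print('stdout encoding:', target)
print('as ASCII:', repr(ascii_view))
print('as UTF-8:', repr(utf8_view))
```

If the detected encoding is already UTF-8, decoding and printing gives the same bytes as printing the undecoded string, which is why the decode step is unnecessary.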

Martijn Pieters


score 0 · Answer 2 · answered Jan 05 '13 at 06:10

Thank you for the answers. I used this statement instead, and it worked for me:

print 'a[0]: ', a[0].encode('utf-8')
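
A likely reason this works, judging from the AttributeError quoted in the comments, is that a[0] is already a unicode object, so it needs encoding rather than decoding. The sketch below shows that idea with a hypothetical helper `to_utf8` and a made-up sample word; neither comes from the original code. It runs on both Python 2 and Python 3.

```python
from __future__ import print_function

# Made-up sample standing in for a[0]; the real value comes from
# wn.get_words(True) in the original code.
word = u'\u0643\u062a\u0627\u0628'


def to_utf8(value):
    """Return UTF-8 bytes for a Unicode object; pass byte strings through.

    Hypothetical helper, not part of the original module.
    """
    if isinstance(value, bytes):  # bytes is an alias of str in Python 2
        return value
    return value.encode('utf-8')


encoded = to_utf8(word)
already_bytes = to_utf8(encoded)

print('utf-8 bytes:', repr(encoded))
```

In Python 2, `print 'a[0]: ', to_utf8(a[0])` would then write the UTF-8 bytes directly, without relying on the encoding detected for sys.stdout.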

Abrial
