I'm working with Python 2.7 on a script that separates the words in Chinese sentences (which have no spaces between words). I have several problems that I guess are related to encoding:
If I run this simple assignment in a script it works just fine, but in the shell I get:
>>> sentence= '我每天学习'
Unsupported characters in input
Also, for some reason, when I remove characters one at a time from the end of the sentence toward the beginning, once only one character is left ('我') what I get in its place is ' 我 '.
The loop I'm using to shorten the sentence, dropping the last character each time, is this:
for i in range(num_characters // 3):
    # drop 3 more bytes (one character) from the end on each pass
    temp = sentence[:num_characters - i * 3]
where num_characters is the number of characters times 3 (that is, the length of the sentence in bytes, since each of these characters takes 3 bytes in UTF-8), and temp is the shortened sentence I'm analyzing.
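For comparison, here is a minimal sketch of the same shortening done on decoded text instead of raw bytes, so that each slice step removes exactly one character. It assumes the sentence really is UTF-8 encoded and that the code lives in a script saved as UTF-8; shrinking_prefixes is a hypothetical helper name used only for illustration, not something from my script.

```python
# -*- coding: utf-8 -*-
# Sketch only: decode the UTF-8 bytes once, then slice by character, not by byte.

def shrinking_prefixes(text):
    """Return the prefixes of `text`, dropping one character from the end each time."""
    return [text[:end] for end in range(len(text), 0, -1)]

# Stand-in for the byte string a plain '...' literal gives in a Python 2 script.
raw = u'我每天学习'.encode('utf-8')

# After decoding, the sequence holds one item per character instead of per byte.
sentence = raw.decode('utf-8')

prefixes = shrinking_prefixes(sentence)
```

With this approach the "times 3" bookkeeping is no longer needed, because len(sentence) counts characters while len(raw) counts bytes.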
I'm using UTF-8 encoding in the script, and in theory IDLE is using UTF-8 as well, so I'm kind of lost. Any kind of help would be appreciated.