
In Python, I'm stuck with a couple of French strings whose accented characters I can't convert back to normal, e.g.:

word1 = 'install=C3=A9' # should be installé
word2 = 'transf=E9r=E9' # should be transféré
word3 = 'bient=C3=B4t'  # should be bientôt

Most of the documentation I have read says to open the file with encoding='utf-8' or similar, but here I'm stuck with actual strings. Is there a way to decode these strings, or should I build a giant chain of .replace() calls?

snakecharmerb
Waroulolz

1 Answer

The encoding seems to be quoted-printable, which Python's standard quopri module can decode.

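import quopri

word1 = 'install=C3=A9'
# Turn each =XX escape back into the raw byte it stands for
raw_bytes = quopri.decodestring(word1)
# Interpret those bytes as UTF-8 text
text = raw_bytes.decode('utf-8')
print(text)  # installé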

Strictly speaking, quopri.decodestring() is documented as taking bytes, so it is cleaner to declare the words as bytes literals:

word1 = b'install=C3=A9'
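
Note that the second word in the question, 'transf=E9r=E9', does not appear to be UTF-8: a lone E9 byte is how Latin-1 (ISO-8859-1) encodes é, so decoding it as UTF-8 would raise UnicodeDecodeError. A minimal sketch that handles all three words, assuming the data only ever mixes UTF-8 and Latin-1 (the helper name decode_qp and the fallback order are my own choices, not part of quopri):

```python
import quopri


def decode_qp(text, encodings=("utf-8", "latin-1")):
    """Decode a quoted-printable str or bytes, trying each charset in order.

    Hypothetical helper: the list of candidate encodings is an assumption
    about the data, not something quopri can detect.
    """
    # quopri works on bytes; the escaped form itself is plain ASCII
    data = text.encode("ascii") if isinstance(text, str) else text
    raw = quopri.decodestring(data)
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next charset
    raise ValueError(f"could not decode {raw!r} with any of {encodings}")


for word in ("install=C3=A9", "transf=E9r=E9", "bient=C3=B4t"):
    print(decode_qp(word))
```

UTF-8 is tried first because it rejects invalid byte sequences, whereas Latin-1 accepts any byte and would silently turn UTF-8 input into mojibake such as 'installÃ©'. If the strings come from e-mail headers or bodies, the charset declared there is a more reliable source than guessing.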
Thomas Weller