
I'm trying to read some information from an Excel file using the xlrd module. This works fine most of the time, but when the script encounters any Scandinavian letters, it stops. I've read several posts about unicode and encoding, but I must admit I'm not familiar with the topic.

The cell I'm reading contains text (a string) and is read as unicode (as is normal with xlrd). One example of a value that fails is Glørmestervej, which xlrd reads as u'Gl\xf8mestervej'. If I try to print the variable, the script stops. I've had the most success by encoding the value with latin1:

print cellValue.encode("latin1")

which gives the result Glormestervej, but with a KeyError.

How do I get the variable to become a string with ø instead of \xf8? The reason is that I need to use it as an input to another service and it does not seem to work using unicode.

Regards, Torbjørn

  • `u'Gl\xf8mestervej' == u'Glømestervej'` – Stop harming Monica Oct 18 '17 at 14:10
  • You may find this article helpful: [Pragmatic Unicode](http://nedbatchelder.com/text/unipain.html), which was written by SO veteran Ned Batchelder. BTW, if you migrate to Python 3 you'll get much saner Unicode handling. – PM 2Ring Oct 18 '17 at 14:25
  • With `s=u'Gl\xf8mestervej'` then `print s` _should_ work correctly to output `Glømestervej`, if your terminal is using an encoding (and font) that can handle `ø`. But you can also encode it explicitly with `print s.encode('utf-8')`, assuming that your terminal is correctly set to use UTF-8 encoding. – PM 2Ring Oct 18 '17 at 14:32
  • @PM2Ring Thank you for the article. Unfortunately, it's not possible to migrate to Python 3 at the moment, since ArcGIS (in which the script is run) doesn't support it yet. With encode('utf-8'), print displays `GlÃ¸rmestervej`, and without encoding the program just stops. No exception is thrown. – Torbjørn Dallokken Oct 18 '17 at 14:34
  • "just stops" is odd. Assuming Windows, `print u'Gl\xf8mestervej'` should raise a traceback if the character isn't supported in the default encoding. I installed `xlrd`, created a worksheet with that word in it, read it with `xlrd`, printed it, and got a traceback. I switched the console encoding (via the shell `chcp` command in Windows) to one that supports `ø` and it printed correctly. The problem must be in code not shown. – Mark Tolonen Oct 18 '17 at 16:29
  • Thanks for your comments, @MarkTolonen. I've found something interesting. I've been using Visual Studio Python Development Tools for writing and debugging the script, and the output there is wrong. Now I've tried running the script in IDLE (included with ArcGIS) and it runs correctly. – Torbjørn Dallokken Oct 19 '17 at 06:43
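
The points made in the comments can be tried out with a small sketch. It is not code from the thread: `cell_value` stands in for the value xlrd returned, and `encode_for_console` is a hypothetical helper name. It shows that `\xf8` is only the repr of `ø`, that the bytes depend on the chosen codec, and how output like `Ã¸` arises when UTF-8 bytes are interpreted with a one-byte codec. It should run unchanged on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Stand-in for the unicode value xlrd returned in the question.
cell_value = u'Gl\xf8mestervej'

# \xf8 is only how repr() spells the character; the string really contains it.
char_name = unicodedata.name(cell_value[2])

# Encoding turns text into bytes, and the byte values depend on the codec.
utf8_bytes = cell_value.encode('utf-8')     # the letter becomes two bytes: C3 B8
latin1_bytes = cell_value.encode('latin1')  # the letter becomes one byte: F8

# Interpreting UTF-8 bytes with a one-byte codec gives two wrong characters
# in place of the letter, which is what a mis-configured console displays.
mojibake = utf8_bytes.decode('latin1')


def encode_for_console(text, encoding):
    """Hypothetical helper: encode text for a console with the given encoding.

    Characters the encoding cannot represent become '?' instead of raising
    UnicodeEncodeError.
    """
    return text.encode(encoding, 'replace')


ascii_safe = encode_for_console(cell_value, 'ascii')

print(char_name)
print(repr(utf8_bytes))
print(repr(latin1_bytes))
print(repr(ascii_safe))
```

Which of these byte strings displays as `ø` depends on the encoding the console or IDE output window expects, which would explain why the same script can look wrong in one environment and correct in another.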

1 Answer

I'm happy to say the problem has been solved; in fact, there was no error after all. There were some permission issues with the user that I used for calling the service in which the variable was used. Thank you for your responses!