5

Seems there are a lot of posts on this subject and my solution is in line with what the most common answer seems to be, however I'm encountering an encoding error that I don't know how to address.

>>> def Excel2CSV(ExcelFile, SheetName, CSVFile):
     import xlrd
     import csv
     workbook = xlrd.open_workbook(ExcelFile)
     worksheet = workbook.sheet_by_name(SheetName)
     csvfile = open(CSVFile, 'wb')
     wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

     for rownum in xrange(worksheet.nrows):
         wr.writerow(worksheet.row_values(rownum))

     csvfile.close()

>>> Excel2CSV(r"C:\Temp\Store List.xls", "Open_Locations", 
              r"C:\Temp\StoreList.csv")

Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
Excel2CSV(r"C:\Temp\Store List.xls", "Open_Locations", r"C:\Temp\StoreList.csv")
File "<pyshell#1>", line 10, in Excel2CSV
wr.writerow(worksheet.row_values(rownum))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 14:
ordinal not in range(128)
>>>

Any help or insight is greatly appreciated.

MrBubbles
  • 405
  • 2
  • 7
  • 19

3 Answers3

11

As @davidism points out, the Python 2 csv module doesn't work with unicode. You can work around this by converting all of your unicode objects to str objects before submitting them to csv:

def Excel2CSV(ExcelFile, SheetName, CSVFile):
     import xlrd
     import csv
     workbook = xlrd.open_workbook(ExcelFile)
     worksheet = workbook.sheet_by_name(SheetName)
     csvfile = open(CSVFile, 'wb')
     wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

     for rownum in xrange(worksheet.nrows):
         wr.writerow(
             list(x.encode('utf-8') if type(x) == type(u'') else x
                  for x in worksheet.row_values(rownum)))

     csvfile.close()
Robᵩ
  • 163,533
  • 20
  • 239
  • 308
  • In Python3 (after updating the range), I get this error: **TypeError: a bytes-like object is required, not 'str'** – PM0087 Feb 26 '19 at 08:46
  • this tripped me up as well for a minute. The line `csvfile = open(CSVFile, 'wb')` should read `csvfile = open(CSVFile, 'w')`. It's one of the annoying little 2 - 3 translations. – EconomySizeAl Aug 20 '19 at 15:45
2

The Python 2 csv module has some problems with unicode data. You can either encode everything to UTF-8 before writing, or use the unicodecsv module to do it for you.

First pip install unicodecsv. Then, instead of import csv, just import unicodecsv as csv. The API is the same (plus encoding options), so no other changes are needed.

davidism
  • 121,510
  • 29
  • 395
  • 339
1

Another fashion for doing this: cast to string, so as you have a string, you may codify it as "utf-8".

str(worksheet.row_values(rownum)).encode('utf-8')

The whole function:

def Excel2CSV(ExcelFile, SheetName, CSVFile):
     import xlrd
     import csv
     workbook = xlrd.open_workbook(ExcelFile)
     worksheet = workbook.sheet_by_name(SheetName)
     csvfile = open(CSVFile, 'wb')
     wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

     for rownum in xrange(worksheet.nrows):
         wr.writerow(str(worksheet.row_values(rownum)).encode('utf-8'))

     csvfile.close()
evinhas
  • 179
  • 1
  • 4