I'm trying to achieve the following:
Input: xls file. Output: csv file.
I want to read the xls and do some manipulations: rewriting the headers (original: customernumer, csv needs Customer_Number__c), removing some columns, etc.
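To make the intended manipulation concrete, here is a minimal sketch of the header rewrite and column removal. The names `HEADER_MAP`, `DROP_COLUMNS` and `transform_rows`, and the `internal_note` column, are hypothetical and only for illustration; only the customernumer to Customer_Number__c mapping comes from my actual requirements.

```python
# Hypothetical mapping from original xls headers to the headers the csv needs.
HEADER_MAP = {"customernumer": "Customer_Number__c"}

# Hypothetical set of original column names that should not end up in the csv.
DROP_COLUMNS = {"internal_note"}


def transform_rows(rows):
    """Rename headers and drop unwanted columns.

    `rows` is a list of lists whose first entry is the header row,
    i.e. the shape you get from collecting sheet.row_values(i) per row.
    """
    if not rows:
        return []
    header = rows[0]
    # Indices of the columns to keep, in their original order.
    keep = [i for i, name in enumerate(header) if name not in DROP_COLUMNS]
    # Headers without an entry in HEADER_MAP are passed through unchanged.
    new_header = [HEADER_MAP.get(header[i], header[i]) for i in keep]
    body = [[row[i] for i in keep] for row in rows[1:]]
    return [new_header] + body
```

The sketch works on plain lists, so it is independent of how the rows are read or written.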
Right now I'm already reading the xls and trying to write it out as csv (without any manipulations), but I'm struggling with the encoding. The original file contains some "special" characters like "/", "\", and, most importantly, "ä, ü, ö, ß".
I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 8: ordinal not in range(128)
I have no way of knowing in advance which special characters a file will contain; this changes from time to time.
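As far as I understand it, the error boils down to the snippet below: the text cells arrive as unicode strings, and somewhere they get converted with the ASCII codec, which cannot represent "ä" (u'\xe4'). The variable names are just for illustration. The same character encodes without complaint as UTF-8, which should cover whatever characters show up.

```python
text = u"\xe4"  # the character "ä" from the error message

# Encoding with ASCII fails, because "ä" is outside the 0-127 range.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent every unicode character, so this always succeeds.
utf8_bytes = text.encode("utf-8")
```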
Here is my current sandbox code:
# -*- coding: utf-8 -*-
__author__ = 'adieball'

import xlrd
import csv
from os import sys
import argparse


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("inname", type=str,
                        help="Names of the Input File in single quotes")
    parser.add_argument("--outname", type=str,
                        help="Optional enter the name of the output (csv) file. if nothing is given, "
                             "we use the name of the input file and add .csv to it")
    args = parser.parse_args()
    if args.outname is None:
        outname = args.inname + ".csv"
    else:
        outname = args.outname

    wb = xlrd.open_workbook(args.inname)
    xl_sheet = wb.sheet_by_index(0)
    print args.inname
    print ('Retrieved worksheet: %s' % xl_sheet.name)
    print outname

    output = open(outname, 'wb')
    wr = csv.writer(output, quoting=csv.QUOTE_ALL)
    for rownum in xrange(wb.sheet_by_index(0).nrows):
        wr.writerow(wb.sheet_by_index(0).row_values(rownum))
    output.close()
Is there anything I can do here to make sure these special characters get written to the csv exactly as they appear in the original xls?
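One direction I have been considering, assuming UTF-8 is an acceptable encoding for the csv: encode every text cell myself before handing the row to the Python 2 csv writer, and leave numbers and other values untouched. `encode_row` is a hypothetical helper name, and I'm not sure this is the right approach.

```python
def encode_row(row, encoding="utf-8"):
    """Return a copy of `row` with every unicode cell encoded to bytes.

    Non-text cells (floats, ints, ...) are passed through unchanged.
    """
    text_type = type(u"")  # this is `unicode` on Python 2
    return [cell.encode(encoding) if isinstance(cell, text_type) else cell
            for cell in row]
```

In the loop of my script this would be used as `wr.writerow(encode_row(xl_sheet.row_values(rownum)))`, with the output file still opened in 'wb' mode.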
thanks
andre