I have a CSV file encoded as UTF-16LE. Its content is German, so it contains many accented characters (umlauts). I upload it to Google App Engine (GAE) with Python via the Blobstore, build a list from the csv.reader object, manipulate the list, and write the result out as another CSV. When I run chardet on the list I have built, it reports the encoding as ASCII. In addition, the accented characters are missing from the output file.
I am relatively new to Python and still learning, and I have read a lot about encoding and decoding. However, after spending a very long time on this problem I still cannot find a solution. I would really appreciate any help!
Code to import:
import csv
import webapp2
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers

class MainHandler(webapp2.RequestHandler):
    def get(self):
        # Render a simple upload form that posts to the Blobstore upload URL.
        upload_url = blobstore.create_upload_url('/upload')
        self.response.out.write('<html><body>')
        self.response.out.write('<form action="%s" method="POST" enctype="multipart/form-data">' % upload_url)
        self.response.out.write("""Entity: <input type="text" name="entity"><br>Upload File: <input type="file" name="file1"><br><input type="submit"
name="submit" value="Submit"> </form></body></html>""")

class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        upload_files = self.get_uploads('file1')  # 'file1' is the file upload field in the form
        blob_info = upload_files[0]
        blob_reader = blobstore.BlobReader(blob_info.key())
        blob_iterator = BlobIterator(blob_reader)  # BlobIterator is a helper class not shown here
        # Strip NUL bytes from each line, then parse it as tab-delimited CSV.
        file = csv.reader((x.replace('\0', '') for x in blob_iterator), skipinitialspace=True, delimiter='\t')
        file2 = []  # initialised here so the snippet is complete
        for row in file:
            if row:  # skip empty rows
                file2.append(row)
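
The `x.replace('\0', '')` step above removes the zero bytes that UTF-16LE stores alongside every character below U+0100, rather than decoding them. For comparison, a minimal sketch of decoding the bytes first and only then handing the text to the CSV reader might look like the following. It is not the code above: it assumes Python 3 (where `csv` works on text), assumes the whole file fits in memory, and uses a hypothetical helper name, `read_utf16le_tsv`. On the Python 2.7 runtime the `csv` module works on byte strings, so the decoded text would have to be re-encoded (for example as UTF-8) before parsing.

```python
import csv
import io


def read_utf16le_tsv(raw_bytes):
    """Decode UTF-16LE bytes to text, then parse tab-delimited rows.

    Hypothetical helper for illustration; not part of the original code.
    """
    text = raw_bytes.decode("utf-16-le")
    # Drop the byte order mark if the file starts with one.
    if text.startswith("\ufeff"):
        text = text[1:]
    # newline="" leaves line endings untouched so csv can handle them itself.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter="\t", skipinitialspace=True)
    # Skip empty rows, as the upload handler's loop does.
    return [row for row in reader if row]


# Sample bytes standing in for the uploaded blob's contents (an assumption).
sample = "\ufeffName\tOrt\r\nMüller\tKöln\r\n".encode("utf-16-le")
rows = read_utf16le_tsv(sample)
```

With this approach the accented characters survive as ordinary text in `rows`, because the two-byte units are interpreted as UTF-16LE instead of being partially deleted.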
Code to export:
# Inside the export handler's request method:
self.response.headers['Content-Type'] = 'application/csv'
self.response.headers['Content-Disposition'] = 'attachment; filename=output.csv'
wr = csv.writer(self.response.out, quoting=csv.QUOTE_ALL)
for row in finallist:
    wr.writerow(row)
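
On the export side, the output bytes need an explicit encoding if accented characters are to survive. A minimal sketch, again assuming Python 3 and using a hypothetical helper name, `rows_to_csv_bytes`, could serialise the rows to text and then encode them in one step. The `utf-8-sig` default is an assumption made here so that the output starts with a byte order mark. On Python 2.7 the `csv` writer expects byte strings, so each cell would need to be encoded before calling `writerow`.

```python
import csv
import io


def rows_to_csv_bytes(rows, encoding="utf-8-sig"):
    """Serialise rows to CSV text, then encode the text to bytes.

    Hypothetical helper for illustration; not part of the original code.
    """
    # newline="" stops StringIO from translating the writer's "\r\n" endings.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    return buffer.getvalue().encode(encoding)


# Example rows standing in for `finallist` (an assumption).
payload = rows_to_csv_bytes([["Müller", "Köln"]])
```

The resulting `payload` could then be written to the response body, ideally with the same encoding declared in the `Content-Type` header (for example `text/csv; charset=utf-8`).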