
I have a CSV file encoded as UTF-16LE. The content is German, so it contains many accented characters such as umlauts. I upload it to GAE with Python using the Blobstore, build a list from the csv.reader object, manipulate the list, and write another list out as CSV. When I run 'chardet' on the list I created, it reports ASCII. In addition, the accented characters disappear from the output file.

As a relative newcomer to Python I am still learning, and I have read a lot about encoding and decoding. However, after spending a very long time on the problem I still cannot find a solution. I would really appreciate any help!

CODE BELOW:

TO IMPORT:

class MainHandler(webapp2.RequestHandler):
  def get(self):

    upload_url = blobstore.create_upload_url('/upload')
    self.response.out.write('<html><body>')
    self.response.out.write('<form action="%s" method="POST" enctype="multipart/form-data">' % upload_url)
    self.response.out.write("""Entity: <input type="text" name="entity"><br>Upload File: <input type="file" name="file1"><br><input type="submit"
        name="submit" value="Submit"> </form></body></html>""")

class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
  def post(self):
    upload_files = self.get_uploads('file1')  # 'file1' is the file upload field in the form
    blob_info = upload_files[0]
    blob_reader = blobstore.BlobReader(blob_info.key())
    blob_iterator = BlobIterator(blob_reader)
    file = csv.reader((x.replace('\0', '') for x in blob_iterator),skipinitialspace=True, delimiter='\t')

    for row in file:
        if row:
            file2.append(row)
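
A note on the `replace('\0', '')` step above: UTF-16LE stores each basic character as two bytes, so removing the zero bytes hides the "NULL byte" error without actually decoding the text. A minimal sketch of decoding first and parsing afterwards is shown here in Python 3 syntax. The helper name `read_utf16_tsv` and the sample data are hypothetical, and it assumes the whole upload fits in memory (for example via `blob_reader.read()`). On the Python 2 runtime the `csv` module works on byte strings, so one would probably re-encode the decoded text to UTF-8 before parsing and decode each cell afterwards.

```python
import csv
import io


def read_utf16_tsv(raw_bytes):
    """Decode UTF-16LE bytes and parse them as tab-delimited rows.

    Hypothetical helper for illustration; not part of the original code.
    """
    text = raw_bytes.decode('utf-16-le')
    # Drop a leading byte order mark if the file has one.
    if text.startswith('\ufeff'):
        text = text[1:]
    reader = csv.reader(io.StringIO(text, newline=''),
                        delimiter='\t', skipinitialspace=True)
    # Skip empty rows, as the original loop does.
    return [row for row in reader if row]


# Made-up sample standing in for the uploaded blob's contents.
sample = 'Name\tStadt\r\nJürgen\tKöln\r\n'.encode('utf-16-le')
rows = read_utf16_tsv(sample)
```

With this approach the accented characters are kept as real text rather than as leftover single bytes.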

--
AND CODE TO EXPORT:

    self.response.headers['Content-Type'] = 'application/csv'
    self.response.headers['Content-Disposition'] = 'attachment; filename=output.csv'
    wr = csv.writer(self.response.out, quoting=csv.QUOTE_ALL)

    for row in finallist:
        wr.writerow(row)
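
For the export side, one possible approach is to pick an explicit output encoding instead of relying on a default. The sketch below, again in Python 3 syntax, builds the CSV in memory and encodes it as UTF-8 with a byte order mark. The helper name `rows_to_csv_bytes` and the choice of `utf-8-sig` are assumptions for illustration, not something the original code does. On Python 2 the rough equivalent would be encoding each unicode cell to UTF-8 before calling `writerow`.

```python
import csv
import io


def rows_to_csv_bytes(rows, encoding='utf-8-sig'):
    """Serialise rows to CSV text, then encode with an explicit encoding.

    Hypothetical helper; 'utf-8-sig' prepends a BOM, which some
    spreadsheet programs use to detect UTF-8.
    """
    buf = io.StringIO(newline='')
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    return buf.getvalue().encode(encoding)


# Made-up row standing in for the entries of finallist.
payload = rows_to_csv_bytes([['Jürgen', 'Köln']])
```

The resulting bytes could then be written to the response, ideally with a matching charset declared in the Content-Type header.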
PAB445
  • Why are you removing the `\0` bytes from your UTF-16 data? – Martijn Pieters Nov 05 '14 at 18:02
  • Otherwise I get a “Line contains NULL byte” error. I have found ways of re-encoding the files without using the blobstore but can't find ways of doing it when using blobstore – PAB445 Nov 06 '14 at 09:26
  • Hi, I asked this question again in the duplicate but got told to start a new question which I had previously done. Please can you not mark this as a duplicate. Thanks. – PAB445 Nov 18 '14 at 06:05
