In my Django application, users can upload CSV files to import data. This works fine for UTF-8 files with CRLF line endings.

But there are two issues:

  1. When the file is not encoded as UTF-8, I keep getting "'utf8' codec can't decode byte 0xdc in position 393: invalid continuation byte". I tried to resolve that with the following code:

    import codecs
    import csv

    # wrap the uploaded file, passing "utf-8" as the data encoding
    file = codecs.EncodedFile(request.FILES['import'], "utf-8")
    dialect = csv.Sniffer().sniff(file.read(2048))
    file.open()  # seek to 0

    reader = csv.reader(file, dialect=dialect)
    
  2. When the file uses CR line breaks, either they are not recognized or I get "new-line character seen in unquoted field - do you need to open the file in universal-newline mode?". But the InMemoryUploadedFile is already an open file object, so I cannot choose the mode it is opened in.

My issue is very similar to the question linked below, but the solution given there for point 1 didn't work for me (as you can see, my code is nearly the same), and point 2 isn't answered there at all:

Proccessing a Django UploadedFile as UTF-8 with universal newlines
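
To make both points concrete, here is a minimal sketch of the behaviour I am after, not a confirmed solution. It assumes Python 3 and works on the raw bytes of the upload (for example `request.FILES['import'].read()`). The function name `read_uploaded_csv` and the list of candidate encodings are hypothetical, and the latin-1 fallback is only a guess at the real encoding, since latin-1 accepts any byte sequence.

```python
import csv
import io


def read_uploaded_csv(raw_bytes, encodings=("utf-8-sig", "latin-1")):
    """Decode uploaded bytes and return the parsed CSV rows as lists.

    Hypothetical helper: the encodings are tried in order, and latin-1 is a
    last-resort guess because it never raises a decode error.
    """
    for encoding in encodings:
        try:
            text = raw_bytes.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("none of the candidate encodings could decode the file")

    # Give the sniffer a sample with LF line endings only, so that a CR-only
    # file is not treated as one long line while guessing the delimiter.
    sample = text[:2048].replace("\r\n", "\n").replace("\r", "\n")
    dialect = csv.Sniffer().sniff(sample)

    # newline="" keeps the original line endings and lets the csv module
    # split records on CR, LF or CRLF itself.
    stream = io.StringIO(text, newline="")
    return list(csv.reader(stream, dialect=dialect))
```

With something like this, a CR-only file that is not valid UTF-8 and a CRLF file in UTF-8 would both end up as the same kind of row list.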

Daniel K.