
I'm having problems uploading lots of files in Django. The context is the following: I have a spreadsheet in which one or more columns hold image filenames; those images are uploaded through a form with input type=file and the multiple option.

With few lines, say 70, everything goes fine. But with more lines, and consequently more images, an IOError occurs at random positions.

I've checked several questions about file/image upload in Django but couldn't find any that is related to my problem.

The model I'm using is the Product model of LFS (www.getlfs.com). We are developing a system based on LFS and, to facilitate the creation of dozens of products in batch, we wrote some views and templates that receive the main product properties through a spreadsheet. Each line is a product and the columns are the desired properties.

LFS uses a custom class ImageWithThumbsField(ImageField) to store the product's image, and when the product instance (built from the spreadsheet) is saved, all thumbnails are generated. This is a time-consuming (CPU-bound) task, and my initial guess is that for some reason the temporary file is deleted before all processing has occurred.

Is there a way to keep these uploaded files for more time? Any other approach suggested to be able to process hundreds of uploaded files? Any hints on what can be happening?
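
One way to keep the uploads around for longer would be to copy each one to a directory you control before any thumbnail work starts. A minimal sketch, assuming each uploaded object exposes name and chunks() the way Django's UploadedFile does; persist_uploads and the target directory are hypothetical names, not part of LFS:

```python
import os


def persist_uploads(uploaded_files, target_dir):
    """Copy each upload to target_dir and return a {filename: path} mapping.

    Raises ValueError for an empty upload, so the problem surfaces here
    rather than later, when the image library opens a 0-byte file.
    """
    os.makedirs(target_dir, exist_ok=True)
    saved = {}
    for upload in uploaded_files:
        # Drop any directory part of the client-supplied name.
        filename = os.path.basename(upload.name)
        path = os.path.join(target_dir, filename)
        written = 0
        with open(path, "wb") as destination:
            # chunks() yields the content piece by piece instead of
            # loading the whole file at once.
            for chunk in upload.chunks():
                destination.write(chunk)
                written += len(chunk)
        if written == 0:
            os.remove(path)
            raise ValueError("empty upload: %s" % filename)
        saved[filename] = path
    return saved
```

In a view this might be called as persist_uploads(request.FILES.getlist("images"), some_dir), where "images" is whatever name the file input carries in the form. The products would then be built from the saved paths rather than from the request's temporary files.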

I hope my question is clear. I can post code if needed.

Links to relevant portions of LFS code:

  • where thumbnails are generated:

https://github.com/diefenbach/django-lfs/blob/master/lfs/core/fields/thumbs.py

  • product model

https://github.com/diefenbach/django-lfs/blob/master/lfs/catalog/models.py

Thanks in advance!

momenezes
  • Could you please post a traceback for the error you're getting? And how are you running your application? Via a web server, or does Django's devserver also throw the error? – ilvar Mar 04 '12 at 02:41
  • During the next four days, part of the traceback will be here http://dpaste.com/711195/ – momenezes Mar 04 '12 at 19:03

1 Answer


It sounds like you are running out of memory. When Django processes uploads, until the form is validated all of the files are either:

  • kept in memory inside the Python/WSGI process/worker (the usual mode of operation for runserver).

    In this case, you are uploading enough photos to fill up the process memory and run out of space. As you can imagine, where the IOError happens will be non-deterministic (GC dependent).

  • temporarily stored in /tmp/ (the usual setup for Apache).

    In this case, the webserver's ramfs is full of images that have not yet been written to disk, and the IOError should occur around the same place each time.
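
For reference, stock Django picks between these two modes per file: an upload up to FILE_UPLOAD_MAX_MEMORY_SIZE (2.5 MB by default) stays in memory, and a larger one is streamed to a temporary file. A settings sketch that sends every upload to disk instead; the directory path is an assumption and has to exist and be writable by the server process:

```python
# settings.py (fragment)

# Use only the disk-backed handler, so no upload is held in process memory.
FILE_UPLOAD_HANDLERS = [
    "django.core.files.uploadhandler.TemporaryFileUploadHandler",
]

# Hypothetical path: the directory where temporary upload files are written.
FILE_UPLOAD_TEMP_DIR = "/var/tmp/django_uploads"
```

As far as I know, these temporary files are still removed once they are closed at the end of the request, so any long-running processing needs its own copy of the data.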

In either case, you should not be bulk uploading images in this way anyway. Apache/Django is not designed for it. Try uploading a single product/image per request/response, and all your problems will go away.

Thomas
  • +0.5 for pretty bold; +0.5 for me being drunk. Seems like an educated answer, too - but I can only offer 1 point. – dokkaebi Mar 04 '12 at 05:46
  • @Thomas: It seems that your answer is very close to, or really is, the actual cause. When PIL tries to read the file, it's a 0KB file, probably due to one of the causes you described. Do you think changing the temporary storage to disk could help? Thanks for your answer. – momenezes Mar 04 '12 at 19:09
  • @momenezes Is it really necessary to upload all pictures in one request? I don't know of any way to do disk-based upload in Django. – Thomas Mar 05 '12 at 05:56
  • @Thomas We're thinking of refactoring the upload of images to take place one step before the spreadsheet processing. The user would upload all images to a folder on the server (e.g., in batches of 40 images) and then upload the spreadsheet. Anyway, thanks again for your comments and time. Best regards – momenezes Mar 05 '12 at 20:55
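
The refactoring described in the last comment could start with a check like the one below, run before any product is created. This is only a sketch: find_missing_images, the row dictionaries, and the "image" column name are hypothetical and would need adapting to the real spreadsheet layout.

```python
import os


def find_missing_images(rows, image_dir, column="image"):
    """Return the filenames named in rows that are absent from image_dir."""
    available = set(os.listdir(image_dir))
    missing = []
    for row in rows:
        filename = row.get(column)
        # Rows without an image filename are skipped, not reported.
        if filename and filename not in available:
            missing.append(filename)
    return missing
```

If the returned list is not empty, the spreadsheet can be rejected with a message naming the missing files, so the user uploads another batch before the products are saved.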