3

I'm new to django and try to realize an project that allows the user to upload a file, parses it and enters the contained information into the same model:

class Track(models.Model):  
    Gpxfile  = models.FileField("GPS XML", upload_to="tracks/gps/")
    date=models.DateTimeField(blank=True)
    waypoints = models.ForeignKey(Waypoint)
    ...

For the start I'm ok to work with the admin interface and to save work. So I hooked into the models save() method:

def save(self, *args, **kwargs):
    """we hook to analyse the XML files"""
    super(Track, self).save(*args, **kwargs) #get the GPX file saved first
    self.__parseGPSfile(self.Gpsxmlfile.path) #then analyse it

But here I run into problems due to the dependency:

  • to get the filefield saved to a real file, I need to invoke the original save() first
    • this breaks as some fields aren't populated yet as I didn't
  • if I switch both lines, the file isn't saved yet and can't be parsed

Maybe I have just a lack of basic knowledge but even after reading a lot of SOs, blogs and googeling around, I don't have a clear idea how to solve it. I just found this ideas that don't seem to fit very well:

  • create your own view and connect it to your handler (see here)
    • bad as this won't work with admin interface, needs views, puts logic to views etc...
  • use a validator for the filefield (see here)
    • not sure if this is a good design, as it's about file processing in general and not realy validating (yet)

So what does the community suggest to realize a file postprocessing and data "import" in Django 1.4?

MaM
  • 2,043
  • 10
  • 20
  • Why do you need to save the file? It can be parsed in memory and I think that is exactly your parsing function does (open the file, load it to memory and, finally, parse it). – Cartucho Apr 09 '14 at 12:56
  • Later people should download the file, but this is just the plan in long run. Can you please provide a snippet on how you would gain access to the filestream and that works also on bigger files (~5MB)? – MaM Apr 09 '14 at 13:25

1 Answers1

4

You can parse the file prior to saving, and generally I like to do this in the model clean() method:

def clean(self):
    file_contents = self.Gpxfile.read()
    ...do stuff

If the file doesn't meet your validation criteria, you can raise a ValidationError in clean which will propagate back to the calling view so you can report the form errors back to the user.

If you really need to save the file first and then do something, you could use a post_save signal

def some_function_not_in_the_model(sender, **kwargs):
    obj = kwargs['instance']
    ...do stuff with the object

# connect function to post_save
post_save.connect(some_function_not_in_the_model, sender=Track)

Django docs on post_save

Finally, one note about large files is that they may end up as temporary files on the server (in Linux /var/tmp or similar...this can be set in settings.py). It may be a good idea to check this within the clean() method when trying to access the file, something like:

# check if file is temporary
if hasattr(self.Gpxfile.file, 'temporary_file_path'):
    try:
        file_path = self.Gpxfile.file.temporary_file_path(),
    except:
        raise ValidationError(
            "Something bad happened"
        )
else:
    contents = self.Gpxfile.read()

Oh, and finally finally, be careful about closing the temporary file. Back when I started using Django's FileField and learned how the temporary file worked, I thought I'd be the good programmer and close the file after I was done using it. This causes problems as Django will do this internally. Likewise, if you open the temporary file and raise a ValidationError, you will likely want to remove (unlink) the temp file to prevent them from accumulating in the temp directory.

Hope this helps!

Fiver
  • 9,909
  • 9
  • 43
  • 63
  • Thank you for the code and comparing the approaches, that helped a lot :) Unfortunatly, if I use the clean() idea, the self object hasn't an ID yet, so I can't pass it to objects that I create within my procedure (here: waypoints, see reversed Foreignkey in 1st posting). Is there a good way to solve it? – MaM Apr 10 '14 at 10:11
  • So my current solution is to encapsulate all the file stuff in a master class (GPXfileset) that makes use of the save() hook, and parses the file. So it has an ID already that can be passed to the created objects, but it doesn't break the null constraines as it doesn't has any fields that get passed empty. Anyway, thanks for all the help! – MaM Apr 10 '14 at 10:54
  • Glad to help, one method I've used before is to create a new attribute in the parent instance inside save (like an array of the dependent child model instances). Then, use the post_save signal to parse and save the children. – Fiver Apr 10 '14 at 13:10