The upload_to parameter of FileField accepts a callable, and the string it returns is joined to your MEDIA_ROOT setting to get the final filename (from the documentation):
This may also be a callable, such as a function, which will be called to obtain the upload path, including the filename. This callable must be able to accept two arguments, and return a Unix-style path (with forward slashes) to be passed along to the storage system. The two arguments that will be passed are:
instance: An instance of the model where the FileField is defined. More specifically, this is the particular instance where the current file is being attached. In most cases, this object will not have been saved to the database yet, so if it uses the default AutoField, it might not yet have a value for its primary key field.
filename: The filename that was originally given to the file. This may or may not be taken into account when determining the final destination path.
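To make the two-argument contract concrete, here is a minimal sketch of such a callable. The function name owner_upload_path and the owner_id attribute are hypothetical, chosen only for illustration, and a SimpleNamespace stands in for a real model instance so the snippet runs without Django. It deliberately avoids the primary key, since that may not be set yet.

```python
import posixpath
from types import SimpleNamespace


def owner_upload_path(instance, filename):
    """Return a Unix-style relative path such as "owner_7/photo.png"."""
    # posixpath.join always uses forward slashes, regardless of the host OS.
    return posixpath.join("owner_{0}".format(instance.owner_id), filename)


# Stand-in for a model instance, used here only to exercise the function.
fake_instance = SimpleNamespace(owner_id=7)
print(owner_upload_path(fake_instance, "photo.png"))  # owner_7/photo.png
```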
Additionally, when you access model.my_file_field, it resolves to an instance of FieldFile, which acts like a file. So, you should be able to write an upload_to like the following:
    import os

    def hash_upload(instance, filename):
        instance.my_file.open()  # make sure we're at the beginning of the file
        contents = instance.my_file.read()  # get the contents
        fname, ext = os.path.splitext(filename)
        # hash_function is a placeholder; substitute the hash you want to use
        return "{0}_{1}{2}".format(fname, hash_function(contents), ext)  # assemble the filename
Substitute the appropriate hash function you'd like to use. Saving to the disk isn't necessary at all (in fact, the file is often already uploaded to temporary storage, or in the case of smaller files just kept in memory).
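As one possible choice for the placeholder, here is a sketch that uses SHA-256 from the standard library's hashlib. The helper hashed_name is a hypothetical name; it isolates the filename assembly so it can be tried on plain bytes without a Django project.

```python
import hashlib
import os


def hash_function(contents):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(contents).hexdigest()


def hashed_name(filename, contents):
    """Build "<stem>_<digest><ext>" from the original name and the file bytes."""
    fname, ext = os.path.splitext(filename)
    return "{0}_{1}{2}".format(fname, hash_function(contents), ext)
```

Any other hashlib algorithm would slot in the same way. SHA-256 is used here only because it is widely available, not because the approach requires it.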
You'd use hash_upload like this:
    class MyModel(models.Model):
        my_file = models.FileField(upload_to=hash_upload, ...)
I haven't tested this yet, so you might have to poke at the line that reads the whole file. You may also want to hash only the first chunk of the file, so that a malicious user can't tie up your server (a DoS attack) by making it read a massive upload in full. You can get the first chunk with instance.my_file.read(instance.my_file.DEFAULT_CHUNK_SIZE).
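Here is a rough sketch of the first-chunk idea, written against any binary file-like object so it can be run without Django. The name hash_first_chunk is hypothetical, and CHUNK_SIZE is an assumption meant to mirror Django's File.DEFAULT_CHUNK_SIZE (64 KiB).

```python
import hashlib
import io

# Assumption: mirrors the value of Django's File.DEFAULT_CHUNK_SIZE (64 KiB).
CHUNK_SIZE = 64 * 2 ** 10


def hash_first_chunk(fileobj, chunk_size=CHUNK_SIZE):
    """Hash at most chunk_size bytes from the start of a binary file-like object."""
    fileobj.seek(0)
    digest = hashlib.sha256(fileobj.read(chunk_size)).hexdigest()
    fileobj.seek(0)  # rewind so later readers start from the beginning
    return digest


# Two uploads that differ only after the first chunk get the same digest.
first = io.BytesIO(b"a" * CHUNK_SIZE + b"tail one")
second = io.BytesIO(b"a" * CHUNK_SIZE + b"tail two")
print(hash_first_chunk(first) == hash_first_chunk(second))  # True
```

The trade-off is visible in the usage lines: memory use is bounded, but files that share their first chunk collide, so this suits naming better than deduplication.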