2

I have a Model containing a FileField. I'd like this FileField to have a unique path.

At first, at though about using the ID of the entry, but Django move the file to it's upload_to path before saving the entry, so ID is empty.

Moreover, I can't use something like the title or any other elements of the model (except the creation date) since they can be changed by the user. And I prefer not to copy/delete a file every time a user change the title of it's entry (if I use the title as a part of my path).

Here start my research, I found these :

  • Generate a unique key and compare it to the database. While the key exist, we generate a new one (Django, unique field generation) : The problem is the potentials hits the while could do to the database before having a unique key

  • Getting the timestamp from the creation date. The problem here, is if two people add a file at the exact same time, it will generate conflicts

I'd like to have this unique ID a small as possible, max length of 7 would be great. The perfect solution would have been to have the ID of the entry. Do you know a workaround to do so (calling save() before moving files to their upload_to folder?) or if not, which implementation would be the best, based on one of my solutions or a one you think is better?

Community
  • 1
  • 1
Cyril N.
  • 38,875
  • 36
  • 142
  • 243

2 Answers2

1

Since FileField is, by default, with null=True, blank=True, a possibility is to save the model twice, first by removing the file value (=None), saving, and then, adding the the file value (that was previously stored in a temporary var), and save again.

Here is the code

# this method goes in your model
def save(self, *args, **kwargs):
    # ignoring the double save if this is an update or if there is no files
    if self.pk or not self.file:
        return super(MyModel, self).save(*args, **kwargs)

    old_file = self.file._file
    self.file = None

    super(MyModel, self).save(*args, **kwargs)

    self.file = old_file

    return super(MyModel, self).save(*args, **kwargs)

Well, of course, this will result in a double request to the database when you create a new entry, but any other solution I could come up with required to do at least one hit to the database (key based with unique constraint, key based on the creation date, etc).

Hope this helps!

Cyril N.
  • 38,875
  • 36
  • 142
  • 243
  • I am facing the same problem... Why don't you use as a unique identifier UNIX time in milliseconds? (or in nanoseconds if you think you will have a conflict of exact same time submission?) $ date +%s%3N will give you the unix time in milliseconds (nano seconds truncated to the last 3 significant digits). Also, this post could be of help: http://stackoverflow.com/questions/1259219/django-datefield-to-unix-timestamp – pebox11 Jun 13 '15 at 12:24
  • Now that I rediscover this question, I'd answer myself by suggesting to override the save method from Django by saving first, then moving the file, which would have solved my issue easily and properly ;) – Cyril N. Jun 15 '15 at 07:02
0

The first option you gave would be a great choice if you weren't trying to minimize the size of the field. If you have an int64 ID that goes up to 2^64 - 1 (64 bit integer), you'd have to have more than a trillion entries (~ 2^40) in your DB before you could start worrying about the while making more than 1 query. The second is also really unlikely to happen if you are using datetime and you could prevent that with the same while loop by manipulating the date if it collided.

One option is to have a unique ID generator. You can store it in your DB and when you are going to create the FileField, you make one query to fetch the last id generated, increment one and save it again to the db.

Another option is implementing the FileField by hand. Instead of using FileField, use something like a CharField and manipulate the file cohersions and paths yourself.

Piva
  • 982
  • 6
  • 13
  • Sorry I forgot to mention the path will be used in the urls. That's why I'd like to have the smallest size in the unique ID. I think I'm gonna use something based on the date and reducing it (like in hexa, or fully a-z : I take the next two value, if there are > 26, I take just one. I convert them in their alphabet equivalent, and continue until the end. What do you think? – Cyril N. Feb 11 '11 at 08:36