I've done quite a bit of checking up on other questions and I'm still uncertain on the issue.
Here's my usage case:
I have an online shopping cart. Occassionaly, certain clients find the ordering process either too tedious, or there are some clients where an online order will not cut it, and they need an actual PDF estimate (quote) in order to purchase a product.
So I coded in a module that takes the shopping cart contents, and lays out neatly as a PDF estimate.
Now because this process only uses the cart contents, and nothing else is used, not even the database, I have to create a unique Estimate document number, so that should the client pay for the quote, they have a reference to use in their payment instruction.
The shopping cart currently generates a 5 digit cart ID, unique to each customer based on their session. I've taken this 5 digit cart ID, and I've then added UNIX time to it, which gives me a nice long number to use as the Estimate document number.
So I end up with something like this: 363821482812537 [36382 is the cart ID and 1482812537 is unix time at the time the PDF estimate was generated]
The problem with this is that it is too long, and WILL be an issue as bank payment references are limited. Ideally, I'd like to keep it to 10 characters or less.
I've decided to look at CRC32 to shorten the generated estimate numbers, and it seems capable of shortening the estimate number to an acceptable amount of characters.
But, can anyone shed some light on what kind of collision I might be up against?
Few things to consider:
Cart ID will always be 5 digits.
Unix time will always be 10 digits up until the year 2286.
[So we will always end up with 15 digits that needs to be encoded, and no more]
There is a safeguard in place, that if by some chance, a duplicate occurs, an error is thrown, and the the option is provided to retry and generate the estimate. This is done by the estimate saving to a filename matching the estimate number (or in this case, the CRC32 hash of the estimate number) - and then checking first to see if a filename with the hash exists.
Customers will for the moment not be allowed to generate estimates themselves, for reasons not important to my question. So it will only be admins who can generate estimates.
My concern is simple, will I find myself running into collisions very often with my 15 digit to CRC32 hash encoding, or is it going to be pretty rare to run into collisions?