
We are looking at creating promotional codes to send to customers. We have been told that each code sent HAS TO BE UNIQUE, 5 characters long, and alphanumeric.

I thought of doing a hash of a concatenated string and taking the first 5 characters of the hash, but there is a good chance that the same 5 characters will come up again and again.

Can anyone give me any pointers on creating a 5-character alphanumeric string that is unique EVERY TIME?

Richard Gale
  • Make a text file and append each code to it when it is made, then each time one is made keep generating them until it's a unique one that isn't in the file – Alfie Goodacre Apr 14 '16 at 14:17
  • You have to store the existing somewhere and compare each created with all of them – fubo Apr 14 '16 at 14:19
  • That doesn't sound like a very clean solution, Alfie. It could take a very long time to loop through to ensure uniqueness, and the file is only going to keep getting bigger with each code sent. – Richard Gale Apr 14 '16 at 14:19
  • @fubo, we have multiple stores, there could be an instance where the code is sent out simultaneously, so could still leave a margin for duplication (very small margin I know) – Richard Gale Apr 14 '16 at 14:20
  • This is basic. Use hexadecimal, 0-F for each digit. Store the last created one in the database. When you are generating a new number, simply iterate by 1 and store that. Access and update this number using the singleton pattern to guarantee it can only be done once at a time. But even then, with only 5 digits, you could run out fairly quickly depending on usage. – user1666620 Apr 14 '16 at 14:20
  • The easiest way you can guarantee that it's absolutely unique is to keep a log of which ones you've used. Generate a new key and check to see if it's already been used. Depending on how often you create these keys, you could use the current year as the first char, month as the second, day as the third, hour as the fourth and minute as the fifth? – Scottie Apr 14 '16 at 14:21
  • @Scottie. Thanks. These codes will be generated ad-hoc. I see where you are coming from with the year, month day etc, but could still have overlap with other branches and customers. – Richard Gale Apr 14 '16 at 14:23
  • Is there no way in C# to generate a unique string? – Richard Gale Apr 14 '16 at 14:23
  • @Richard.Gale You might need to approach management and tell them the limitations of your setup and that "absolute certainty of 100% unique is possible, but it will take a LOT of money and hardware. Instead, can we get by with 99% certainty?" – Scottie Apr 14 '16 at 14:23
  • First figure out a way to generate unique integers over all stores -- whether you do that by partitioning the values or using a central server is up to you. Once you have that, it's a matter of converting the integer to an alphanumeric code -- the first problem is the harder one. – Jeroen Mostert Apr 14 '16 at 14:23
  • @Richard.Gale if you want unique across multiple locations, either the code is generated by a single service shared by all the stores, or you allocate each a range of values they can choose from. – user1666620 Apr 14 '16 at 14:24
  • All of the codes are generated on the fly by the same system, my issue is with database calls and loops, that 2 simultaneous clicks could produce the same code - which I do not want – Richard Gale Apr 14 '16 at 14:27
  • @Richard.Gale you have heard of the singleton pattern before, right? that was developed for this exact scenario. you have your service, which gets hit simultaneously. However it only does one operation at a time, forcing everyone to wait their turn. – user1666620 Apr 14 '16 at 14:29
  • With database calls you're fine. Databases are designed to avoid problems like this, that's why transactions exist. A table with an `IDENTITY` column will never generate the same identity twice, for instance. Map the identity to an alphanumeric value and you're good to go. – Jeroen Mostert Apr 14 '16 at 14:29
  • Also, make sure your problem actually has a solution before you set out to create it. If you can use 26 characters + 10 digits (already problematic, since people may complain 1 and i are confusable, 0 and o, etcetera) this still leaves you with no more than 36^5 = 60 466 176 codes. Is that enough? For all time? For a month? For a week? For how many stores, customers, actions? When do you recycle codes and how? – Jeroen Mostert Apr 14 '16 at 14:30
  • @Richard.Gale: Is it possible to generate a list of sequential values, say 1,000,000 of them, store off in table, then have the system random select one of them and mark it "used"? Each next query would be limited to those not used, and get a new random number, etc. – Cᴏʀʏ Apr 14 '16 at 14:30
  • @Cᴏʀʏ why random? why not just iterate? – user1666620 Apr 14 '16 at 14:31
  • @user1666620: Just to give the illusion of randomness to those receiving the codes -- to make them less guessable. Maybe the system even generates 10,000,000 codes, marks 1,000,000 of them as "usable" at random, then selects from those that are not already taken. – Cᴏʀʏ Apr 14 '16 at 14:34
  • @Cᴏʀʏ the OP didn't say anything about random, just uniqueness. – user1666620 Apr 14 '16 at 14:36
  • I agree with user1666620. You can keep a number on the server that you increment at each request for an ID, then convert it to base 36 (using the digits 0-9 and the letters A-Z), padded to 5 characters. Example: 00000, ..., AAAAB – Jean-Claude Colette Apr 14 '16 at 18:22
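
The counter-plus-base-36 idea suggested in several of the comments above could look roughly like the sketch below. The class name `SequentialCodeGenerator` and the in-memory counter are assumptions made for illustration; a real system would persist the counter (for example as a database `IDENTITY` value, as mentioned above) so that it survives restarts and is shared by all stores.

```csharp
using System;
using System.Threading;

// Hypothetical helper: maps a unique integer to a fixed-width base-36 code.
public static class SequentialCodeGenerator
{
    private const string Alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const int CodeLength = 5;
    private const long Capacity = 60466176; // 36^5 distinct codes

    // Assumption: an in-memory counter stands in for a persisted one.
    private static long _last = -1;

    // Convert a value in [0, Capacity) to a zero-padded 5-character code.
    public static string Encode(long value)
    {
        if (value < 0 || value >= Capacity)
            throw new ArgumentOutOfRangeException(nameof(value));

        var chars = new char[CodeLength];
        for (int i = CodeLength - 1; i >= 0; i--)
        {
            chars[i] = Alphabet[(int)(value % Alphabet.Length)];
            value /= Alphabet.Length;
        }
        return new string(chars);
    }

    // Interlocked.Increment is atomic, so two simultaneous callers
    // can never receive the same counter value.
    public static string Next()
    {
        return Encode(Interlocked.Increment(ref _last));
    }
}
```

Because every counter value maps to exactly one code, uniqueness holds until all 60,466,176 values are used up, at which point `Encode` throws rather than wrapping around. The codes are sequential, so they are also easy to guess.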

4 Answers


As I mentioned in the comments of my other answer, it may not be sufficient for your purposes. I worked up some more code which generates a string of random alpha-numeric characters. This time, they aren't limited to 0-9 and A-F -- i.e. the hexadecimal equivalents of randomly-generated nibbles. Instead, they are comprised of the full range of alpha-numeric characters, at least with upper-case letters. This should sufficiently increase the potential for uniqueness given we're going from 16 possible characters with hex to 36 possible characters with the full alphabet and 0-9.

Still, when I ran it over 10,000,000 tries, there were plenty of dups. It's just the nature of the beast: the likelihood that you'll get dups with such a short string is fairly high. At any rate, here it is. You can play around with it. If your client doesn't mind lowercase letters -- e.g. if "RORYAP" is different from "RoryAp" -- then that would even further increase the likelihood of uniqueness.

using System;
using System.Security.Cryptography;
using System.Text;

/// <summary>
/// Instances of this class are used to generate alpha-numeric strings.
/// </summary>
public sealed class AlphaNumericStringGenerator
{
    /// <summary>
    /// The synchronization lock.
    /// </summary>
    private object _lock = new object();

    /// <summary>
    /// The cryptographically-strong random number generator.
    /// </summary>
    private RNGCryptoServiceProvider _crypto = new RNGCryptoServiceProvider();

    /// <summary>
    /// Construct a new instance of this class.
    /// </summary>
    public AlphaNumericStringGenerator()
    {
        //Nothing to do here.
    }

    /// <summary>
    /// Return a string of the provided length comprised of only uppercase alpha-numeric characters each of which are
    /// selected randomly.
    /// </summary>
    /// <param name="ofLength">The length of the string which will be returned.</param>
    /// <returns>Return a string of the provided length comprised of only uppercase alpha-numeric characters each of which are
    /// selected randomly.</returns>
    public string GetRandomUppercaseAlphaNumericValue(int ofLength)
    {
        lock (_lock)
        {
            var builder = new StringBuilder();

            for (int i = 1; i <= ofLength; i++)
            {
                builder.Append(GetRandomUppercaseAphanumericCharacter());
            }

            return builder.ToString();
        }
    }

    /// <summary>
    /// Return a randomly-generated uppercase alpha-numeric character (A-Z or 0-9).
    /// </summary>
    /// <returns>Return a randomly-generated uppercase alpha-numeric character (A-Z or 0-9).</returns>
    private char GetRandomUppercaseAphanumericCharacter()
    {
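        // The 36 characters a code may contain.
        var possibleAlphaNumericValues =
            new char[]{'A','B','C','D','E','F','G','H','I','J','K','L',
            'M','N','O','P','Q','R','S','T','U','V','W','X','Y',
            'Z','0','1','2','3','4','5','6','7','8','9'};

        // GetRandomInteger never returns its upper bound, so pass Length (not Length - 1);
        // otherwise the last character ('9') would never be selected.
        return possibleAlphaNumericValues[GetRandomInteger(0, possibleAlphaNumericValues.Length)];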
    }

    /// <summary>
    /// Return a random integer greater than or equal to a lower bound and strictly less than an upper bound.
    /// </summary>
    /// <param name="lowerBound">The inclusive lower bound of the random integer that will be returned.</param>
    /// <param name="upperBound">The exclusive upper bound of the random integer that will be returned.</param>
    /// <returns>A random integer in the range [lowerBound, upperBound).</returns>
    private int GetRandomInteger(int lowerBound, int upperBound)
    {
        uint scale = uint.MaxValue;

        // uint.MaxValue itself is rejected so that the fraction computed below is always
        // strictly less than 1; loop until something less than max is found.
        while (scale == uint.MaxValue)
        {
            byte[] fourBytes = new byte[4];
            _crypto.GetBytes(fourBytes); // Get four random bytes.
            scale = BitConverter.ToUInt32(fourBytes, 0); // Convert that into an uint.
        }

        var scaledPercentageOfMax = (scale / (double) uint.MaxValue); // get a value which is the percentage value where scale lies between a uint's min (0) and max value.
        var range = upperBound - lowerBound;
        var scaledRange = range * scaledPercentageOfMax; // scale the range based on the percentage value
        return (int) (lowerBound + scaledRange);
    }
}
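
To put a rough number on "plenty of dups": assuming every code is drawn uniformly and independently from the $N = 36^5$ possible values, the standard birthday-problem estimate gives the expected number of colliding pairs among $n$ generated codes. This is a back-of-the-envelope estimate under that assumption, not a measurement of the code above.

```latex
\begin{align*}
N &= 36^5 = 60\,466\,176 \\
\mathbb{E}[\text{colliding pairs}] &= \binom{n}{2}\frac{1}{N} \approx \frac{n^2}{2N} \\
n = 10^7 &\;\Rightarrow\; \frac{(10^7)^2}{2 \cdot 60\,466\,176} \approx 8.3 \times 10^5 \\
P(\text{at least one duplicate}) &\approx 1 - e^{-n^2/(2N)} \\
P \approx \tfrac{1}{2} &\;\Rightarrow\; n \approx \sqrt{2 \ln 2 \cdot N} \approx 1.177 \cdot 7776 \approx 9.2 \times 10^3
\end{align*}
```

In other words, with purely random 5-character codes a first duplicate becomes more likely than not after only about nine thousand codes, which is why random generation alone cannot meet a strict uniqueness requirement.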
rory.ap

I came up with this a while back which might serve what you're looking for.

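/// <summary>
/// Return a string of random hexadecimal characters which is relatively unique. Each random byte
/// converts to two characters, so the 5-byte array below yields a 10-character string.
/// </summary>
/// <returns>A 10-character uppercase hexadecimal string.</returns>
/// <remarks>In testing with a 6-byte array (12 characters), result was unique for at least 10,000,000 values obtained in a loop.</remarks>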
public static string GetShortID()
{
    var crypto = new System.Security.Cryptography.RNGCryptoServiceProvider();
    var bytes = new byte[5];
    crypto.GetBytes(bytes); // get an array of random bytes.      
    return BitConverter.ToString(bytes).Replace("-", string.Empty); // convert array to hex values.
}

I understand your requirement is that it "must" be unique, but remember, uniqueness is at best a relative concept. Even our old friend the GUID is not truly unique:

...the probability of the same number being generated randomly twice is negligible

If I recall correctly, I found my code wasn't 100% unique with 5 characters over many, many iterations (hundreds of thousands or possibly low-millions -- I don't recall exactly), but in testing with 6, the result was unique for at least 10,000,000 values obtained in a loop.

You can test it yourself at a shorter length and determine if it's unique enough for your purposes. Just change the size of the byte array, keeping in mind that each byte becomes two hexadecimal characters in the result.

Addendum: Some of the others have reminded me that you might need to consider thread safety. Here's a modified approach:

private static object _lock = new object();

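/// <summary>
/// Return a string of random hexadecimal characters which is relatively unique. Each random byte
/// converts to two characters, so the 5-byte array below yields a 10-character string.
/// </summary>
/// <returns>A 10-character uppercase hexadecimal string.</returns>
/// <remarks>In testing with a 6-byte array (12 characters), result was unique for at least 10,000,000 values obtained in a loop.</remarks>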
public static string GetShortID()
{
    lock(_lock)
    {
        var crypto = new System.Security.Cryptography.RNGCryptoServiceProvider();
        var bytes = new byte[5];
        crypto.GetBytes(bytes); // get an array of random bytes.      
        return BitConverter.ToString(bytes).Replace("-", string.Empty); // convert array to hex values.
    }
}
rory.ap
  • Thank you @roryap. I was looking for a code-based solution which didn't involve loads of loops and/or database calls, but yes, I need to be able to guarantee uniqueness every time, or the exercise is pointless – Richard Gale Apr 14 '16 at 14:25
  • @Richard.Gale This doesn't involve loops or database calls. That's just for testing. This stands alone by itself. It's like a "short guid". – rory.ap Apr 14 '16 at 14:25
  • It's random, but not guaranteed unique... It's possible, however unlikely, to have a duplicate. – Scottie Apr 14 '16 at 14:26
  • @roryap you even said it yourself: "it wasn't all that unique with 5 characters" – Victor Sand Apr 14 '16 at 14:26
  • @Scottie -- like I said, I tested it for 10 million times for length 6, and it was unique. I forget what the results were for 5. Uniqueness is all relative. Even a GUID isn't truly unique. – rory.ap Apr 14 '16 at 14:27
  • @VictorSand -- I forget exactly. It could be perfectly unique. How unique does the OP need it. Even a GUID isn't truly unique. It might have been the case that it was unique for 1 million values. I don't recall. – rory.ap Apr 14 '16 at 14:28
  • @roryap I totally agree, and I think asking for 100% absolute unique is an unrealistic request and that OP should tell management so. However, he did ask for 100% absolute certain unique, not 1:10 million unique. – Scottie Apr 14 '16 at 14:28
  • @roryap. Thanks for your input, and yes, I see it doesn't call the database or loop, which is more efficient. There could be scope to move to a 6-character rather than a 5-character code, so your solution could be a decent one. – Richard Gale Apr 14 '16 at 14:29
  • Sure, it's very unlikely to find duplicates, but "random" is not what the OP asked for. – Victor Sand Apr 14 '16 at 14:29
  • @VictorSand -- I understand your point but I don't think my answer deserves a down vote. It's perfectly viable in my opinion. – rory.ap Apr 14 '16 at 14:29
  • @VictorSand -- who cares about random? that's just a characteristic of how the value is produced. – rory.ap Apr 14 '16 at 14:30
  • @VictorSand -- I think you should reconsider your down vote, assuming it was yours. I'm adamant that this is a workable and clever solution. – rory.ap Apr 14 '16 at 14:30
  • @Richard.Gale -- You could test it with length 5, again, I don't recall. It might be sufficiently unique. – rory.ap Apr 14 '16 at 14:31
  • I agree (given the discussion afterwards). The downvote was because you directly contradicted OPs specific request. Maybe you could word it a little differently (or argue why this is good enough). I'll remove the downvote! – Victor Sand Apr 14 '16 at 14:31
  • @roryap Many thanks for your solution, I liked the look of it when you posted, and I think that although it doesn't completely guarantee uniqueness, it fits in with the way we use our systems and give a high enough improbability of re-generating a repeat code for us to run with it. Thanks again. – Richard Gale Apr 14 '16 at 15:01
  • @Richard.Gale Actually I'm just discovering that I've been misleading myself and you also. I've been testing it, and it turns out that because each byte represents two characters when converted, the byte array of length 6 returns a string of length 12 (5 would be 10). I tried it with a byte array of length 3 which returned a string of length 6, and I found it was nowhere near unique over 10M tries. In my opinion, uniqueness with that short a length -- at least using this approach -- is just not a reasonable expectation. I'll keep at it though... – rory.ap Apr 14 '16 at 15:19
  • @Richard.Gale -- see the other answer I just added. You might find it more suitable in light of my last comment here. – rory.ap Apr 14 '16 at 16:20
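
The byte-versus-character mix-up described in the comments above can be avoided by asking for a character count and deriving the byte count from it. The following is a minimal sketch under that assumption; the `HexCode` class and its `Create` method are hypothetical names, not part of the answer's code.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: returns exactly charCount random hexadecimal characters.
public static class HexCode
{
    public static string Create(int charCount)
    {
        if (charCount < 1)
            throw new ArgumentOutOfRangeException(nameof(charCount));

        // Each byte becomes two hex characters, so round the byte count up...
        var bytes = new byte[(charCount + 1) / 2];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }

        // ...and trim the extra character when charCount is odd.
        string hex = BitConverter.ToString(bytes).Replace("-", string.Empty);
        return hex.Substring(0, charCount);
    }
}
```

`HexCode.Create(5)` returns a 5-character string, but only 16^5 = 1,048,576 distinct values exist at that length, so duplicates remain likely over many calls.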

I would create a table (PromoCode) with all possible 5-character strings. Then create another table, CampaignPromoCode, with two fields:

Code varchar(5), CampaignId uniqueidentifier

This way you can keep track of used ones for each campaign. To get a random unused promocode use this statement:

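select top 1 pc.Code
from PromoCode pc
    left join CampaignPromoCode cpc
        on cpc.Code = pc.Code
       and cpc.CampaignId = 'campaign-id-goes-here' -- filter in the join so unused codes keep a null match
where cpc.Code is null -- keep only codes not yet used by this campaign
order by newid()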
Dmitri Trofimov
  • 2 separate users accessing that table at the same time can still come back with the same code. – Richard Gale Apr 14 '16 at 14:32
  • Yes, but you can write your method as a singleton so that only 1 user at a time can use it. – Scottie Apr 14 '16 at 14:35
  • @Richard.Gale: Concurrent access can be prevented. The bigger issue is that this table would need to perform well with over 60,000,000 records. – Cᴏʀʏ Apr 14 '16 at 14:35
  • To prevent concurrency issues you may use the Singleton pattern, or create a stored procedure that would lock the table, get the next code, and give it back to you. This has a good example: http://stackoverflow.com/questions/3662766/sql-server-how-to-lock-a-table-until-a-stored-procedure-finishes. The PromoCode table doesn't have to be that big; you can shrink it, because I really doubt that any campaign will use all 60,466,176 codes. – Dmitri Trofimov Apr 14 '16 at 14:43
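
The "pool of unused codes, one caller at a time" approach discussed above can be mimicked in-process with a lock, as in the sketch below. The `CodePool` class is a hypothetical illustration: it keeps the pool in memory, whereas the answer keeps it in a database table, and a full 5-character pool (over 60 million strings) would be too large to hold this way in practice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the PromoCode table plus a table lock.
public sealed class CodePool
{
    private const string Alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private readonly object _lock = new object();
    private readonly List<string> _unused = new List<string>();
    private readonly Random _random = new Random();

    // Pre-generate every code of the given length (1 to 5 characters).
    public CodePool(int length)
    {
        if (length < 1 || length > 5)
            throw new ArgumentOutOfRangeException(nameof(length));

        int total = 1;
        for (int i = 0; i < length; i++) total *= Alphabet.Length;

        for (int n = 0; n < total; n++)
        {
            var chars = new char[length];
            int value = n;
            for (int i = length - 1; i >= 0; i--)
            {
                chars[i] = Alphabet[value % Alphabet.Length];
                value /= Alphabet.Length;
            }
            _unused.Add(new string(chars));
        }
    }

    // Hand out a random unused code. The lock ensures two simultaneous
    // callers can never be given the same one.
    public string Take()
    {
        lock (_lock)
        {
            if (_unused.Count == 0)
                throw new InvalidOperationException("All codes have been issued.");

            int index = _random.Next(_unused.Count);
            string code = _unused[index];

            // Swap the chosen code with the last entry so removal is O(1).
            _unused[index] = _unused[_unused.Count - 1];
            _unused.RemoveAt(_unused.Count - 1);
            return code;
        }
    }
}
```

A code can only be taken once because it is removed from the pool inside the same lock that selected it, which is the same guarantee a locked stored procedure would provide at the database level.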

Bottom line here is that you should probably go back to management and tell them that requiring 100% absolute uniqueness is cost-prohibitive and that you can get 99.9999% uniqueness for a fraction of the cost. Then use roryap's code to generate a random, mostly unique code.

Scottie