1

I need to generate a unique 5-character code from a given number. In other words, I need to encode a natural number as a unique code of length 5.

I want to give customers a fixed-length, memorable code, keep the sequential number in the database, and encode or decode as needed.

The given number can be in the range 1 to 9999999, but the result must always be 5 characters long.

For example:

1 => a56er or 2 => c7gh4

Uniqueness is important. I googled a lot and couldn't find a solution.

2 Answers

6

The given number can be in the range of 1 to 9999999

Right. So you need to encode 24 bits of information (9,999,999 is less than 2^24 = 16,777,216), and you have 5 characters in which to do that - so you need 5 bits per character, which gives 32^5 = 33,554,432 possible codes. That's pleasantly in the range of "only digits and lower case ASCII characters" and you can even remove points of confusion like "o/0" and "i/1".

Note that this isn't in any way "secure" - it's entirely predictable and reversible. If you don't want customers being able to reverse engineer their sequence number from the encoded form, it won't work. But it's a simple way of encoding a number as a fixed-length string.

Sample code showing encoding and decoding:

using System;

public class Test
{
    static void Main()
    {
        EncodeDecode(10);
        EncodeDecode(100);
        EncodeDecode(1000);
        EncodeDecode(10000);
        EncodeDecode(100000);
        EncodeDecode(1000000);
        EncodeDecode(9999999);
        
        void EncodeDecode(int number)
        {
            string encoded = EncodeBase32(number);
            int decoded = DecodeBase32(encoded);
            Console.WriteLine($"{number} => {encoded} => {decoded}");
        }
    }
    
    private const string Base32Alphabet = 
        "23456789abcdefghjklmnpqrstuvwxyz";
    private static string EncodeBase32(int number)
    {
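        // Five base-32 characters hold 25 bits, so only 0..33,554,431 can round-trip.
        if (number < 0 || number >= (1 << 25))
        {
            throw new ArgumentOutOfRangeException(nameof(number));
        }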
        char[] chars = new char[5];
        for (int i = 0; i < 5; i++)
        {
            chars[i] = Base32Alphabet[number & 0x1f];
            number = number >> 5;
        }
        return new string(chars);
    }
    
    private static int DecodeBase32(string text)
    {
        if (text.Length != 5)
        {
            throw new ArgumentException("Invalid input: wrong length");
        }
        int number = 0;
        for (int i = 4; i >= 0; i--)
        {
            number = number << 5;
            int index = Base32Alphabet.IndexOf(text[i]);
            if (index == -1)
            {
                throw new ArgumentException("Invalid input: invalid character");
            }
            number |= index;
        }
        return number;
    }
}

Output:

10 => c2222 => 10
100 => 65222 => 100
1000 => az222 => 1000
10000 => jsb22 => 10000
100000 => 2p352 => 100000
1000000 => 2ljy2 => 1000000
9999999 => zm7kb => 9999999
Jon Skeet
  • Thank you, it's fine and solves my problem. But is there a way to make it random? – Aliasghar Ahmadpour Dec 10 '20 at 08:58
  • @AliasgharAhmadpour: Making it random and generating it from the input number are contradictory. It could be *pseudo-random* (so still predictable from the input), but you'd need to give more details about exactly what your requirements are in order to provide an answer for that. An alternative is to generate a *truly* random string (as random as you can, not based on any input) that you store in the database, with appropriate safeguards against duplicates, but that's not the question you asked. – Jon Skeet Dec 10 '20 at 09:15
  • Again, thanks for your attention. The given solution solved my problem, but I want to know whether there is any way to make it non-sequential. The generated code is sequential: 0 => 22222, 1 => 32222, 2 => 42222 – Aliasghar Ahmadpour Dec 12 '20 at 05:06
  • @AliasgharAhmadpour: Again, you'd need to give more details about precise requirements. I answered the question you asked - if you have other requirements, you should ask a new question with those (and make sure they're completely unambiguous). Stack Overflow is not intended to accommodate questions that evolve gradually over time - it ends up being a frustrating experience for all involved. – Jon Skeet Dec 12 '20 at 07:39
  • I need to generate a unique code for each customer from the given number. I'd prefer not to generate sequential codes, because they're predictable. – Aliasghar Ahmadpour Dec 12 '20 at 08:56
  • @AliasgharAhmadpour: As I said in the previous comment, this answer satisfies your original requirements. If you now have different requirements, you should ask a new question, with *really* clear requirements. Note that with only 5 characters, if an attacker has multiple attempts they will be able to guess at *a* valid code after relatively few attempts, if you have a significant number of customers. – Jon Skeet Dec 12 '20 at 08:58
  • Suppose this: 0 => 22222, 1 => 32222, 2 => 42222. I can easily guess the generated code for number 3. That's all. – Aliasghar Ahmadpour Dec 12 '20 at 08:58
  • 1
    I asked a new question at https://stackoverflow.com/q/65263378/9221609 – Aliasghar Ahmadpour Dec 12 '20 at 09:26
  • 1
    And unfortunately it was quickly closed as a duplicate (of this question). What can you do? – Michael Welch Dec 12 '20 at 14:23
  • The length of the alphabet is 31 here, which means you are doing `n & 31`. 31 is a prime number. Is this accidental or intentional? Or important, in the context of generating *uniqueness*? Could I use the alphabet "1234567" but not "12346"? – markson edwardson Jun 06 '21 at 04:37
  • @marksonedwardson: No, the length of the alphabet is 32, hence "base32". Sure, you can use a 6-character alphabet, but you'll need to use division and remainder rather than just bitwise-and and shifting. – Jon Skeet Jun 06 '21 at 07:05
  • Oh, right, Doh! Been reading too many different questions! – markson edwardson Jun 06 '21 at 08:58
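
The comments above ask for codes that stay unique and reversible but do not visibly count upwards. One possible approach, sketched here under stated assumptions, is to pass the number through a reversible 25-bit scramble before base-32 encoding it. The class name `CodeScrambler` and the constants `Multiplier` and `XorKey` are hypothetical values chosen for illustration; any odd multiplier is invertible modulo 2^25. This is obfuscation only, not security: anyone who learns the constants can reverse it.

```csharp
using System;

public static class CodeScrambler
{
    private const int Bits = 25;                  // 5 characters x 5 bits each
    private const uint Mask = (1u << Bits) - 1;
    private const uint Multiplier = 22695477;     // hypothetical; must be odd
    private const uint XorKey = 0x1555555;        // hypothetical; must fit in 25 bits
    private static readonly uint Inverse = ModularInverse(Multiplier);

    // Maps 0..2^25-1 onto itself one-to-one, so distinct inputs give distinct outputs.
    public static int Scramble(int number)
    {
        if (number < 0 || (uint)number > Mask)
        {
            throw new ArgumentOutOfRangeException(nameof(number));
        }
        unchecked
        {
            uint x = (uint)number;
            return (int)(((x * Multiplier) & Mask) ^ XorKey);
        }
    }

    // Reverses Scramble: undo the XOR, then multiply by the modular inverse.
    public static int Unscramble(int code)
    {
        if (code < 0 || (uint)code > Mask)
        {
            throw new ArgumentOutOfRangeException(nameof(code));
        }
        unchecked
        {
            uint y = (uint)code ^ XorKey;
            return (int)((y * Inverse) & Mask);
        }
    }

    // Newton iteration for the inverse of an odd number modulo 2^32.
    // The starting guess is correct to 3 bits, and each step doubles that.
    private static uint ModularInverse(uint odd)
    {
        uint inverse = odd;
        unchecked
        {
            for (int i = 0; i < 5; i++)
            {
                inverse *= 2u - odd * inverse;
            }
        }
        return inverse;
    }
}
```

Under these assumptions, encoding `CodeScrambler.Scramble(number)` with the answer's `EncodeBase32` and applying `CodeScrambler.Unscramble` to the result of `DecodeBase32` would keep the codes 5 characters long and unique, since the scramble is a one-to-one mapping on 25-bit values. Consecutive inputs still differ by a fixed amount modulo 2^25, so a determined observer could work out the pattern.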
0

Based on your requirements, I will first answer your question, then give you what I think is the best solution.

So, for your number in the range 1 to 9999999, you can use SHA256, MD5, or any other hashing function to generate a string, then take a Substring from a random part of it to get the code you ask for.

A simpler approach, which I have personally used, is to ignore the user input and call Guid.NewGuid(), which generates a random 16-byte value. Its string form has 32 hexadecimal characters plus four "-" separators; remove the "-" characters and take 5 characters from a random position with Substring to get the code you want.

Guid.NewGuid() 

gives you values like "a869ee3e-13b2-46ce-8c09-ff8998ab9393". Then you apply

string.Replace("-", "")

and you get "a869ee3e13b246ce8c09ff8998ab9393". Then you take a random 5-character piece of the string: do a Substring and pass as the starting point a random number between 1 and the string length minus 5 (if you want a code of length 5).

Or, to put it more simply (where x is a System.Random instance):

Guid.NewGuid().ToString().Replace("-", "").Substring(x.Next(1,27),5)

This will give the code you're asking for
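
Random codes like these are not derived from the customer's number, so uniqueness has to be enforced when each code is issued, as the comments below point out. A minimal sketch of that idea, assuming an in-memory `HashSet<string>` as a stand-in for a database column with a unique constraint; the class name `RandomCodeIssuer` and its members are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class RandomCodeIssuer
{
    // Same 32-character alphabet as the first answer (no 0/o or 1/i/l confusion).
    private const string Alphabet = "23456789abcdefghjklmnpqrstuvwxyz";
    private const int CodeLength = 5;
    private const int TotalCodes = 1 << 25; // 32^5 possible codes

    // Assumption: stands in for a database table with a unique constraint.
    private static readonly HashSet<string> IssuedCodes = new HashSet<string>();

    public static string IssueUniqueCode()
    {
        if (IssuedCodes.Count >= TotalCodes)
        {
            throw new InvalidOperationException("All 5-character codes have been issued.");
        }
        while (true)
        {
            string candidate = RandomCode();
            // Add returns false when the code was already issued, so we retry.
            if (IssuedCodes.Add(candidate))
            {
                return candidate;
            }
        }
    }

    private static string RandomCode()
    {
        byte[] bytes = new byte[CodeLength];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        char[] chars = new char[CodeLength];
        for (int i = 0; i < CodeLength; i++)
        {
            // 256 is a multiple of 32, so keeping the low 5 bits introduces no bias.
            chars[i] = Alphabet[bytes[i] & 0x1f];
        }
        return new string(chars);
    }
}
```

With this design the code cannot be decoded back to the sequential number; the mapping between the two has to be stored. Retries also become more frequent as the space of 33,554,432 codes fills up.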

Liquid Core
  • @MichaelMao No. This will give you an alphanumeric code, as asked for by the question author. If you want a numeric code, you should just use Random.Next() and possibly apply a seed for more "randomization". An approach I use for more randomness is to generate a lot of random numbers with Random.Next, each with a different seed, and choose one of them with another Random.Next. They are still pseudo-random numbers, but I have never got the same number twice; it's very unlikely. – Liquid Core Dec 10 '20 at 08:21
  • 1
    The approach here won't guarantee uniqueness. If you provide sample code, I'm happy to write a sample based on it that shows duplicates. (You could add uniqueness by retaining the generated string and keeping going until you've got a new value, but at that point it's really not an answer to the question that was asked.) – Jon Skeet Dec 10 '20 at 08:22
  • @JonSkeet Apply your example to my last code on NewGuid example. – Liquid Core Dec 10 '20 at 08:29
  • 1
    @LiquidCore: That's easy to prove that it *must* generate duplicates - because it will be 5 characters of hex, leading to a total of 16^5 = 1,048,576 possible outcomes. When the numbers are in the range 1 - 9,999,999, there *have* to be duplicates. – Jon Skeet Dec 10 '20 at 08:37
  • @JonSkeet In theory everything is possible, even balancing an elephant on a toothpick. But in my example I don't take any number as input, and getting the same 5 characters from two generated Guids, well, that has never happened to me in a lot of years. Also, I gave a quick solution to OP, but I usually take random parts of the Guid and mix them together to the desired length, even using random numbers and adding noise characters to randomize even more. I gave the fastest approach possible. Never got duplicates though – Liquid Core Dec 10 '20 at 08:39
  • I assume that if that's the range of numbers, and they're sequential, then it needs to be able to generate that number of unique strings. Sample code: https://gist.github.com/jskeet/1f21e8356eaea14da4fa31d70806c827 - over a couple of runs, that's generated about 8,950,000 duplicates each time. Even generating just 100,000 values, it's giving ~5000 duplicates. Even generating just *10,000* values, it's giving 50+ duplicates each time. So the solution as written *really* doesn't scale. – Jon Skeet Dec 10 '20 at 08:43
  • 1
    To put it in another way, you offer a solution which will give over a million different codes, while the question is about to have 10 million different codes. Therefore, with the best of luck, whenever OP will have 2 million numbers in its database, each will be repeated twice. Hence uniqueness is not respected. – Martin Verjans Dec 10 '20 at 08:43
  • (I'm not sure what tests you've been running to show "I gave the fastest approach possible. Never got duplicates tough" but it doesn't sound like you generated many values...) – Jon Skeet Dec 10 '20 at 08:44
  • @MartinVerjans Just take random bits of the string and make the code longer than 5. Duplicates never happened before. Bye – Liquid Core Dec 10 '20 at 09:07
  • @JonSkeet same as above – Liquid Core Dec 10 '20 at 09:07
  • Too lazy to write a linq that does that one line, but try yourself and see the magic happen – Liquid Core Dec 10 '20 at 09:08
  • It would indeed have to be magic to be able to cope with more than 1,048,576 possible strings while only using 5 hex digits. – Jon Skeet Dec 10 '20 at 09:13
  • @JonSkeet That's why I use strings of length 10 to 20 – Liquid Core Dec 10 '20 at 10:12
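
The duplicate counts quoted in the comments can be checked with a short experiment. The following is a minimal sketch, not the code from the linked gist: it generates codes the way the answer describes (5 hex characters from a random position in a Guid) and counts how many repeat an earlier code. The names `GuidDuplicateCounter` and `CountDuplicates` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class GuidDuplicateCounter
{
    // Generates `count` codes and returns how many were already seen earlier.
    public static int CountDuplicates(int count)
    {
        var random = new Random();
        var seen = new HashSet<string>();
        int duplicates = 0;
        for (int i = 0; i < count; i++)
        {
            // "N" formats the Guid as 32 hex digits without hyphens,
            // so valid start positions for a 5-character piece are 0..27.
            string code = Guid.NewGuid().ToString("N").Substring(random.Next(0, 28), 5);
            if (!seen.Add(code))
            {
                duplicates++;
            }
        }
        return duplicates;
    }
}
```

Calling `GuidDuplicateCounter.CountDuplicates(10000)` or `CountDuplicates(100000)` gives a different result on each run. With only 16^5 = 1,048,576 possible codes, the birthday approximation n^2 / (2 * 16^5) suggests on the order of 50 duplicates for 10,000 values and several thousand for 100,000, which is in line with the figures reported above.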