0

I'm trying to create a large array of random numbers using LINQ.

I want to generate 1,000,000 numbers ranging from 1 - 2147483647.

The following code works well for small numbers:

    int[] numbers = Enumerable.Range(1, 2147483647)
                              .OrderBy(x => rnd.Next())
                              .Take(num)
                              .ToArray();

But it throws a System.OutOfMemoryException when trying to generate a large array.

What's the best way to achieve what I am looking for?

Edit: Thanks for the help so far, I'll write why I'm doing this and my full program code:

Regarding the array, it should not contain duplicates.

I am writing a program that will iterate through all the numbers, pair them up and return the pair with the smallest difference between them, or return a list of all pairs with the smallest difference if there are several such pairs.

Full program code:

    static void Main(string[] args)
    {
        // Keep running until close
        while (true)
        {
            Console.WriteLine("Write a number:");
            Console.WriteLine(ClosestNumbers(Convert.ToInt32(Console.ReadLine())));
        }
    }

    public static string ClosestNumbers(int num)
    {
        string returnString = "\n\nRandom numbers:\n";
        returnString += "---------------------------------------\n";

        Random rnd = new Random();

        // Generate array of {num} random numbers ranging from 1 to 2147483647.
        int[] numbers = Enumerable.Range(1, 1000000)
                                  .OrderBy(x => rnd.Next(1, 2147483647))
                                  .Take(num)
                                  .ToArray();

        //returnString += string.Join(",", numbers.ToArray()) + "\n";

        // Array format: {num1, num2, difference}
        List<int[]> pairedDifferences = new List<int[]>();

        int endPoint = numbers.Length;
        int difference = 0;

        for (int i = 0; i < endPoint - 1; i++)
        {

            for (int a = i + 1; a < endPoint; a++)
            {

                if (numbers[i] > numbers[a])
                {
                    difference = numbers[i] - numbers[a];
                }
                else
                {
                    difference = numbers[a] - numbers[i];
                }

                pairedDifferences.Add(new int[] { numbers[i], numbers[a], difference });

            }

        }

        int minDiff = pairedDifferences.Min(x => x[2]);

        List<int[]> minDiffsList = pairedDifferences.Where(x => x[2] == minDiff).ToList();

        returnString += "---------------------------------------\n\n\n";
        returnString += "Smallest difference(s) found between:\n\n";            

        foreach (int[] minDiffItem in minDiffsList)
        {
            // minDiffItem[0];      // first num
            // minDiffItem[1];      // second num
            // minDiffItem[2];      // difference

            returnString += $"{minDiffItem[0]} and {minDiffItem[1]}, with a difference of {minDiffItem[2]}.\n";
        }

        returnString += "\n\n\n===================================================================\n";
        returnString += "===================================================================\n\n\n";

        return returnString;
    }

Edit 2:

I'm now getting another OutOfMemoryException at the `pairedDifferences.Add(new int[] { numbers[i], numbers[a], difference });` line. Does anyone know a better way to do this? Sorry, this is the first time I'm doing something like this.

Ben Murphy
  • 4
    Sounds like an AB problem. Can you share why you need to create an array of random numbers this big? – Ondrej Svejdar Aug 14 '17 at 10:51
  • 2
    "I want to generate 1,000,000 numbers ranging from 1 - 2147483647." Excluding duplicates? – vc 74 Aug 14 '17 at 10:54
  • This post can help you understand the limits of an array: https://stackoverflow.com/questions/1391672/what-is-the-maximum-size-that-an-array-can-hold – Martin Lindgren Aug 14 '17 at 10:57
  • Yes, excluding duplicates. And I am writing a program that will iterate through all the numbers, pair them up and return the pair with the smallest difference between them, or return a list of all pairs with the smallest difference if there are several such pairs. – Ben Murphy Aug 14 '17 at 10:59
  • @BenMurphy And do you need the random numbers to be equally balanced in the array. i.e. Only 1 random number in the [1..2147483647 / 1000000] range for instance, or there might be more than 1 random number within this range? – vc 74 Aug 14 '17 at 11:09
  • @vc74 So long as there are 1,000,000 numbers within the 1 - 2147483647 range with no duplicates, that's the only rule. – Ben Murphy Aug 14 '17 at 11:20

3 Answers

4

Your code is poorly optimized. The Random class lets you specify a range from which to draw a random number. That way you don't need to order everything, which is very expensive. Try this:

    int[] numbers = Enumerable.Range(1, 1000000)
                              .Select(i => rnd.Next(1, 2147483647))
                              .ToArray();

Edit:

It was previously unclear that duplicates are not allowed. If that is the case, I would add a HashSet to keep track of the numbers already generated. This works well when the count of numbers to generate is considerably smaller than the range they are drawn from, which is the case here.

    var memory = new HashSet<int>();
    int[] numbers = Enumerable.Range(1, 1000000)
                              .Select(i =>
                              {
                                  int number;
                                  do
                                  {
                                      number = rnd.Next(1, 2147483647);
                                  } while (!memory.Add(number)); // Add returns false if the number was already generated
                                  return number;
                              })
                              .ToArray();

Also check this question for more on generating random numbers without duplicates.

larsbe
  • Thanks, your solution worked, but now I'm getting another OutOfMemory exception at the `pairedDifferences.Add(new int[] { numbers[i], numbers[a], difference });` line. Do you know a better way to do this? Sorry this is the first time I'm doing something like this. – Ben Murphy Aug 14 '17 at 11:26
  • At this point you are trying to add roughly `10^6 * 10^6 / 2` elements (one per pair) to the list `pairedDifferences`, which is simply unreasonable. Try to think about solving the problem in a different way. – larsbe Aug 14 '17 at 12:02
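
As a hedged sketch of one such different approach (not taken from any of the answers here): when the numbers contain no duplicates, the closest pair is always adjacent once the array is sorted, so it is enough to sort and compare neighbours instead of storing every pair. The class and method names below (`ClosestPairs`, `FindClosest`) are hypothetical, and the tuple syntax assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

public static class ClosestPairs
{
    // Returns every pair (Low, High) whose difference equals the smallest difference.
    // Assumes the input has at least two values and no duplicates.
    public static List<(int Low, int High)> FindClosest(int[] values)
    {
        // Sort a copy so the caller's array is left untouched.
        int[] sorted = (int[])values.Clone();
        Array.Sort(sorted);

        long minDiff = long.MaxValue;
        var result = new List<(int Low, int High)>();

        for (int i = 1; i < sorted.Length; i++)
        {
            // long arithmetic avoids overflow if negative values are ever passed in.
            long diff = (long)sorted[i] - sorted[i - 1];

            if (diff < minDiff)
            {
                // Found a strictly smaller gap: discard the pairs collected so far.
                minDiff = diff;
                result.Clear();
            }

            if (diff == minDiff)
            {
                result.Add((sorted[i - 1], sorted[i]));
            }
        }

        return result;
    }
}
```

For 1,000,000 numbers this needs one sort plus a single pass, and the only extra memory is the sorted copy and the (usually very short) result list. For example, `ClosestPairs.FindClosest(new[] { 10, 3, 7, 20, 8 })` returns the single pair 7 and 8.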
0

If you need the random number generation to be balanced, you can use the following code:

    const int rangeCount = 1_000_000;
    int rangeSize = 2_147_483_647 / rangeCount;

    int[] numbers = Enumerable.Range(1, rangeCount)
                              .Select(rangeIndex =>
                              {
                                  int from = ((rangeIndex - 1) * rangeSize) + 1;
                                  int to = from + rangeSize;

                                  return rnd.Next(from, to);
                              })
                              .ToArray();

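This produces exactly one random number in each consecutive block of 2147483647 / 1000000 = 2147 values. Because the blocks do not overlap, the numbers are also guaranteed to be distinct.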

vc 74
-1

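It has to do with the OrderBy function. OrderBy has to buffer its whole input before it can return the first element, so its memory usage is O(n), and in your first snippet n is 2147483647. That is your problem.

O(n) means linear memory usage: more input means proportionally more memory.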
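
To put a number on that, here is a rough back-of-the-envelope sketch. It only counts the bytes needed to hold the buffered `int` elements themselves, so it is a lower bound; whatever else OrderBy allocates internally (for example, the sort keys) comes on top. The class and method names (`OrderByMemoryEstimate`, `MinimumBufferBytes`) are hypothetical.

```csharp
using System;

public static class OrderByMemoryEstimate
{
    // Lower bound on the bytes needed just to buffer `count` int elements.
    // Uses long arithmetic because the result does not fit in an int.
    public static long MinimumBufferBytes(long count)
    {
        return count * sizeof(int);
    }
}
```

`OrderByMemoryEstimate.MinimumBufferBytes(2147483647)` gives 8,589,934,588 bytes, i.e. roughly 8 GiB for the elements alone, whereas 1,000,000 elements need only about 4 MB.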

Jordy van Eijk