16

Fill array a from a[0] to a[n-1]: generate random numbers until you get one that is not already in the previous indexes.

This is my implementation:

static Random r = new Random(); // r is a java.util.Random instance

public static int[] first(int n) {
    int[] a = new int[n];
    int count = 0;

    while (count != n) {
        boolean isSame = false;
        int rand = r.nextInt(n) + 1;

        for (int i = 0; i < n; i++) {
            if(a[i] == rand) isSame = true;
        }

        if (!isSame) {
            a[count] = rand;
            count++;
        }
    }

    return a;
}

I thought it was O(N^2), but it's apparently O(N^2 log N), and I'm not sure where the log factor comes in.

r3mainer
btrballin
  • Looks to me like it doesn't have an upper bound on time complexity. You can keep failing to generate a unique number. Can you give a reference where it says this is *O(n^2 log n)*? – Willem Van Onsem Feb 18 '15 at 20:33
  • are we talking about average time? – sodik Feb 18 '15 at 20:34
  • Can't say I have much interest in analyzing the big-O of a hideous algorithm for which a trivial O(n) replacement exists. – Lee Daniel Crocker Feb 18 '15 at 20:41
  • If you are looking for a more efficient algorithm, it can be done in *O(n)* (worst case). – Willem Van Onsem Feb 18 '15 at 20:44
  • it can be done using a shuffle, for example, in O(n). – ryanpattison Feb 18 '15 at 20:48
  • This is a textbook probabilistic algorithm, you wouldn't actually implement it – dfb Feb 18 '15 at 20:53
  • @rpattiso what if the range over which you are selecting random numbers greatly exceeds the size of a list you are willing to create? – Random832 Feb 18 '15 at 21:21
  • @Random832 well, It seems the problem statement is more general than the code: `r.nextInt(n) + 1` is between 1 and n and n is the size of the array so shuffle is a good replacement in the code. good catch :) – ryanpattison Feb 18 '15 at 21:32
  • This code is not guaranteed to run in any given running time because `r.nextInt(n)` may never give you the value you want. It's unlikely, but if you want to analyze the *worst case* scenario, you can't. Worst case, it gets X its first iteration and continues to get X's for the rest of eternity. – corsiKa Feb 18 '15 at 23:20
  • This is a randomized algorithm, in fact a [Las Vegas algorithm](http://en.wikipedia.org/wiki/Las_Vegas_algorithm). Analysis of such algorithms is not commonly taught at undergraduate level. http://cs.stackexchange.com is probably a better place to ask such questions. – Fizz Feb 19 '15 at 03:49
  • [cs.SE] would expect more of an own attempt, maybe following [our reference question](http://cs.stackexchange.com/questions/23593/is-there-a-system-behind-the-magic-of-algorithm-analysis) on algorithm analysis. – Raphael Feb 19 '15 at 08:12

4 Answers

36

The 0-th entry is filled immediately. The 1st entry has probability 1 - 1/n = (n - 1)/n of being filled by a random number, so we need on average n / (n - 1) random numbers to fill the second position. In general, for the k-th entry we need on average n / (n - k) random numbers, and for each number we need k comparisons to check whether it's unique.

So we need

n * 1 / (n - 1) + n * 2 / (n - 2) + ... + n * (n - 1) / 1

comparisons on average. If we consider the right half of the sum, we see that this half is greater than

n * (n / 2) * (1 / (n / 2) + 1 / (n / 2 - 1) + ... + 1 / 1)

The sum of the fractions is known to be Θ(log(n)) because it's a partial sum of the harmonic series. So the whole sum is Ω(n^2 log(n)). In a similar way, we can show the sum is O(n^2 log(n)). This means on average we need

Θ(n^2*log(n))

operations.
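As a quick numerical sanity check (my own sketch, not part of the original answer; the class and method names are made up), the expected-comparison sum above can be evaluated exactly and compared against n^2 ln n. The ratio creeps toward 1 as n grows, as the Θ(n^2 log n) estimate predicts:

```java
public class ExpectedComparisons {
    // S(n) = sum over k = 1 .. n-1 of n * k / (n - k),
    // the expected total number of comparisons from the analysis above.
    static double expectedComparisons(int n) {
        double s = 0;
        for (int k = 1; k < n; k++) {
            s += (double) n * k / (n - k);
        }
        return s;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 10_000, 100_000}) {
            double ratio = expectedComparisons(n)
                    / ((double) n * n * Math.log(n));
            // ratio slowly approaches 1 from below
            System.out.printf("n=%d  S(n)/(n^2 ln n) = %.3f%n", n, ratio);
        }
    }
}
```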

JuniorCompressor
15

This is similar to the Coupon Collector problem. You pick from n items until you get one you don't already have. On average, you need O(n log n) attempts (see the link; the analysis is not trivial), and in the worst case you examine n elements on each of those attempts. This leads to an average complexity of O(n^2 log n).
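For reference, the expected number of draws in the coupon-collector setting has a closed form: n · H_n, where H_n is the n-th harmonic number, and H_n = Θ(log n). A small sketch (class and method names are my own) evaluating it:

```java
public class CouponCollector {
    // Expected number of random draws to collect all n distinct values:
    // n * H_n, where H_n = 1 + 1/2 + ... + 1/n is the n-th harmonic number.
    static double expectedDraws(int n) {
        double h = 0;
        for (int i = 1; i <= n; i++) {
            h += 1.0 / i;
        }
        return n * h;
    }

    public static void main(String[] args) {
        // n * H_n grows like n ln n, i.e. O(n log n) attempts on average
        System.out.printf("n=1000: expected draws = %.0f, n ln n = %.0f%n",
                expectedDraws(1000), 1000 * Math.log(1000));
    }
}
```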

dfb
  • Why? This analysis doesn't have anything to do with the inner loop. The outer loop runs on average O(n lg n) times, The extra n factor is from the inner loop, whether it breaks or not – dfb Feb 18 '15 at 20:44
  • @JohnPirie - n^2 is a lower bound, but it's not very tight. The outer loop is counting the number of attempts, there is no guarantee that the number of attempts is only n - it will be at least n and on average O(n lg n) – dfb Feb 18 '15 at 21:01
  • 2
    "average" "O(n lg n)" uhhh, Big O notation is for the *worst case* not the *average case*. =\ – corsiKa Feb 18 '15 at 23:17
  • 8
    @corsiKa - Big-O is an asymptotic bound on a function. It's perfectly valid to say that the expected runtime of a function is upper bounded, just like you might bound the worst case runtime. – dfb Feb 18 '15 at 23:35
  • 3
    There is nothing in the definition of Big-O notation that says it's about the worst case. [Tilde notation](http://introcs.cs.princeton.edu/java/41analysis/), which is not widely used, is actually *more* restrictive; 2x is O(x), but it is not the case that 2x ~ x. – user2357112 Feb 19 '15 at 00:49
  • 1
    @JeroenVannevel [You are wrong](http://cs.stackexchange.com/questions/23068/how-do-o-and-relate-to-worst-and-best-case). – Raphael Feb 19 '15 at 08:10
  • It is not very proper to use a worst-case estimate (for the inner loop) to find the average complexity. But of course we know that, for linear search, the two differ only by a constant factor 2, so that makes this answer correct anyway. – Marc van Leeuwen Feb 19 '15 at 10:06
2

The algorithm you have is not O(n^2 lg n), because it may loop forever and never finish. Imagine that on your first pass you get some value X, and on every subsequent pass, while trying to get the second value, you continue to get X forever. We're talking worst case here, after all. That would loop forever. So since your worst case never finishes, you can't really analyze it.

In case you're wondering, if you know that n is always both the size of the array and the upper bound of the values, you can simply do this:

Random rand = new Random();

int[] vals = new int[n];
for (int i = 0; i < n; i++) {
    vals[i] = i + 1; // values 1..n, matching r.nextInt(n) + 1
}
// Fisher-Yates shuffle
for (int i = n - 1; i > 0; i--) {
    int idx = rand.nextInt(i + 1);
    int t = vals[idx];
    vals[idx] = vals[i];
    vals[i] = t;
}

One loop down, one loop back. O(n). Simple.
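For completeness, here is a self-contained sketch of the same idea (class and method names are mine, not from the answer), plus a quick check that the result really is a permutation of 1..n:

```java
import java.util.Arrays;
import java.util.Random;

public class ShuffleDemo {
    // Returns a uniformly random permutation of the values 1..n
    // using a Fisher-Yates shuffle: O(n) time, no rejection loop.
    static int[] shuffled(int n, Random rand) {
        int[] vals = new int[n];
        for (int i = 0; i < n; i++) {
            vals[i] = i + 1; // 1..n, like r.nextInt(n) + 1 in the question
        }
        for (int i = n - 1; i > 0; i--) {
            int idx = rand.nextInt(i + 1); // uniform in [0, i]
            int t = vals[idx];
            vals[idx] = vals[i];
            vals[i] = t;
        }
        return vals;
    }

    public static void main(String[] args) {
        int[] a = shuffled(10, new Random());
        int[] sorted = a.clone();
        Arrays.sort(sorted);
        // Sorting a permutation of 1..n must give back exactly 1..n.
        for (int i = 0; i < sorted.length; i++) {
            if (sorted[i] != i + 1) throw new AssertionError("not a permutation");
        }
        System.out.println(Arrays.toString(a));
    }
}
```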

corsiKa
  • A random number generator that generates the same number every time is not a random number generator. – Jack Aidley Feb 19 '15 at 11:07
  • @JackAidley a random number generator that _cannot_ generate the same number every time is not a random number generator. (But a PRNG would not have this behaviour). – Davidmh Feb 19 '15 at 11:50
  • David is correct that modern implementations of it *would* not behave that way, but we cannot guarantee it. After all, we're talking about the worst possible case, not the worst expected case. – corsiKa Feb 19 '15 at 15:18
-1

If I'm not mistaken, the log N part comes from this part:

for(int i = 0; i < count; i++){
    if(a[i] == rand) isSame = true;
}

Notice that I changed n to count, because you know that you have only count elements in your array on each iteration of the loop.

Manuel Ramírez