10

I have a singly linked list whose size I don't know.

I want to pick a random element from this list, and I am allowed to traverse the list only once.

What’s the algorithm for this problem? Thanks!

Oliver Charlesworth
卢声远 Shengyuan Lu

4 Answers

19

This is just reservoir sampling with a reservoir of size 1.

Essentially, it is really simple:

  1. Pick the first element regardless (for a list of length 1, the first element is always the sample).
  2. For every subsequent element, replace the already picked element with the current one with probability 1/n, where n is the number of elements observed so far (counting the current one).

The sample is uniform: once the whole list has been traversed, each of its n elements is the picked one with probability 1/n (exercise for the reader).
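
The two steps above can be sketched in Java as follows. This is only a minimal sketch under stated assumptions: the `ReservoirPick` class, its nested `Node` type, and the `pickRandom` method are hypothetical names introduced for illustration, and the list is assumed to be non-empty.

```java
import java.util.Random;

public class ReservoirPick {
    // Hypothetical singly linked list node, assumed for this sketch.
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Returns a uniformly random value from the list in a single pass.
    // Assumes head is non-null; throws IllegalArgumentException otherwise.
    static int pickRandom(Node head, Random rng) {
        if (head == null) {
            throw new IllegalArgumentException("list must not be empty");
        }
        int picked = head.value;   // step 1: always take the first element
        int seen = 1;              // number of elements observed so far
        for (Node cur = head.next; cur != null; cur = cur.next) {
            seen++;
            // step 2: nextInt(seen) is uniform on [0, seen), so it is 0
            // with probability 1/seen
            if (rng.nextInt(seen) == 0) {
                picked = cur.value;
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        Node head = new Node(10);
        head.next = new Node(20);
        head.next.next = new Node(30);
        System.out.println(pickRandom(head, new Random()));
    }
}
```

Only the current pick and a counter are kept, so the memory use is constant no matter how long the list turns out to be.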

C. K. Young
Aurojit Panda
  • @Giacomon Is there a reason you feel this won't work for small collections? I understood the question to be asking for an online uniform sampling algorithm, and I think this fits just fine. – Aurojit Panda Apr 29 '11 at 07:27
  • @Aurojit: I think Giacomo is just saying that this solution is good for both large and small collections. – C. K. Young Apr 29 '11 at 07:28
  • @Giacomo Thanks, I wasn't actually complaining about your comment, I was just trying to figure out the problem, sorry if it came off as being harsh – Aurojit Panda Apr 29 '11 at 07:30
  • Is it really 1/n probability? Take the last element: it will have a chance of 1/2 to be chosen (in the last step), and also the rest of elements have a sum of 1/2. It doesn't seem right to me. – Iulius Curt Apr 20 '12 at 18:53
  • @iuliux Why would it have probability 1/2? The last element itself has a 1/n chance of being chosen. If it isn't chosen, the previous pick survives; each earlier element was that pick with probability 1/(n - 1), so it ends up chosen with probability (n - 1)/n · 1/(n - 1) = 1/n. Hence uniform probability. – Aurojit Panda Apr 22 '12 at 17:23
  • Ok, I re-read your answer and now I got it, thanks. (step 2 is a bit ambiguous) – Iulius Curt Apr 22 '12 at 20:52
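
The argument from the comments can be written out as a short induction. The notation is an assumption introduced here, not part of the original answer: P_n(i) is the probability that element i is the current pick after n elements have been seen, with P_1(1) = 1 as the base case.

```latex
P_n(n) = \frac{1}{n}, \qquad
P_n(i) = P_{n-1}(i)\left(1 - \frac{1}{n}\right)
       = \frac{1}{n-1}\cdot\frac{n-1}{n}
       = \frac{1}{n} \quad \text{for } i < n,\ n \ge 2.
```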
1

This is probably an interview question. Reservoir sampling is used by data scientists to keep a uniform random sample of a large data stream in limited storage.

If you have to collect k elements from a stream of n elements such that every element ends up in the collection with the same probability (k/n), you follow two steps:

1) Store the first k elements in the storage.
2) When a later element arrives from the stream, say the i-th one (i > k), there is no free space in your collection anymore. Generate a random integer l from 0 to i - 1; if l is less than k, replace storage[l] with the i-th element from the stream.

Now, coming back to your question: here the storage size is 1. So you pick the first node, then move on to the second element. Now generate a random number that is either 0 or 1; if it is 1, leave the sample alone, otherwise replace the stored element with the current one. Continue the same way for the rest of the list: for the i-th node, draw from 0 to i - 1 and replace the stored element only if you draw 0.
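
For a general storage size k, the two-step procedure above might look like this in Java. It is a hedged sketch and not a definitive implementation: the `ReservoirSampler` class and its `sample` method are hypothetical names, and the stream is modelled as a plain `Iterator`, which is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

public class ReservoirSampler {
    // Returns up to k items drawn uniformly from the stream in one pass.
    // If the stream holds fewer than k items, all of them are returned.
    static <T> List<T> sample(Iterator<T> stream, int k, Random rng) {
        List<T> storage = new ArrayList<>(k);
        int seen = 0;  // number of items read so far
        while (stream.hasNext()) {
            T item = stream.next();
            seen++;
            if (storage.size() < k) {
                storage.add(item);            // step 1: fill the storage
            } else {
                int l = rng.nextInt(seen);    // uniform on [0, seen)
                if (l < k) {
                    storage.set(l, item);     // step 2: replace slot l
                }
            }
        }
        return storage;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sample(data.iterator(), 3, new Random()));
    }
}
```

Calling `sample` with k = 1 reduces to the single-element case asked about in the question.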

Sanjana
0

This problem can be solved using reservoir sampling. It chooses k random items out of n items, where n can be very large (the items do not have to fit in memory!) and, as in your case, unknown in advance.

Wikipedia has an understandable algorithm, which I quote below:

array R[k];    // result
integer i, j;

// fill the reservoir array
for each i in 1 to k do
    R[i] := S[i]
done;

// replace elements with gradually decreasing probability
for each i in k+1 to length(S) do
    j := random(1, i);   // important: inclusive range
    if j <= k then
        R[j] := S[i]
    fi
done

The question requires only one value, so we take k = 1.

C implementation:

https://ideone.com/txnsas

rahuljain1311
0

This is the easiest way I have found; it works fine and is easy to understand:

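import java.util.Random;

// Assumes a Node class with an int field `data` and a Node field `next`.
// Returns 0 for an empty list.
public int findrandom(Node start) {
    Node curr = start;
    int count = 1, result = 0, probability;
    Random rand = new Random();

    while (curr != null) {
        // Uniform integer in [1, count]; it equals count with probability 1/count.
        probability = rand.nextInt(count) + 1;
        if (count == probability)
            result = curr.data;   // replace the current pick
        count++;
        curr = curr.next;
    }
    return result;
}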
Abhishek Honey