I need to randomly select k elements from a list of n. Let's say that:

n = (1,2,3,4,5,6,7,8,9,10)

and I want to randomly choose k = 4 elements and arrange them in random order. I am using Perl, so I could easily do this with:

@ord = ($o1,$o2,$o3,$o4) = pick(4,(1..10));

However, the complication is that certain pairs (not all pairs) are mutually exclusive: for example, if 3 is chosen then 4 should not be chosen (I'll call these disjoint pairs). If 1 is chosen, the likelihood of any other element being chosen should not be affected, because 1 is not part of a disjoint pair. In other words, most elements are selected independently of one another, but selecting one member of a disjoint pair should exclude the other member.

So, let's say (3,4) and (7,8) are the only disjoint pairs. Can someone suggest an efficient algorithm that randomly selects k = 4 elements from the list (1,2,3,4,5,6,7,8,9,10), each with an equal chance of selection, except that once one element of a disjoint pair is selected, the other element of that pair is excluded from subsequent selection?

Dan

1 Answer

I ended up with a method that does not require an explicit loop over the array. I did this first in R and then in Perl (because I am more proficient in R).

Essentially, I shuffled the array in question (n or ord0) into random order, then found the index of each element of the first disjoint pair and removed the element at the larger of the two indexes (that is, whichever member of the pair comes later in the shuffled order). I then repeated this process for each remaining disjoint pair. Finally, I chose the first k = 4 elements of the array that was left.

Here is my R code:

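n <- 1:10
k <- sample(n, 10)   # random permutation of n

nchoose <- 4

# Positions of 3 and 4 in the shuffled vector. which() compares exactly;
# grep() matches by regular expression, so e.g. grep(1, k) would also match 10.
i3 <- which(k == 3)
i4 <- which(k == 4)

# Drop whichever member of the pair comes later
k <- k[-max(i3, i4)]

i7 <- which(k == 7)
i8 <- which(k == 8)

k <- k[-max(i7, i8)]

n
k

final <- k[1:nchoose]
final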

Here is my Perl code. I found it easier to use the first_index function from List::MoreUtils in Perl instead of grep():

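use List::Util qw(max);
use List::MoreUtils qw(first_index);

# pick() is the same helper as in the question; here it returns all 10 elements in random order
@ord0 = pick(10,(1..10));

# Remove whichever member of each disjoint pair comes later in the shuffled list.
# Each first_index call is wrapped in parentheses so that it does not swallow
# the second call as part of its own argument list.
splice @ord0, max((first_index { $_ == 3 } @ord0), (first_index { $_ == 4 } @ord0)), 1;
splice @ord0, max((first_index { $_ == 7 } @ord0), (first_index { $_ == 8 } @ord0)), 1;

@ord = ($o1,$o2,$o3,$o4) = @ord0[0..3];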
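
With more than a couple of disjoint pairs, the same idea could be wrapped in a subroutine. The following is only a sketch, not the code I ran above: pick_with_exclusions is a hypothetical name, it uses shuffle from List::Util in place of pick(), and unlike the splice approach it does walk the shuffled list once, skipping any element whose partner has already been chosen. It assumes the list values are unique.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: choose $k items from @$items in random order so that
# at most one member of each pair in @$pairs ends up in the result.
sub pick_with_exclusions {
    my ($k, $items, $pairs) = @_;

    # Map each value to the values it excludes.
    my %excludes;
    for my $pair (@$pairs) {
        my ($x, $y) = @$pair;
        push @{ $excludes{$x} }, $y;
        push @{ $excludes{$y} }, $x;
    }

    my (@chosen, %blocked);
    for my $item (shuffle @$items) {
        next if $blocked{$item};          # partner was already chosen
        push @chosen, $item;
        $blocked{$_} = 1 for @{ $excludes{$item} || [] };
        last if @chosen == $k;            # stop as soon as we have enough
    }

    die "could not pick $k items\n" if @chosen < $k;
    return @chosen;
}

my @ord = pick_with_exclusions(4, [1 .. 10], [[3, 4], [7, 8]]);
print "@ord\n";
```

For pairs that do not overlap, such as (3,4) and (7,8), this should give the same selections as removing the later member of each pair and taking the first k elements. The explicit walk also covers the case where one value appears in more than one pair.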
Dan