
Would anyone happen to know how to script a shuffle of a dataset in R, such that if I have 25 numbers (5 rows x 5 columns) in a dataframe, and I shuffle 25 separate times, each number appears in each location exactly one time?

Thus it's not entirely random, at least not after the first shuffle, as the potential locations of any number decrease with each shuffle.

Thank you!

chazmatazz

2 Answers


I'll demonstrate the solution on 3 by 3 datasets. First thing I would do is convert the data.frame to matrix to be able to easily apply permutations.
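As a rough sketch of that conversion step (the data frame `df` and its column names are hypothetical, made up for illustration), `as.matrix()` and `as.data.frame()` take you back and forth:

```r
# Hypothetical 3x3 data frame standing in for the real data
df <- data.frame(x = c(68, 39, 1), y = c(34, 87, 43), z = c(14, 82, 59))

# Matrices support linear (column-major) indexing, which makes
# applying a permutation a one-liner later on
m_from_df <- as.matrix(df)

# Convert back once the shuffling is done
df_back <- as.data.frame(m_from_df)
```

Note that this only works cleanly when all columns share one type; a mixed-type data frame would be coerced to character by `as.matrix()`.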

Let's say we have a 3x3 matrix:

set.seed(1)
m <- matrix(sample(1:100, 9), nrow = 3)
m
#>      [,1] [,2] [,3]
#> [1,]   68   34   14
#> [2,]   39   87   82
#> [3,]    1   43   59

Then each shuffle can be defined by a permutation of numbers 1 to 9.

shuffle <- c(9, 4, 7, 1, 8, 3, 2, 5, 6)
matrix(m[shuffle], nrow = 3)
#>      [,1] [,2] [,3]
#> [1,]   59   68   39
#> [2,]   34   82   87
#> [3,]   14    1   43
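To make the indexing explicit: `m[shuffle]` reads the matrix column by column, so position k of the result holds `m[shuffle[k]]`. A small sketch, with the matrix values typed in by hand so it runs on its own, also shows how `order()` gives the inverse permutation if you ever need to undo a shuffle:

```r
# Same values as the 3x3 matrix above, entered column by column
m <- matrix(c(68, 39, 1, 34, 87, 43, 14, 82, 59), nrow = 3)
shuffle <- c(9, 4, 7, 1, 8, 3, 2, 5, 6)

shuffled <- matrix(m[shuffle], nrow = 3)

# Position k of the shuffled matrix holds element shuffle[k] of the original
all(shuffled[1:9] == m[shuffle])
#> [1] TRUE

# order(shuffle) is the inverse permutation, so this restores the original
restored <- matrix(shuffled[order(shuffle)], nrow = 3)
identical(restored, m)
#> [1] TRUE
```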

So our task is to generate 9 such permutations in which each number occurs in each position exactly once. E.g. if the first shuffle is c(9, 4, 7, 1, 8, 3, 2, 5, 6), we can't then use c(9, 2, 7, 3, 8, 5, 4, 6, 1), because 9 has already been in the first position, 7 in the third and 8 in the fifth.

Basically what we need is a 9 by 9 latin square. Fortunately, there is a package for such things:

library(magic)
#> Loading required package: abind
set.seed(1)
shuffles_matrix <- rlatin(9)
shuffles_matrix
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#>  [1,]    6    5    4    2    3    9    8    1    7
#>  [2,]    4    2    7    6    9    8    1    3    5
#>  [3,]    8    3    1    5    2    7    9    4    6
#>  [4,]    5    1    9    7    6    2    4    8    3
#>  [5,]    3    6    5    1    8    4    7    9    2
#>  [6,]    9    7    8    3    1    6    5    2    4
#>  [7,]    7    9    3    4    5    1    2    6    8
#>  [8,]    2    8    6    9    4    5    3    7    1
#>  [9,]    1    4    2    8    7    3    6    5    9
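If you want to convince yourself that a square really has the Latin property, a small base-R check is enough. `is_latin` is a helper name I made up for this sketch, not something from the magic package; the square is the one printed above, typed in row by row:

```r
# TRUE if every row and every column of a square matrix is a permutation of 1..n
is_latin <- function(sq) {
  n <- nrow(sq)
  is_perm <- function(v) identical(sort(as.integer(v)), seq_len(n))
  ncol(sq) == n && all(apply(sq, 1, is_perm)) && all(apply(sq, 2, is_perm))
}

sq <- matrix(c(
  6, 5, 4, 2, 3, 9, 8, 1, 7,
  4, 2, 7, 6, 9, 8, 1, 3, 5,
  8, 3, 1, 5, 2, 7, 9, 4, 6,
  5, 1, 9, 7, 6, 2, 4, 8, 3,
  3, 6, 5, 1, 8, 4, 7, 9, 2,
  9, 7, 8, 3, 1, 6, 5, 2, 4,
  7, 9, 3, 4, 5, 1, 2, 6, 8,
  2, 8, 6, 9, 4, 5, 3, 7, 1,
  1, 4, 2, 8, 7, 3, 6, 5, 9
), nrow = 9, byrow = TRUE)

is_latin(sq)
#> [1] TRUE

# Swapping two entries within one row breaks the column constraint
bad <- sq
bad[1, 1:2] <- bad[1, 2:1]
is_latin(bad)
#> [1] FALSE
```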

Now we can treat each row of this square as a shuffle of our original 3x3 matrix:

shuffles <- split(shuffles_matrix, 1:9)
shuffles
#> $`1`
#> [1] 6 5 4 2 3 9 8 1 7
#> 
#> $`2`
#> [1] 4 2 7 6 9 8 1 3 5
#> 
#> $`3`
#> [1] 8 3 1 5 2 7 9 4 6
#> 
#> $`4`
#> [1] 5 1 9 7 6 2 4 8 3
#> 
#> $`5`
#> [1] 3 6 5 1 8 4 7 9 2
#> 
#> $`6`
#> [1] 9 7 8 3 1 6 5 2 4
#> 
#> $`7`
#> [1] 7 9 3 4 5 1 2 6 8
#> 
#> $`8`
#> [1] 2 8 6 9 4 5 3 7 1
#> 
#> $`9`
#> [1] 1 4 2 8 7 3 6 5 9

And this is how we apply these shuffles to the matrix:

library(purrr)
shuffles %>% 
  map(~matrix(m[.], nrow = 3))
#> $`1`
#>      [,1] [,2] [,3]
#> [1,]   43   39   82
#> [2,]   87    1   68
#> [3,]   34   59   14
#> 
#> $`2`
#>      [,1] [,2] [,3]
#> [1,]   34   43   68
#> [2,]   39   59    1
#> [3,]   14   82   87
#> 
#> $`3`
#>      [,1] [,2] [,3]
#> [1,]   82   87   59
#> [2,]    1   39   34
#> [3,]   68   14   43
#> 
#> $`4`
#>      [,1] [,2] [,3]
#> [1,]   87   14   34
#> [2,]   68   43   82
#> [3,]   59   39    1
#> 
#> $`5`
#>      [,1] [,2] [,3]
#> [1,]    1   68   14
#> [2,]   43   82   59
#> [3,]   87   34   39
#> 
#> $`6`
#>      [,1] [,2] [,3]
#> [1,]   59    1   87
#> [2,]   14   68   39
#> [3,]   82   43   34
#> 
#> $`7`
#>      [,1] [,2] [,3]
#> [1,]   14   34   39
#> [2,]   59   87   43
#> [3,]    1   68   82
#> 
#> $`8`
#>      [,1] [,2] [,3]
#> [1,]   39   59    1
#> [2,]   82   34   14
#> [3,]   43   87   68
#> 
#> $`9`
#>      [,1] [,2] [,3]
#> [1,]   68   82   43
#> [2,]   34   14   87
#> [3,]   39    1   59
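A quick sanity check that every value visits every cell exactly once could look like the sketch below. To keep it runnable without the magic package I use a plain cyclic Latin square as a stand-in (that is my assumption here, not the `rlatin` output above); the same check should work on any Latin square:

```r
m <- matrix(c(68, 39, 1, 34, 87, 43, 14, 82, 59), nrow = 3)
n <- length(m)

# Stand-in Latin square: row i is 1..n cyclically shifted by i - 1
sq <- outer(seq_len(n), seq_len(n), function(i, j) (i + j - 2) %% n + 1)

# One shuffled matrix per row of the square
shuffled <- lapply(seq_len(n), function(i) matrix(m[sq[i, ]], nrow = nrow(m)))

# Rows = shuffles, columns = cells (in column-major order)
visits <- t(sapply(shuffled, as.vector))

# Every cell must see each original value exactly once across the shuffles
ok <- all(apply(visits, 2, function(v) setequal(v, m) && !anyDuplicated(v)))
ok
#> [1] TRUE
```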
Iaroslav Domin
  • Thank you. I think this is close to what I'm trying to do, but I notice that there is an order of rotation here; i.e. any one of the numbers (say, 1, for example in your shuffled matrix above) just moves through a pattern in the matrix through each iteration. Thus they're just moving along a set order. What I need is for each shuffle to be "random", but with each number occupying a position one time...in the sense that the first shuffle is indeed random, and that decreases in randomness with each shuffle as there is n-1 fewer possible places that a specific number can be placed each time. – chazmatazz Nov 14 '19 at 14:32
  • For context, I have plants in a greenhouse that need shuffled regularly in a cyclical manner, but due to environmental gradients and the number of plants, they can't simply move along a line in any one direction as they won't have a chance to experience a similar level of environmental fluctuation in the time between seeding and measurement of traits of interest. – chazmatazz Nov 14 '19 at 14:36
  • @chazmatazz I updated my answer 2 hours ago to use `rlatin` instead of `latin`; now the Latin square looks "random". Is that what you are talking about? – Iaroslav Domin Nov 14 '19 at 14:40
  • Ah yes, it appears to be indeed. Thanks! Now, if only my pc didn't bog down so much running rlatin... – chazmatazz Nov 14 '19 at 14:48

I think Iaroslav's answer is excellent. I used some different functions to do basically the same thing, so I thought I would share some other code. I also built a Latin-square-like arrangement, though I didn't realize that was its name. I did that with

roll <- function(x, i) {
  if (i==0) return(x)
  c(x[-(1:i)], x[1:i])
}
m <- sapply(0:24, function(i) roll(1:25, i))
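To see what `roll` does, here is the same idea on 5 elements instead of 25 (the smaller size and the name `m5` are just for illustration):

```r
roll <- function(x, i) {
  if (i == 0) return(x)
  c(x[-(1:i)], x[1:i])   # move the first i elements to the end
}

roll(1:5, 2)
#> [1] 3 4 5 1 2

# Column i + 1 is 1:5 rolled by i, which gives a cyclic Latin square
m5 <- sapply(0:4, function(i) roll(1:5, i))

# Every row and every column is a permutation of 1:5
all(apply(m5, 1, function(v) identical(sort(v), 1:5)))
all(apply(m5, 2, function(v) identical(sort(v), 1:5)))
```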

Here I just use the numbers 1:25. This creates a matrix where each row or column is a set of indices that can be used to permute your values. If it looks too orderly, you can also shuffle the rows and columns of the matrix with another helper function

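shuffle_mat <- function(x, N = 50, margin = c(1, 2)) {
  # For each of the N swaps, pick whether it acts on rows (1) or columns (2).
  # Indexing into margin avoids sample()'s special case for a single number.
  mg <- margin[sample.int(length(margin), N, replace = TRUE)]
  for (k in mg) {
    if (k == 1) {
      i <- sample.int(nrow(x), 2)   # two distinct rows to swap
      x[i, ] <- x[rev(i), ]
    } else {
      j <- sample.int(ncol(x), 2)   # two distinct columns to swap
      x[, j] <- x[, rev(j)]
    }
  }
  # Swapping whole rows or columns keeps every row and column a permutation
  x
}
rr <- shuffle_mat(m)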

Then again you can take each of those rows/columns and shape them into a 5x5 matrix.
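Putting the pieces together for the 5x5 case, a minimal base-R sketch might look like this. The data frame `plants` and its contents are hypothetical, and I permute whole rows and columns with `sample()` in one go rather than through repeated pairwise swaps. It is cheap to compute, but it does not sample uniformly from all possible Latin squares, so treat it as an approximation of "random":

```r
set.seed(42)

# Hypothetical 5x5 data frame of plant IDs
plants <- as.data.frame(matrix(1:25, nrow = 5))
vals <- as.matrix(plants)
n <- length(vals)   # 25 positions

# Cyclic 25x25 Latin square, then randomly permuted rows and columns
# (permuting whole rows/columns preserves the Latin property)
sq <- outer(seq_len(n), seq_len(n), function(i, j) (i + j - 2) %% n + 1)
sq <- sq[sample(n), sample(n)]

# Each row of the square becomes one 5x5 layout
layouts <- lapply(seq_len(n), function(i) matrix(vals[sq[i, ]], nrow = nrow(vals)))

# First layout, back as a data frame if that is what you need
as.data.frame(layouts[[1]])
```

Each element of `layouts` is one arrangement, and across the 25 of them every plant ID lands in every cell exactly once.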

MrFlick