How to produce a a randomized number sequence within seven block without boundary repeat using R?

Question

There are 9 treatments and we want to have 7 blocks. In each block, the treatment should be repeated once.

The 9 treatments are marked as follows:

-Treatment 1 (1-7)
-Treatment 2 (8-14)
-Treatment 3 (15-21)
-Treatment 4 (22-28)
-Treatment 5 (29-35)
-Treatment 6 (36-42)
-Treatment 7 (43-49)
-Treatment 8 (50-56)
-Treatment 9 (57-63)

Each number represents a pot. We want these pots randomised in 7 blocks (columns) but we don't want two pot of the same treatment adjacent to each other - highlighted in grey:

How would I go about this in R?

I don't understand the relationship between treatments and blocks. My first thought was that each block would have one from each of the treatment groups, but the picture of your data does not support that. Are you supposed to have all treatments included in each block? — r2evans, May 24 '21 at 12:04

r2evans · Accepted Answer · 2021-05-27T17:44:05.750

If I'm interpreting it correctly, this should work.

We'll do a two-step sampling:

First, sample the treatment group itself, making it much easier to determine if a particular row in the block is in the same treatment group as the same row, previous block.
Second, sample one from each of the proven-safe groups.

I'll use a random seed here for reproducibility, do not use set.seed(.) in production.

set.seed(42)
nBlocks <- 7
treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
blocks <- Reduce(function(prev, ign) {
  while (TRUE) {
    this <- sample(length(treatments))
    if (!any(this == prev)) break
  }
  this
}, seq.int(nBlocks)[-1], init = sample(length(treatments)), accumulate = TRUE)
blocks <- do.call(cbind, blocks)
blocks
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#  [1,]    1    3    4    2    8    2    1
#  [2,]    5    1    2    4    5    7    9
#  [3,]    9    8    9    3    1    3    5
#  [4,]    7    9    3    6    7    9    3
#  [5,]    2    4    8    5    4    1    4
#  [6,]    4    7    1    9    6    4    2
#  [7,]    8    6    5    7    2    6    8
#  [8,]    3    5    6    8    9    5    6
#  [9,]    6    2    7    1    3    8    7

Here each column is a "block", and each number represents the treatment group assigned to each row. You can see that no rows contain the same group in subsequent columns.

For instance, the first column ("block 1") will have something from the Treatment 1 group in the first row, Treatment 5 group in row two, etc. Further, inspection will show that all treatments are included in each block column, an inferred requirement of the experimental design.

(FYI, it is theoretically possible that this will take a while based on the random conditions. Because it repeats per-column, it should be relatively efficient, though. I have no safeguards here for too-long-execution, but I don't think it is required: the conditions here do not lend to a high likelihood of "failure" requiring much repetition.)

The next step is to convert each of these group numbers into a number from the respective treatment group.

apply(blocks, 1:2, function(ind) sample(treatments[[ind]], 1))
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#  [1,]    6   17   22   11   54   14    3
#  [2,]   30    3   13   22   33   48   58
#  [3,]   63   55   61   15    4   21   33
#  [4,]   49   60   21   36   43   58   21
#  [5,]   12   25   55   32   27    7   25
#  [6,]   24   46    4   58   38   28   11
#  [7,]   53   38   35   49   11   36   56
#  [8,]   16   29   36   56   63   29   40
#  [9,]   36    8   47    3   19   50   43

To verify, in the first matrix, our first three rows (block 1) were 1, 5, and 9, which should translate into 1-7, 29-35, and57-63, respectively. "6" is within 1-7, "30" is within 29-35, and "63" is within 59-63. Inspection will show the remainder to be correct.

Because of the step of determining treatment groups first, it is much simpler to verify/guarantee that you will not repeat treatment groups in a row between two adjacent blocks.

EDIT

Rules:

The same treatment group may not be on the same row in adjacent columns; and
The same treatment (not group) may not be in any row in adjacent columns.

We can use the same methodology as before. Note that as any groups become smaller, the iteration time may increase but I do not expect it likely to get into an infinite loop. (However, if you inadvertently have a group of length 1, then ... this will never end.)

nBlocks <- 7
treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)

# helper function for randomized selection of treatments given groups
func <- function(grp) cbind(grp, sapply(treatments[grp], sample, size = 1))
set.seed(42)
func(c(1,3,5))
#      grp   
# [1,]   1  1
# [2,]   3 19
# [3,]   5 29

And then the same Reduce mindset:

set.seed(42)
blocks <- Reduce(function(prev, ign) {
  while (TRUE) {
    this1 <- sample(length(treatments))
    if (!any(this1 == prev[,1])) break
  }
  while (TRUE) {
    this2 <- func(this1)
    if (!any(this2[,2] %in% prev[,2])) break
  }
  this2
}, seq.int(nBlocks-1), init = func(sample(length(treatments))), accumulate = TRUE)
blocks <- do.call(cbind, blocks)
groups <- blocks[, seq(1, by = 2, length.out = nBlocks)]
treats <- blocks[, seq(2, by = 2, length.out = nBlocks)]

From this, we have two products (though you will likely only care about the second):

The treatment groups, good to verify rule 1 above: no group may be in the same row in adjacent columns:

groups
#       grp grp grp grp grp grp grp
#  [1,]   1   3   1   7   8   5   1
#  [2,]   5   1   2   8   2   7   3
#  [3,]   9   8   5   2   1   4   6
#  [4,]   7   9   6   3   4   8   5
#  [5,]   2   4   7   9   3   9   4
#  [6,]   4   7   4   5   7   1   2
#  [7,]   8   6   9   1   9   6   7
#  [8,]   3   5   8   6   5   2   9
#  [9,]   6   2   3   4   6   3   8

The treatments themselves, for rule 2 above, where no treatment may be in adjacent columns:

treats
#                           
#  [1,]  7 19  2 47 51 33  3
#  [2,] 35  4 12 50  8 44 15
#  [3,] 60 51 35 10  1 22 41
#  [4,] 43 58 41 21 26 55 31
#  [5,] 12 24 43 57 17 57 26
#  [6,] 27 49 26 34 48  6 11
#  [7,] 53 36 62  6 62 36 47
#  [8,] 16 33 54 42 32 10 62
#  [9,] 37  9 15 27 37 18 56

Edit 2:

Another rule:

Each treatment group must be seen exactly once in each row and column (requiring a square experimental design).

I think this is effectively generating a sudoku-like matrix of treatment groups, and once that is satisfied, backfill rule #2 (no repeat treatments in adjacent columns). One way (though it is hasty) is suggested by https://gamedev.stackexchange.com/a/138228:

set.seed(42)
vec <- sample(9)
ind <- sapply(cumsum(c(0, 3, 3, 1, 3, 3, 1, 3, 3)), rot, x = vec)
apply(ind, 1, function(z) all(1:9 %in% z)) # all rows have all 1-9, no repeats
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
apply(ind, 1, function(z) all(1:9 %in% z)) # ... columns ...
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
ind
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#  [1,]    1    7    8    3    5    2    4    6    9
#  [2,]    5    2    3    6    9    4    8    1    7
#  [3,]    9    4    6    1    7    8    3    5    2
#  [4,]    7    8    1    5    2    3    6    9    4
#  [5,]    2    3    5    9    4    6    1    7    8
#  [6,]    4    6    9    7    8    1    5    2    3
#  [7,]    8    1    7    2    3    5    9    4    6
#  [8,]    3    5    2    4    6    9    7    8    1
#  [9,]    6    9    4    8    1    7    2    3    5

This makes a rather fixed-style of random group arrangements given the constraints on groups. Since this is a design of experiments, if you're going to use this method (and proximity between blocks is at all a concern), then you should likely randomize columns and/or rows of the ind matrix before sampling the treatments themselves. (You can do columns and rows, just do them piece-wise, and it should preserve the constraints.)

ind <- ind[sample(9),][,sample(9)]
ind
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#  [1,]    2    3    8    1    4    7    9    6    5
#  [2,]    7    8    4    6    2    9    5    3    1
#  [3,]    1    7    9    4    5    6    3    2    8
#  [4,]    8    1    6    9    3    4    2    5    7
#  [5,]    5    2    7    8    9    1    6    4    3
#  [6,]    3    5    1    7    6    8    4    9    2
#  [7,]    4    6    3    5    8    2    7    1    9
#  [8,]    6    9    5    2    1    3    8    7    4
#  [9,]    9    4    2    3    7    5    1    8    6

From here, we can enact rule 2:

treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)

mtx <- do.call(rbind, Reduce(function(prev, ind) {
  while (TRUE) {
    this <- sapply(treatments[ind], sample, size = 1)
    if (!any(prev %in% this)) break
  }
  this
}, asplit(ind, 2)[-1],
init = sapply(treatments[ind[,1]], sample, size = 1),
accumulate = TRUE))
mtx
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#  [1,]   11   44    4   52   30   15   23   41   59
#  [2,]   16   56   49    3   12   33   39   57   27
#  [3,]   52   24   60   40   46    2   20   29   13
#  [4,]    1   37   23   63   56   48   32   12   17
#  [5,]   24   10   30   16   58   39   50    2   47
#  [6,]   49   57   41   25    6   52   11   17   34
#  [7,]   59   31   19   14   38   23   47   51    7
#  [8,]   41   17   11   33   24   61    5   43   54
#  [9,]   29    4   51   45   20    8   58   28   40

I guess if I wanted the treatments not to be repeated in the same row AND adjacent columns I would be looking at a hard level Sudoku game, right? — MarioS, May 25 '21 at 11:07
What you said before was that treatment *groups* could not be in the same row in adjacent blocks; now you're saying you don't want specific *treatments* in the same column: for instance, "50" and "53" cannot be in the same row (adj col) since they're both in group 8. Now you say you can't have (just) "50" in two adjacent columns at all? — r2evans, May 25 '21 at 12:52
Hello. Apologies for the confusion but a new rule occurred to me while I was copying the table from R to the word document. I was wondering if there is a way to ensure: 1. Treatments will not be repeated in an adjacent column. (ACHIEVED) 2. Treatments will be repeated once in every column. (ACHIEVED) 3. There is at least 1 of each treatment in every row. I.e. there are three Treatment 1 in the first row but none at row number 9. Yes, to do this we can increase the number of replicates per treatment to 9 up from 7. Would this be possible? — MarioS, May 27 '21 at 15:30
That is the sudoku puzzle, and to get that level of "completeness", while you might be able to do it stochastically (as I've done above), I think it's feasible to get into a situation where you cannot fill the 9th column. I think you need an optimization-based engine for that. That's not hard (there are undergrad and grad projects to solve sudoku-like puzzles programmatically ... somewhere), but it is a very different approach. (I suspect it's a non-linear problem, but it's been a while since I've looked at those kinds of problems, and a while since grad-school.) — r2evans, May 27 '21 at 15:35
That's not to say that it cannot be achieved stochastically ... just `matrix(sample(...),...)` and repeat until all of your rules are met, but it could take a while. — r2evans, May 27 '21 at 15:35
Thank you for your answer and your help! I believe that that the process you described solving this programmatically would be out of scope for what I am doing (agriculture) but for the sake of trying I will try the solution stochastic solution you suggested. — MarioS, May 27 '21 at 16:06
MariosStamatiou, see my edit, it's one way to do what you want. It isn't a perfect generation scheme, but (1) it's random, (2) it meets all of your rules, and (3) it is very fast. It just won't generate all types of random setups possible. If it works, please consider up-voting the answer, there's been a bit of time and thinking spent on this one :-) — r2evans, May 27 '21 at 17:45
Wow, thank you so much this is a fantastic solution. I have tried upvoting your answer since day 1 but it keeps saying I need 15 rep to do so? I don't get it, since clicking on the "accept the answer" button should give me 15 points. Any ideas? — MarioS, May 27 '21 at 18:48
Try again ..... and clicking on the accept gives *me* 15 points, I think it gives *you* 2 points — r2evans, May 27 '21 at 18:50
Thank you that worked! Ah, it gives you 15 points and 2 to me. :) Thank you again. — MarioS, May 27 '21 at 18:50

How to produce a a randomized number sequence within seven block without boundary repeat using R?

1 Answers1