2

I am trying to generate all permutations for an array using the Heap's algorithm that I found in wikipedia.

This is what I tried so far:

n <- 3
A <- c(1, 2, 3)
perm <- function(n, A) {
  if (n == 1)
    print(perm)
  for (i in length(A))
    perm(n, A - 1) 
  if (A %% 2 == 1) {
    swap(A[i], A[n - 1])
  } else {
    swap(A[0], A[n - 1])
  } 
}
perm(3, A)

The output is not showing, and it would be great to receive some help.

jay.sf
  • 60,139
  • 8
  • 53
  • 110
aycee
  • 49
  • 3

2 Answers2

4

9In addition to some misuse of names, there are four fundamental issues with the provided code:

  1. The implementation of Heap's algorithm in Wikipedia assumes 0-based indexing, while R's vectors are 1-based. So A[0] needs to be translated to A[1] (that is, the first element in the array), while A[n-1] needs to be translated to A[n] (which is the last element in the n-prefix of A).

  2. As is surprisingly common, the code does not follow the iterative structure of the Wikipedia code. Stripped to its essence, the correct code is (changed to 1-based indexing):

    define heap(A, n):
      if n == 1: process A
      else:
        for i from 1 to n - 1:    ## Note the n - 1
          heap(A, n - 1)          ## Recurse
          swap A[n] with ...      ## Swap
        end for
        heap(A, n - 1)            ## Recurse
      end if
    

    The important aspect of this loop is that the swap is done in between recursions, neither before nor after. So there are n recursive calls (assuming n > 1) but only n - 1 swaps. The consequence is that there is exactly one swap between two successive permutations, which is the essence of the "Gray code" requirement which Heap's algorithm satisfies.

  3. The various pseudocode implementations (and the concrete implementations in Python) assume that the array argument is actually passed by reference, so that swaps in the recursive calls are visible to the caller. But R passes arrays by value, so every recursive call has its own instance of the array, and the modifications are not reflected in the caller's array.

  4. The computation of the swap is as follows:

    • If n is even, A[n] is swapped with A[1].
    • If n is odd, A[n] is swapped with A[i].

    (Remember that the loop limit is n - 1, not n, and n is always greater than 1. So A[n] is never swapped with itself.)

    The provided code gets this backwards.

To get around the issue with making a copy of the array at every recursive call, we can use a closure containing the array; the only argument for the recursive call is then the prefix length. But note that modifying the array in the closure's environment requires that we use the <<- operator.

heap_perm <- function(A) {
  h <- function(n) {
    if (n == 1) {
      print(A)
    } else {
      h(n - 1)
      for (i in 1:(n - 1)) {
        if (n %% 2 == 1) pivot <- 1 else pivot <- i
        tmp <- A[n]; A[n] <<- A[pivot]; A[pivot] <<- tmp
        h(n - 1)
      }
    }
  }
  
  h(length(A))
}

When testing implementations of Heap's algorithm, it's important to use a vector of at least 4 elements. (A good test would use a larger vector, but four is a good start.) When examining the output, you need to check that:

  • The correct number of permutations are produced (24 in the case of a vector of 4 elements);
  • Each generated permutation is unique;
  • Two consecutive permutations differ only in a single swap.

Verifying all three of these conditions will avoid the most common bugs in implementations of Heap's algorithm.

Here's the test output of the above R program:

> heap_perm(0:3)
[1] 0 1 2 3
[1] 1 0 2 3
[1] 2 0 1 3
[1] 0 2 1 3
[1] 1 2 0 3
[1] 2 1 0 3
[1] 3 1 0 2
[1] 1 3 0 2
[1] 0 3 1 2
[1] 3 0 1 2
[1] 1 0 3 2
[1] 0 1 3 2
[1] 0 2 3 1
[1] 2 0 3 1
[1] 3 0 2 1
[1] 0 3 2 1
[1] 2 3 0 1
[1] 3 2 0 1
[1] 3 2 1 0
[1] 2 3 1 0
[1] 1 3 2 0
[1] 3 1 2 0
[1] 2 1 3 0
[1] 1 2 3 0
rici
  • 234,347
  • 28
  • 237
  • 341
0

To implement Heap's algorithm in R, here is a solution that outputs the result in a matrix for further processing:

h <- function(s, n, e, ...) {
  ## processes Heap's algorithm
  if (n == 1L) {
    e$i <- e$i + 1L
    e$r[e$i, ] <- e$s
  } else {
    h(s, n - 1L, e)
    for (i in 1:(n - 1L)) {
      if (n %% 2L == 0L) {
        j <- i
      } else {
        j <- 1L
      }
      tmp <- e$s[n]
      e$s[n] <- e$s[j]
      e$s[j] <- tmp
      h(s, n - 1L, e)
    } 
  }
}

hperm <- function(s) {
  ## wrapper
  e <- environment()
  l <- length(s)
  i <- 0L
  e$r <- matrix(NA, factorial(l), l)
  h(s, l, e, i)
  return(e$r)
}
    

Gives:

A <- 0:3

hperm(A)
#       [,1] [,2] [,3] [,4]
#  [1,]    0    1    2    3
#  [2,]    1    0    2    3
#  [3,]    2    0    1    3
#  [4,]    0    2    1    3
#  [5,]    1    2    0    3
#  [6,]    2    1    0    3
#  [7,]    3    1    0    2
#  [8,]    1    3    0    2
#  [9,]    0    3    1    2
# [10,]    3    0    1    2
# [11,]    1    0    3    2
# [12,]    0    1    3    2
# [13,]    0    2    3    1
# [14,]    2    0    3    1
# [15,]    3    0    2    1
# [16,]    0    3    2    1
# [17,]    2    3    0    1
# [18,]    3    2    0    1
# [19,]    3    2    1    0
# [20,]    2    3    1    0
# [21,]    1    3    2    0
# [22,]    3    1    2    0
# [23,]    2    1    3    0
# [24,]    1    2    3    0

See also: This Python code of Heap's algorithm on Stack Overflow.

jay.sf
  • 60,139
  • 8
  • 53
  • 110
  • thank you lots for the help! i actually used `swap()` from `seqinr` but i think i made a mistake, i'll try and figure it out again. – aycee Jan 01 '22 at 10:53
  • @aycee Interesting package, didn't know it! Please consider to [mark answers accepted](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) if you found it useful. Cheers! – jay.sf Jan 01 '22 at 11:02
  • If that's the actual printout, your implementation is clearly incorrect. – rici Jan 01 '22 at 15:11
  • Also, the question you link to has incorrect pseudocode for the algorithm. – rici Jan 01 '22 at 15:17
  • @rici Thanks for the review! Would you mind explaining something more specific? In fact, with both sources, Wiki and the linked question, I had trouble figuring out the exact algorithm. – jay.sf Jan 01 '22 at 15:22
  • I was just wondering why `3` was stuck. – Chris Jan 01 '22 at 15:22
  • @chris: because R array indexes are from 1 to N, not 0 to N-1. Heap's algorithm always swaps the last element of the array; in a 0-indexed array, that would be `array[n-1]`, but in R you need to translate that to `array[n]` – rici Jan 01 '22 at 15:26
  • That's in addition to the fact that this program does an extra swap compared to the correct version, which is the classic implementation bug. See https://stackoverflow.com/a/29044942/1566221 – rici Jan 01 '22 at 15:27
  • And hopefully the OP also takes away that is it imprudent to name variables with existing base function names, which @jay.sf sought to do. Extra swap, okay. – Chris Jan 01 '22 at 15:30
  • @rici hi, thanks for the clarification! could you elaborate on the extra swap? I tried implementing `array[n]` which works for 2 elements but when I use 3, the output is `1 2 3` `2 2 3` `3 2 3` `2 2 3` `3 2 3` `2 2 3` I'm still new to this, so I'm grateful for any help. thanks! – aycee Jan 01 '22 at 15:58
  • @aycee: Look at the WP pseudocode. The last thing the function does before it returns is a recursive call. Look at your code. The last thing the function does before it returns is a swap. That's the extra swap. (Pay particular attention to the loop limits, and remember that you have to correct for R's 1-based indexing.) – rici Jan 01 '22 at 16:10
  • 1
    @jay.sf: (1) You got the even/odd condition reversed. (2) The problem with translating from Python to R is that R passes arrays by value. The original algorithm depends on the semantics of (many) other programming languages, in which arrays are passed by reference. As written, the algorithm depends on the array being modified by the recursive call. (Also, testing Heap's algorithm with arrays of length 3 almost never shows you all the problems; you need to test with at least 4. But here you can see the problem, too: the difference between line 2 and line 3 of the output is not a single swap.) – rici Jan 01 '22 at 17:35
  • 1
    Note that the key feature of Heap's algorithm is that it produces a "gray code", aka minimal modification. Each successive permutation differs from the previous one only in a single swap. This is not true of the more common lexicographic-order algorithm. – rici Jan 01 '22 at 17:36
  • @rici Thanks much for posting a solution. This made me realize that it was an environment problem. – jay.sf Jan 02 '22 at 05:03