How to generate a pseudo-random involution?

Question

For generating a pseudo-random permutation, the Knuth shuffles can be used. An involution is a self-inverse permutation and I guess, I could adapt the shuffles by forbidding touching an element multiple times. However, I'm not sure whether I could do it efficiently and whether it generates every involution equiprobably.

I'm afraid, an example is needed: On a set {0,1,2}, there are 6 permutation, out of which 4 are involutions. I'm looking for an algorithm generating one of them at random with the same probability.

A correct but very inefficient algorithm would be: Use Knuth shuffle, retry if it's no involution.

Why the downvote? It looks like an interesting problem to me. — Rory Daulton, Aug 17 '16 at 13:48
I changed my answer, so my code is now slightly more elegant and efficient. — Rory Daulton, Aug 21 '16 at 23:58
@RoryDaulton I'm sure, your answer was worth accepting from the very beginning; it's just that I didn't have the time to understand it thoroughly. — maaartinus, Aug 22 '16 at 00:28
I wasn't fishing for you to accept my answer. I genuinely thought the question was interesting, so I spent extra time on it. I am transferring my permutations code from Borland Delphi to Python and this is now my section on involutions. Ask if you have any questions, especially if you don't know Python (which I learned recently). — Rory Daulton, Aug 22 '16 at 00:37
@RoryDaulton I'm glad you like this question. My Python is rather weak, but Python is a very readable language. I spent some time thinking if there was a solution not using `invo_count`... it feels like there should be (Knuth shuffle uses no such test while it may leave an element untouched), but everything I can imagine would be biased (as I can't see a "natural" way to get a probability like `76./26` with 7 elements only). — maaartinus, Aug 22 '16 at 00:51

Rory Daulton · Accepted Answer · 2016-08-22T18:58:37.220

Let's here use a(n) as the number of involutions on a set of size n (as OEIS does). For a given set of size n and a given element in that set, the total number of involutions on that set is a(n). That element must either be unchanged by the involution or be swapped with another element. The number of involutions that leave our element fixed is a(n-1), since those are involutions on the other elements. Therefore a uniform distribution on the involutions must have a probability of a(n-1)/a(n) of keeping that element fixed. If it is to be fixed, just leave that element alone. Otherwise, choose another element that has not yet been examined by our algorithm to swap with our element. We have just decided what happens with one or two elements in the set: keep going and decide what happens with one or two elements at a time.

To do this, we need a list of the counts of involutions for each i <= n, but that is easily done with the recursion formula

a(i) = a(i-1) + (i-1) * a(i-2)

(Note that this formula from OEIS also comes from my algorithm: the first term counts the involutions keeping the first element where it is, and the second term is for the elements that are swapped with it.) If you are working with involutions, this is probably important enough to break out into another function, precompute some smaller values, and cache the function's results for greater speed, as in this code:

# Counts of involutions (self-inverse permutations) for each size
_invo_cnts = [1, 1, 2, 4, 10, 26, 76, 232, 764, 2620, 9496, 35696, 140152]

def invo_count(n):
    """Return the number of involutions of size n and cache the result."""
    for i in range(len(_invo_cnts), n+1):
        _invo_cnts.append(_invo_cnts[i-1] + (i-1) * _invo_cnts[i-2])
    return _invo_cnts[n]

We also need a way to keep track of the elements that have not yet been decided, so we can efficiently choose one of those elements with uniform probability and/or mark an element as decided. We can keep them in a shrinking list, with a marker to the current end of the list. When we decide an element, we move the current element at the end of the list to replace the decided element then reduce the list. With that efficiency, the complexity of this algorithm is O(n), with one random number calculation for each element except perhaps the last. No better order complexity is possible.

Here is code in Python 3.5.2. The code is somewhat complicated by the indirection involved through the list of undecided elements.

from random import randrange

def randinvolution(n):
    """Return a random (uniform) involution of size n."""

    # Set up main variables:
    # -- the result so far as a list
    involution = list(range(n))
    # -- the list of indices of unseen (not yet decided) elements.
    #    unseen[0:cntunseen] are unseen/undecided elements, in any order.
    unseen = list(range(n))
    cntunseen = n

    # Make an involution, progressing one or two elements at a time
    while cntunseen > 1:  # if only one element remains, it must be fixed
        # Decide whether current element (index cntunseen-1) is fixed
        if randrange(invo_count(cntunseen)) < invo_count(cntunseen - 1):
            # Leave the current element as fixed and mark it as seen
            cntunseen -= 1
        else:
            # In involution, swap current element with another not yet seen
            idxother = randrange(cntunseen - 1)
            other = unseen[idxother]
            current = unseen[cntunseen - 1]
            involution[current], involution[other] = (
                involution[other], involution[current])
            # Mark both elements as seen by removing from start of unseen[]
            unseen[idxother] = unseen[cntunseen - 2]
            cntunseen -= 2

    return involution

I did several tests. Here is the code I used to check for validity and uniform distribution:

def isinvolution(p):
    """Flag if a permutation is an involution."""
    return all(p[p[i]] == i for i in range(len(p)))

# test the validity and uniformness of randinvolution()
n = 4
cnt = 10 ** 6
distr = {}
for j in range(cnt):
    inv = tuple(randinvolution(n))
    assert isinvolution(inv)
    distr[inv] = distr.get(inv, 0) + 1
print('In {} attempts, there were {} random involutions produced,'
    ' with the distribution...'.format(cnt, len(distr)))
for x in sorted(distr):
    print(x, str(distr[x]).rjust(2 + len(str(cnt))))

And the results were

In 1000000 attempts, there were 10 random involutions produced, with the distribution...
(0, 1, 2, 3)     99874
(0, 1, 3, 2)    100239
(0, 2, 1, 3)    100118
(0, 3, 2, 1)     99192
(1, 0, 2, 3)     99919
(1, 0, 3, 2)    100304
(2, 1, 0, 3)    100098
(2, 3, 0, 1)    100211
(3, 1, 2, 0)    100091
(3, 2, 1, 0)     99954

That looks pretty uniform to me, as do other results I checked.

rossum · Answer 2 · 2016-08-17T14:17:33.167

An involution is a one-to-one mapping that is its own inverse. Any cipher is a one-to-one mapping; it has to be in order for a cyphertext to be unambiguously decrypyed.

For an involution you need a cipher that is its own inverse. Such ciphers exist, ROT13 is an example. See Reciprocal Cipher for some others.

For your question I would suggest an XOR cipher. Pick a random key at least as long as the longest piece of data in your initial data set. If you are using 32 bit numbers, then use a 32 bit key. To permute, XOR the key with each piece of data in turn. The reverse permutation (equivalent to decrypting) is exactly the same XOR operation and will get back to the original data.

This will solve the mathematical problem, but it is most definitely not cryptographically secure. Repeatedly using the same key will allow an attacker to discover the key. I assume that there is no security requirement over and above the need for a random-seeming involution with an even distribution.

ETA: This is a demo, in Java, of what I am talking about in my second comment. Being Java, I use indexes 0..12 for your 13 element set.

public static void Demo() {

    final int key = 0b1001;

    System.out.println("key = " + key);
    System.out.println();

    for (int i = 0; i < 13; ++i) {

        System.out.print(i + " -> ");
        int ctext = i ^ key;

        while (ctext >= 13) {
            System.out.print(ctext + " -> ");
            ctext = ctext ^ key;
        }
        System.out.println(ctext);
    }

} // end Demo()

The output from the demo is:

key = 9

0 -> 9
1 -> 8
2 -> 11
3 -> 10
4 -> 13 -> 4
5 -> 12
6 -> 15 -> 6
7 -> 14 -> 7
8 -> 1
9 -> 0
10 -> 3
11 -> 2
12 -> 5

Where a transformed key would fall off the end of the array it is transformed again until it falls within the array. I am not sure if a while construction will fall within the strict mathematical definition of a function.

I want an involution for my input set, which has e.g., 13 elements, so xoring may jump of out range. Moreover, I want one arbitrary involution, not only those obtainable by xoring. Actually, what I want is *exactly what Knuth shuffle does, except for the involution constraint.* Right, there's no security requirement. — maaartinus, Aug 17 '16 at 12:00
With a limited range, you need something like the Hasty Pudding cipher method. Use the one-to-one property to get a result in the right range; just repeat the XOR operation as many times as needed to get within range. With a four bit key and thirteen elements, then you won't have to repeat too many times to get a result in range. If your elements are bigger than four bits, then just use the cipher result as an index. — rossum, Aug 17 '16 at 13:44
The problem with XOR-ing is that you actually need no `while`. Either it word, or you repeat it once and then it's undone. Note that for 13 input elements, there are only 16 possible keys, but over 100k involutions. — maaartinus, Aug 20 '16 at 23:49
Either a `while` or an `if`` is needed to check if the first XOR is out of range. The OP only appears to need one involution, and each possible key will give a different involution. Not the full range, certainly, but a few to select from. — rossum, Aug 21 '16 at 00:01
I'm the OP. What I'm looking for is exactly a thing like the Knuth Shuffle: generating *all* possible results *equiprobably*; sorry for not stating it clearly in the first version. The other answer seems to do exactly this. — maaartinus, Aug 21 '16 at 00:05

How to generate a pseudo-random involution?

2 Answers2