2

I want to give my users a concise, base-64 alphanumeric code, to represent the choice they make when choosing 1024 candidates in order from a ballot of 1024 candidates. (This is a worst-case ... I can probably live with < 256).

What are my options?

A naive approach tells me that if each of the 1024 element can be represented by a unique ordinal (2 ^ 10), and I need a series of 1024 ordinals, then 10-bits x 1024 positions = 10,240 bits would do the trick. But that's still 1707 radix-64 digits, which is a bit longer than I was hoping for, and it feels like I am wasting bits 9 bits to represent '1' (though I am probably wrong about that).

Classic permutation theory ought to tell me the number of possibilities - nPr (order is significant, no duplicates). But the number is so very big, it bothers my small brain, and overwhelms my dec<->bin calculators. Do I get away with fewer bits if I do it this way? (Why not? Math sucks eh?)

Bonus points for Java code so I can tinker with n and r to find the number of bits required, and the number of radix-64 digits. :-)

PS. This is feasibility-testing for my serious proposal for a voting system which uses paper for an audit trail, and computers for fast counting.

tshepang
  • 12,111
  • 21
  • 91
  • 136
David Bullock
  • 6,112
  • 3
  • 33
  • 43
  • 1
    Presumably you'll also have to think about how to represent the "I do not care about the ordering of the other candidates beyond the **`n`** I have selected" value for each ballot (as for candidates who are left blank in STV) – AakashM Sep 12 '13 at 09:58
  • 1
    In the 2013 Australian federal elections, voters in the state of NSW had to elect 6 senators from a list of 110 candidates, and we have to put them all in sequence, so that *somebody* get elected even if nobody has a majority. Many people opt to defer to the ordering suggested by their favourite party, but the parties do stupid 'preference deals' and can't be trusted to order them according to alignment of values. So we have to do it ourselves. Simply transcribing the already-considered numbers correctly onto the very-large ballot paper is an error-prone chore, and needs to be simplified. – David Bullock Sep 12 '13 at 13:38

1 Answers1

3

Wolfram alpha will tell you that you need ⌈log2(1024!)⌉=8,770 bits in an optimal representation. Not that much better than naively storing the elements themselves. The conceptually easiest way to achieve such a slightly more concise representation is to imagine all those permutations ordered lexicographically. Then you can simply store the index of the permutation in that list. To make this practical, you can iterate over the elements of the permutation, and at each position ask yourself how many permutations there are which have all the preceeding positions in common, but a lesser value at the current position. Summing these numbers will result in the zero-based index for the permutation.

Since you asked about Java code for this, here is a piece of code which will determine the number of bits required, and which will also compute the concise representation for a given permutation.

import java.math.BigInteger;
class SO18757672 {
    int n;
    BigInteger[] fac;
    SO18757672(int n) {
        this.n = n;
        fac = new BigInteger[n + 1];
        fac[0] = BigInteger.ONE;
        for (int i = 1; i <= n; ++i)
            fac[i] = fac[i - 1].multiply(BigInteger.valueOf(i));
    }
    int bitsRequired() {
        return fac[n].subtract(BigInteger.ONE).bitLength();
    }
    BigInteger codeFor(int[] permutation) {
        if (permutation.length != n)
            throw new IllegalArgumentException("wrong length");
        BigInteger res = BigInteger.ZERO;
        for (int i = 0; i != n; ++i) {
            int pi = permutation[i];
            if (pi < 0 || pi > n - i)
                throw new IllegalArgumentException("invalid value");
            res = res.add(fac[n - 1 - i].multiply(BigInteger.valueOf(pi)));
            for (int j = i + 1; j != n; ++j) {
                if (permutation[j] == pi)
                    throw new IllegalArgumentException("duplicate value");
                if (permutation[j] > pi)
                    --permutation[j]; // We modify out input!
            }
        }
        return res;
    }
    boolean sanityChecks() {
        int[] p = new int[n];
        // smallest code is zero, for all elements in order
        for (int i = 0; i != n; ++i)
            p[i] = i;
        assert BigInteger.ZERO.equals(codeFor(p));
        // largest code is n!-1, for all elements in reverse order
        for (int i = 0; i != n; ++i)
            p[i] = n - 1 - i;
        assert fac[n].subtract(BigInteger.ONE).equals(codeFor(p));
        return true; // so we can use this in an assert call
    }
    public static void main(String[] args) {
        SO18757672 instance = new SO18757672(1024);
        System.out.println(instance.bitsRequired() + " bits required");
        assert instance.sanityChecks();
    }
}
MvG
  • 57,380
  • 22
  • 148
  • 276