Background
A unary encoding of the input uses an alphabet of size 1: think tally marks. If the input is the number a, you need O(a) symbols.
A binary encoding uses an alphabet of size 2: you get 0s and 1s. If the number is a, you need O(log_2 a) bits.
A trinary encoding uses an alphabet of size 3: you get 0s, 1s, and 2s. If the number is a, you need O(log_3 a) digits.
In general, a k-ary encoding uses an alphabet of size k: you get 0s, 1s, 2s, ..., and (k-1)s. If the number is a, you need O(log_k a) digits.
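To make those growth rates concrete, here is a small Python sketch (the helper name digits_needed is just illustrative, not from any library) that counts how many base-k symbols a number takes, treating k = 1 as unary:

```python
def digits_needed(a: int, k: int) -> int:
    """Number of base-k symbols needed to write the positive integer a."""
    if k == 1:
        return a              # unary: one tally mark per unit of a
    n = 0
    while a > 0:              # repeatedly strip the least significant base-k digit
        a //= k
        n += 1
    return n

a = 1_000_000
for k in (1, 2, 3, 10):
    print(k, digits_needed(a, k))
# 1 -> 1000000 symbols, 2 -> 20 bits, 3 -> 13 digits, 10 -> 7 digits
```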
What does this have to do with complexity?
As you are aware, we ignore multiplicative constants inside big-oh notation. n, 2n, 3n, etc., are all O(n).
The same holds for logarithms. log_2 n, 2 log_2 n, 3 log_2 n, etc., are all O(log_2 n).
The key observation here is that the ratio log_k1 n / log_k2 n is a constant, no matter what k1 and k2 are... as long as they are both greater than 1. This is just the change-of-base identity: log_k1 n = log_k2 n / log_k2 k1, and the divisor log_k2 k1 is a constant that does not depend on n. That means log_k1 n = O(log_k2 n) for all k1, k2 > 1.
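A quick numerical sanity check of that constant ratio (just an illustration, not part of the argument):

```python
import math

# log_2(n) / log_10(n) is the same constant for every n > 1:
for n in (10, 1_000, 10**9, 10**18):
    print(n, math.log(n, 2) / math.log(n, 10))
# each line prints ~3.3219..., i.e. log_2(10), regardless of n
```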
This is important when comparing algorithms. As long as you use an "efficient" encoding (i.e., not a unary encoding), it doesn't matter what base you use: the encoding of a number a simply has size O(lg a), without specifying the base. This lets us compare the runtimes of algorithms without worrying about the exact encoding used.
So n = b (which implies a unary encoding) is typically never used. Binary encoding is simplest, and it differs from any other efficient encoding by at most a constant factor, so we usually just assume binary encoding.
That means we almost always take n = lg a + lg b as the input size, not n = a + b. A unary encoding is the only one under which the input size grows linearly with the values of a and b; under any efficient encoding, the values grow exponentially as a function of the input size.
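To see the gap between the two size measures in a quick snippet (the numbers are chosen arbitrarily, purely for illustration):

```python
a, b = 10**9, 10**12
binary_size = a.bit_length() + b.bit_length()  # n = lg a + lg b (the usual convention)
unary_size = a + b                             # n = a + b (unary encoding)
print(binary_size)  # 70
print(unary_size)   # 1001000000000
```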
One area, though, where unary encodings are used is in distinguishing between strong NP-completeness and weak NP-completeness. Without getting into the theory: if a problem is NP-complete, we don't expect any algorithm for it to have a polynomial running time, that is, one bounded by O(n^k) for some constant k, when using an efficient encoding.
But some of these problems do become polynomial if we allow a unary encoding. If a problem that is otherwise NP-complete becomes polynomial when using a unary encoding, we call it weakly NP-complete. It's still slow, but it is in some sense "faster" than a strongly NP-complete problem, where the magnitude of the numbers doesn't matter at all.
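As a concrete illustration, Subset Sum is the textbook weakly NP-complete problem: the classic dynamic program below runs in O(n * T) time, where T is the target value. That is polynomial if T is written in unary, but exponential in the number of bits of T under a binary encoding. This is a minimal sketch of the standard DP, not a tuned implementation:

```python
def subset_sum(values, target):
    """Pseudo-polynomial DP for Subset Sum: O(len(values) * target) time."""
    reachable = [True] + [False] * target    # reachable[s]: can some subset sum to s?
    for v in values:
        for s in range(target, v - 1, -1):   # go downward so each value is used at most once
            if reachable[s - v]:
                reachable[s] = True
    return reachable[target]

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True  (3 + 4 + 2)
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```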