Can a Turing machine be constructed having only two tape symbols?

Question

A Turing machine M containing any number of tape symbols can be simulated by one M' containing just three tape symbols: {0, 1, B} (B = Blank).

Can M be simulated by an M" that has just two tape symbols, say {1, B}?

B is the blank symbol that is required to describe any Turing machine (since the TM has an infinitely long tape, beyond a certain point it is entirely filled with blank symbols). — AnkurVj, Jan 27 '11 at 18:56
I asked this question again while requesting a more specific answer based on scientific literature. It also contains an attempt at describing a conversion in more detail: https://stackoverflow.com/questions/56990710/encoding-any-turing-machine-to-one-with-2-symbols — Cerno, Jul 11 '19 at 13:56

score 7 · Accepted Answer · answered Jan 27 '11 at 19:28

The first step, getting from any TM to a TM that uses just ones and zeros (plus the blank), is not as hard as you might think but not as easy as everyone else is saying. The idea is to develop a fixed-length binary encoding for each symbol in the alphabet. You then update the finite-state control so that at each step the TM scans the appropriate number of bits to read the current symbol, decides which symbol to write and which way to move, writes the binary representation of the new symbol, moves to the neighboring block of bits, and repeats. This can be done by having a HUGE finite-state control, and I'll leave the details to the reader since it's really pedantic to go over how it works. :-) The one detail to note is that in this construction you represent the blank symbol as a sequence of blanks with the same length as the binary codes you invented.
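To make the fixed-length encoding concrete, here is a minimal Python sketch. It only shows how tape contents are re-encoded, not the enlarged finite-state control. The names `build_codes`, `encode_tape`, and `decode_tape` are made up for illustration, and the assignment of codes to symbols is arbitrary.

```python
BLANK = "B"

def build_codes(alphabet):
    """Give every non-blank symbol a fixed-width binary code.

    The blank is mapped to a run of blanks of the same width, so
    unvisited tape still decodes to blanks.
    """
    symbols = [s for s in alphabet if s != BLANK]
    # Smallest width that can distinguish all non-blank symbols (at least 1).
    width = max(1, (len(symbols) - 1).bit_length())
    codes = {s: format(i, f"0{width}b") for i, s in enumerate(symbols)}
    codes[BLANK] = BLANK * width
    return codes, width

def encode_tape(tape, codes):
    """Replace each original symbol with its fixed-width code."""
    return "".join(codes[s] for s in tape)

def decode_tape(encoded, codes, width):
    """Read the encoded tape back in blocks of `width` cells."""
    inverse = {code: sym for sym, code in codes.items()}
    return [inverse[encoded[i:i + width]] for i in range(0, len(encoded), width)]

codes, width = build_codes(["a", "b", "c", "B"])
print(encode_tape(["a", "c", "B", "b"], codes))  # 0010BB01
```

Each step of the original machine then corresponds to reading `width` cells, rewriting them, and moving `width` cells left or right, which is where the large finite-state control comes from.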

To implement a TM using just 1 and B you use a similar trick. First, reduce the TM to one that uses just 1, 0, and B. We'll again replace the symbol set with a smaller one, but we have to be a bit more clever about how we do so because the tape has infinitely many blanks on it. We will use the following encoding:

11 encodes the symbol 1
1B encodes the symbol 0
BB encodes a blank

As we run the TM using this encoding scheme, if we ever walk past the end of the previously-visited tape, we will encounter infinitely many blank symbols, which fortunately correspond to our encoding of the blank symbol.
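Here is a small Python sketch of that pair encoding, just to show that it round-trips and that untouched tape decodes to blanks. The helper names `to_two_symbols` and `from_two_symbols` are invented for this example, and the sketch covers only the tape contents, not the simulating machine.

```python
# Pair encoding from the answer: each 3-symbol cell becomes two 2-symbol cells.
PAIR = {"1": "11", "0": "1B", "B": "BB"}
UNPAIR = {code: sym for sym, code in PAIR.items()}

def to_two_symbols(tape):
    """Encode a tape over {0, 1, B} as a tape over {1, B}."""
    return "".join(PAIR[cell] for cell in tape)

def from_two_symbols(cells):
    """Decode a {1, B} tape of even length back to {0, 1, B}.

    The pair "B1" is never produced by the encoding, so it raises KeyError.
    """
    return "".join(UNPAIR[cells[i:i + 2]] for i in range(0, len(cells), 2))

print(to_two_symbols("10B1"))               # 111BBB11
print(from_two_symbols("111B" + "BBBB"))    # 10BB
```

The second call shows the point made above: blank cells past the written region pair up as BB and decode to blanks of the original machine.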

The only challenge is how to encode the input to this new TM so that we can convert it into the format above. Since blanks can't appear in the TM input, the only symbol left for the input is 1, so the input must be encoded in unary. We then stipulate that for any binary string w given to the old machine, the input to the new machine should be the unary encoding of the number whose binary representation is 1w. (The leading 1 keeps any leading zeros of w from being lost.) We then have the first step of the new machine be to convert from the unary encoding to the pair encoding above. This can be done with just two symbols, but the details are really hard and again I'm going to punt on them. You can work out the details if you want.
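As a sanity check on the input convention, here is a short Python sketch of the mapping between a binary string w and the unary encoding of 1w. It is ordinary Python, not a two-symbol TM, so it says nothing about how hard the on-tape conversion is. The function names `to_unary` and `from_unary` are hypothetical.

```python
def to_unary(w):
    """Map a binary string w to the unary encoding of the number 1w."""
    n = int("1" + w, 2)   # prefix a 1 so leading zeros of w survive
    return "1" * n

def from_unary(u):
    """Recover w from a unary string of at least one 1."""
    # bin(n) looks like '0b1...'; drop '0b' and the prefixed 1.
    return bin(len(u))[3:]

print(len(to_unary("010")))    # 10, since 1010 in binary is 10
print(from_unary("1" * 10))    # 010
```

Note that the unary input is exponentially longer than w (between 2^|w| and 2^(|w|+1) - 1 ones), which is fine for computability but matters if you care about running time.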

Hope this helps!

I have a general question: in the theoretical CS community, which kinds of encoding may be done by hand, and which must be done within the TM? You say that the machine must convert from unary to our special binary, but we allow the conversion from the original binary to unary to be done by the operator. So I assume manual conversions are allowed up to a degree. But if we allow any arbitrarily complex conversion to be done by the operator, they could compute the whole TM program by hand and call that "input conversion", so there must be a threshold. Is there a rule of thumb about where it lies? — Cerno, Jul 11 '19 at 11:47
@Cerno There's a number of details to consider here. For example, suppose you want to make a two-symbol TM, where one symbol is a blank. Since most definitions of a TM require that the input string cannot contain the blank symbol, the only "legal" way to start up the machine would be to write the input in unary. From there, the question becomes "I have a machine where the only inputs I can provide must be written in unary, so how do I choose to encode more complex objects?" That would then lead you to design an encoding scheme that fits your needs. (continued...) — templatetypedef, Jul 11 '19 at 16:39
@Cerno So in that sense, it's less about "the operator has to do some computation in advance" and more about "in defining the expected behavior of the TM, we'd have to work out what an input represents and what output we'd like to see." We'd need to say, for each possible input, whether that input should be accepted or not accepted. It helps to try to think about this from the perspective of languages. What set of strings would you like the TM to accept? If you try making encodings of the form "TMs that halt go to even-length strings and others go to odd-length strings", (continued...) — templatetypedef, Jul 11 '19 at 16:41
@Cerno ... then you'll find that your set-builder notation starts looking a bit off, or that you'll have trouble pinning things down. Hope this helps! — templatetypedef, Jul 11 '19 at 16:42

score 0 · Answer 2 · Yochai Timmer · 2011-01-27T19:10:31.870

The first one is easy: think of how a computer represents everything in binary.

In the first one you can encode each symbol into a 0,1 representation.

In the second one, you can do one of two things:

Think of the B as 0. It doesn't matter what you call it, and then you have a 0,1 machine and can encode whatever you want.
Encode the symbols as runs of 1's, separated by a B. The Nth symbol is represented by N 1's.
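A rough Python sketch of the second option, assuming symbols are numbered from 1 in the order given. The names `encode_runs` and `decode_runs` are made up, and the sketch does not address how the original machine's own blank symbol would be represented, which the answer leaves open.

```python
def encode_runs(tape, alphabet):
    """Encode each symbol as a run of 1's (Nth symbol -> N ones), joined by B."""
    index = {sym: i + 1 for i, sym in enumerate(alphabet)}
    return "B".join("1" * index[sym] for sym in tape)

def decode_runs(encoded, alphabet):
    """Split on B and map each run length back to a symbol."""
    if not encoded:
        return []
    return [alphabet[len(run) - 1] for run in encoded.split("B")]

alphabet = ["a", "b", "c"]
print(encode_runs("cab", alphabet))           # 111B1B11
print(decode_runs("111B1B11", alphabet))      # ['c', 'a', 'b']
```

Unlike a fixed-length code, the runs have different lengths, so overwriting one symbol with another may require shifting the rest of the tape.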

edited Jan 27 '11 at 19:10

answered Jan 27 '11 at 18:53

Yochai Timmer


Yes, the first one is not a problem and is easy to prove formally. If there are 2^k tape symbols in M, then M' can use k cells to store each symbol of M's tape alphabet. The blank can also be encoded in a similar manner. The problem is with reducing the tape symbols to two. – AnkurVj Jan 27 '11 at 19:05
