Title: A connection between self-reference and Turing-completeness.

Turing Completeness

Let a programming language L be given. We say that a function f is L-computable if there is a program in L that computes f.

We say that a programming language L is Turing-complete if every total computable function is L-computable.

LOOP is an example of a programming language that can compute all primitive recursive functions. If one adds to LOOP an unbounded search operator or simply while loops, then the language becomes Turing-complete.

Notice that LOOP programs only compute numeric functions. Don't we sometimes consider computable functions that map from strings to strings? Well, all strings can be encoded by Gödel numbers so everything's ok, but it can get a little complicated to always have to encode and decode strings rather than just manipulating strings directly.

Random Access Machines

Let's consider other programming languages that operate solely on strings.

There are several programming languages for modeling random access machines. One such language is discussed in detail here. It is also defined in the notes on RAM programs here.

The random access machines modeled by this language can:
(1) assign a string to a register,
(2) delete the left most character of a string,
(3) append a character to the right end of a string, and
(4) execute conditional jmp instructions (i.e. conditional goto instructions) that allow you to jump to a previous or later instruction conditional on the left most character stored in a given register.

To show that this programming language is Turing-complete, it suffices to show that the four abilities are all we need to simulate any given Turing machine because each total computable function can be computed by some Turing machine.

Brief Explanation: Let a Turing machine be given. The current tape and its active cell can be represented by a string. The current state can be represented by a line number of the program's source code. We can scan across the string by removing left characters and appending them to the right of a temporary string. As we scan through, we eventually read the active cell and decide how to manipulate the tape and what line number (i.e. state) to jump to next.

Since we can simulate any Turing machine, this RAM programming language is Turing-complete.

Summary of Everything

In the previous section, we showed how to use string substitution to build self-referential programs. And, in the second section, we showed how to use self-referential programs and a universal program to simulate recursion. Next, we will show how to use self-referential programs and a universal program to simulate any given Turing machine.

Turing-completeness via Self-Reference

Consider the string substitution function Sub(u,v,x) that replaces each occurrence of u with v in x.

Theorem: If a programming language can (1) perform variable assignment, (2) compute Sub, and (3) has a universal program, then it is Turing-complete.

We will justify the theorem by showing that such a programming language can simulate all four abilities of the preceding RAM programs.

(1) Assigning a fixed string to a register or copying a string from one register to another can be simulated using variable assignment.

(2) The following code deletes the left most character of a string x. To simplify things, let's assume that x is a binary string, and y is the output.

var y = x;

y = Sub("0", "B0B", y);
y = Sub("1", "B1B", y);

y = Sub("BB", "", y);
y = Sub("B0", "", y);
y = Sub("B1", "", y);
y = Sub("B", "", y);

(3) The following code appends a character c to the right end of a string x. Let's assume that "A" and "B" do not occur in x, and y is the output.

var y = "AB";

y = Sub("A", x, y);
y = Sub("B", "c", y);

(4) In order to simulate conditional jmp instructions we need to use self-reference and a universal program which we denote by eval. We will break up the simulation into three parts.

(Part a) First, we need to compute the left most character of a string x. Let's assume that x is a binary string, and y is the output.

var y = x;

y = Sub("0", "0B", y);
y = Sub("1", "1B", y);

y = Sub("B0", "", y);
y = Sub("B1", "", y);
y = Sub("B", "", y);

(Part b) Next, we need to decide what to do next based on the left most character of x that we computed. Essentially, we need to simulate an if statement. Let's assume that we execute code block1 if it is 0 and we execute code block2 if it is 1.

var c = leftMostChar(x);

var temp = "*B*";
var case1 = Sub("B", "0", temp);
var case2 = Sub("B", "1", temp);
var ourCase = Sub("B", c, temp);

/* To handle special characters correctly, code must be given in hex */
var block1 = decode(hex1);
var block2 = decode(hex2);

/* Exactly one of the two substitutions does anything */
var code = Sub(case1, block1, ourCase);
code = Sub(case2, block2, code);

/* Evaluate either block1 or block 2 */

In the preceding, it's not possible for both substitutions to do something as long as the strings *0* and *1* don't occur in block1 and block2. Also, the decode function can be simulated by substituting hex codes with corresponding characters.

(Part c) Finally, we can simulate jumps to previous or later lines of code by using self-reference to compute the program's own source code and then using the universal program to execute the later or previous line of code.

Further Detail

How you implement (Part c) can largely influence your entire code structure so we will provide pseudo code for one valid high-level approach.

Consider a program in our RAM programming language with register variables v1, v2, ..., vm and lines of code c1, c2, ..., cn.

Consider the following code template:

/* We use self-reference to compute this template code */
var src = getThisCode();

/* We need to substitute for these blanks to get the current RAM program variable values */
var v1 = "***v1***";
var v2 = "***v2***";
var vm = "***vm***";

/* Current line to be executed */
var lineNumber = "***lineNum***";
var lineInstruction = "***instruction***";

/* Block of if statements to determine which line to goto next */
if(lineNumber == "c1"){ ... }
else if(lineNumber == "c2"){ ... }
else if(lineNumber == "c3"){ ... }
else if(lineNumber == "cn-1"){ ... }
else if(lineNumber == "cn"){ ... }

It's ok to have the block of if statements because we already know from (Part b) that we can simulate such blocks of if statements.

Within each if statement's code block, we use substitution to fill in the blanks in the template code src. Then, we evaluate the code. However, when lineNumber == "cn", the code block is different. In that case, we reached the end of the RAM program so we just evaluate the cn instruction and we are done.

Thank you for reading! :)

Back to Programs