One of the design goals of Befunge was to be hard to compile. However, it is quite easy to interpret. One can write an interpreter in a conventional language, say C. To translate a Befunge program to equivalent machine code, one can hard-code the Befunge code into the C interpreter, and compile the resulting C program to machine code. Or does "compile" mean something more restricted which excludes this translation?
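For concreteness, here is roughly what I have in mind. A minimal sketch, with Python standing in for C to keep it short, and only a handful of Befunge-93 commands implemented; the hard-coded `SOURCE` string is the Befunge program being "compiled":

```python
SOURCE = "25*.@"   # hard-coded Befunge program: push 2, push 5, multiply, print 10
WIDTH, HEIGHT = 80, 25

# The interpreter below is fixed; only SOURCE changes per Befunge program.
grid = [list((SOURCE if y == 0 else "").ljust(WIDTH)) for y in range(HEIGHT)]

def run():
    x = y = 0
    dx, dy = 1, 0
    stack = []
    pop = lambda: stack.pop() if stack else 0   # Befunge: popping an empty stack yields 0
    while True:
        c = grid[y][x]
        if c.isdigit():
            stack.append(int(c))
        elif c == '+':
            stack.append(pop() + pop())
        elif c == '*':
            stack.append(pop() * pop())
        elif c == '.':
            print(pop(), end=' ')
        elif c == '>': dx, dy = 1, 0
        elif c == '<': dx, dy = -1, 0
        elif c == '^': dx, dy = 0, -1
        elif c == 'v': dx, dy = 0, 1
        elif c == '@':
            return
        x, y = (x + dx) % WIDTH, (y + dy) % HEIGHT

run()   # prints 10
```

Compiling this file with its baked-in `SOURCE` (or the equivalent C program with a C compiler) yields machine code whose observable behaviour is that of the Befunge program.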

- That's still an interpreter. – SLaks Jan 05 '14 at 16:15
- Some people fiercely claim that only a certain narrow (rarely precisely defined) set of translations counts as "compiled". This being a matter of definition, it's hard to prove wrong, but the fact is, these definitions are rarely useful, in that they rarely imply any interesting properties that aren't also implied by broader definitions. – Jan 05 '14 at 16:16
- Any language that can be interpreted can be compiled. Whether it can produce code much more efficient than an interpreter is another matter. – Rob Jan 05 '14 at 16:16
- @SLaks: what I described is something which takes Befunge code as input and produces equivalent machine code as output. – Prateek Jan 05 '14 at 16:17
- Converting to C is compilation in itself. Compilers do their work before the code is run. Interpreters do their thing as the code is run. – Rob Jan 05 '14 at 16:18
- @Prateek: No; you're still interpreting the original Befunge code at runtime. – SLaks Jan 05 '14 at 16:19
- @SLaks You appear to confuse the Befunge interpreter written in C with the tool which combines the Befunge source code with the aforementioned interpreter's source code, yielding a second C program. The former is an interpreter, and is described as such even in the question. The latter is a compiler under any common definition of "compiler" that I can think of. – Jan 05 '14 at 16:23
3 Answers
> To translate a Befunge program to equivalent machine code, one can hard-code the Befunge code into the C interpreter, and compile the resulting C program to machine code.
Yes, sure. This can be applied to any interpreter, esoteric language or not, and under some definitions this can be called a compiler.
But that's not what is meant by "compilation" in the context of Befunge - and I'd argue that calling this a "compiler" is very much missing the point of compilation, which is to convert code in some (higher) language to semantically equivalent code in some other (lower) language. No such conversion is being done here.
Under this definition, Befunge is indeed a hard language to convert in such a way, since, given an instruction, it's hard to know at compile time what the next instruction will be.
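To make that concrete: even ignoring `p`, the static successor of a cell depends on the direction the instruction pointer happens to be travelling in at runtime. A toy illustration, assuming the standard 80x25 torus:

```python
# Every cell has four potential successors, one per travel direction;
# a naive compiler would have to emit a continuation for each, and a
# `p` write can invalidate any of them at runtime.
DIRS = {'>': (1, 0), '<': (-1, 0), '^': (0, -1), 'v': (0, 1)}

def successors(x, y, width=80, height=25):
    """Cells that might execute after (x, y), keyed by direction."""
    return {d: ((x + dx) % width, (y + dy) % height)
            for d, (dx, dy) in DIRS.items()}

print(successors(0, 0))
# {'>': (1, 0), '<': (79, 0), '^': (0, 24), 'v': (0, 1)}
```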

- So is "converting code in some (higher) language to semantically equivalent code in some other (lower) language" (your definition of compiling) a different notion from "taking Befunge source code as input and outputting equivalent machine code" (which is what my alleged compiler does)? Or do you disagree that my alleged compiler does that? I'm trying to figure out where we differ here. – Prateek Jan 10 '16 at 21:33
- @Prateek there's a difference between what something does and how it works. What does your compiler do? The same things traditional compilers do - takes input code and produces equivalent machine code. But *how* does it work? It doesn't actually convert any Befunge instructions to machine code. In fact, **it doesn't even look at the Befunge input** - it's basically `gcc -D`; a pure C compiler. So yes, under the "how it works" category, your compiler represents a very different compilation approach from, say, how gcc handles C. The "hard to compile" design goal was likely about this "how" part. – Oak Jan 11 '16 at 10:34
Befunge is impossible to really AOT compile due to `p`. In terms of JITs, it's a cakewalk compared to all those dynamic languages out there. I've worked on some fast implementations.

`marsh` gains its speed by being a threaded interpreter. To speed up instruction dispatch it creates 4 copies of the interpreter, one for each direction. I optimize bounds checking & lookup by storing the program in an 80x32 space instead of an 80x25 space.
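Schematically, the dispatch looks something like this (a simplified Python sketch rather than the real threaded C, with only a few commands shown; note the 32-row padding lets y wrap with a bit-mask, at the cost of changing behaviour for programs that rely on wrapping at row 25):

```python
WIDTH, HEIGHT = 80, 32                       # pad 25 rows to 32: y wraps via y & 31

STEPS = {'>': lambda x, y: ((x + 1) % WIDTH, y),
         '<': lambda x, y: ((x - 1) % WIDTH, y),
         'v': lambda x, y: (x, (y + 1) & 31),
         '^': lambda x, y: (x, (y - 1) & 31)}

def make_loop(step):
    """Build one dispatch loop specialised to a single travel direction."""
    def loop(grid, x, y, stack, pop):
        while True:
            c = grid[y][x]
            if c in STEPS:                   # direction change: hand over to
                x, y = STEPS[c](x, y)        # the loop for the new direction
                return LOOPS[c], x, y
            if c == '@':
                return None, x, y
            if c.isdigit():
                stack.append(int(c))
            elif c == '+':
                stack.append(pop() + pop())
            elif c == '*':
                stack.append(pop() * pop())
            elif c == '.':
                print(pop(), end=' ')
            x, y = step(x, y)                # no (dx, dy) pair consulted here
    return loop

LOOPS = {d: make_loop(s) for d, s in STEPS.items()}

def run(source):
    grid = [list((source if y == 0 else '').ljust(WIDTH)) for y in range(HEIGHT)]
    stack, x, y = [], 0, 0
    pop = lambda: stack.pop() if stack else 0
    loop = LOOPS['>']
    while loop is not None:
        loop, x, y = loop(grid, x, y, stack, pop)

run("25*.@")   # prints 10
```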
`bejit` grew out of my observation that the majority of program time is spent moving around. `bejit` records a trace as it interprets, and if the same location is ever hit in the same direction we jump to an internal bytecode format that the trace recorded. When `p` performs a write on program source that we've traced, we drop all traces and return to the interpreter. In practice this executes stuff like `mandel.bf` 3x faster. It also opens up peephole optimization, where the tracer can apply constant propagation. This is especially useful in Befunge, since constants are built up out of multiple instructions.
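Boiled down, the trace loop looks something like this (an illustrative Python sketch, not bejit's actual code; the real traces compile to an internal bytecode format rather than lists of closures):

```python
WIDTH, HEIGHT = 80, 25
DIRS = {'>': (1, 0), '<': (-1, 0), '^': (0, -1), 'v': (0, 1)}

def run(source):
    grid = [list((source if y == 0 else '').ljust(WIDTH)) for y in range(HEIGHT)]
    stack = []
    pop = lambda: stack.pop() if stack else 0
    traces = {}                    # (x, y, dx, dy) -> (recorded ops, exit state)
    x = y = 0
    dx, dy = 1, 0

    def step_op(c):
        """Execute one straight-line cell and return a replayable op for it."""
        if c.isdigit():
            op = lambda v=int(c): stack.append(v)
        elif c == '+':
            op = lambda: stack.append(pop() + pop())
        elif c == '*':
            op = lambda: stack.append(pop() * pop())
        elif c == '.':
            op = lambda: print(pop(), end=' ')
        else:
            op = lambda: None      # everything else treated as a no-op here
        op()
        return op

    while True:
        key = (x, y, dx, dy)
        if key in traces:          # same location hit in the same direction:
            ops, (x, y, dx, dy) = traces[key]
            for op in ops:         # replay the recorded trace, no dispatch
                op()
            continue
        entry, ops = key, []
        while True:                # interpret and record until the run ends
            c = grid[y][x]
            if c == '@':
                return
            if c in DIRS:          # a direction change ends this trace
                dx, dy = DIRS[c]
                x, y = (x + dx) % WIDTH, (y + dy) % HEIGHT
                if ops is not None:
                    traces[entry] = (ops, (x, y, dx, dy))
                break
            if c == 'p':           # write into traced source: drop everything
                yy, xx, v = pop(), pop(), pop()
                grid[yy % HEIGHT][xx % WIDTH] = chr(v)
                traces.clear()
                ops = None         # abandon the current recording too
            else:
                op = step_op(c)
                if ops is not None:
                    ops.append(op)
            x, y = (x + dx) % WIDTH, (y + dy) % HEIGHT

run(">25*.@")   # prints 10
```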
My Python implementations compile the whole program before executing, since a Python function's bytecode is immutable. This opens up the possibility of whole-program analysis.

`funge.py` traces Befunge instructions into CPython bytecode. It has to keep an int at the top of the stack to track stack height, since CPython doesn't handle stack underflow. I was originally hoping to create a generic Python bytecode optimizer, but I ended up realizing that it'd be more efficient to optimize in an intermediate format which lacked jump offsets. Besides that, the common advice that arrays are faster than linked lists doesn't apply as much in CPython, since its arrays are arrays of pointers, and a linked list just spreads those pointers out. So I created `funge2.py`. (`wfunge.py` is a port of `funge.py` in preparation for http://bugs.python.org/issue26647.)
`funge2.py` traces instructions into a control flow graph. Unfortunately we don't get the static stack adjustments that the JVM & CIL demand, so optimizations are a bit harder. `funge2.py` does constant folding, loop unrolling, and some stack depth tracking to reduce stack depth checks, and I'm in the process of adding more (jump-to-jump optimizations, smarter stack depth juggling, not-jump / jump-pop / dup-jump combining).

By the time funge2 gets to optimizing Befunge, it's a pretty simple IR:
- load const
- binop (+, -, *, /, %, >)
- not
- pop
- dup
- swap
- printint/printchar/printstr (the last for when constant folding makes these deterministic)
- getint/getchar
- readmem
- writemem
- jumprand
- jumpif
- exit
Which doesn't seem so hard to compile.
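For a taste of how tractable this IR is, here's a toy constant-folding pass over it (the tuple opcode spellings are invented for this sketch; funge2.py's real format differs, and division/modulo are left unfolded here since Befunge's divide-by-zero behaviour is its own can of worms):

```python
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
       '>': lambda a, b: int(a > b)}

def fold_constants(code):
    """Peephole pass: ('const', a) ('const', b) ('binop', op) -> ('const', r)."""
    out = []
    for ins in code:
        if (ins[0] == 'binop' and ins[1] in OPS and len(out) >= 2
                and out[-1][0] == 'const' and out[-2][0] == 'const'):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(('const', OPS[ins[1]](a, b)))
        else:
            out.append(ins)
    return out

# Befunge builds constants out of several instructions; `25*` traces to:
prog = [('const', 2), ('const', 5), ('binop', '*'), ('printint',)]
print(fold_constants(prog))   # [('const', 10), ('printint',)]
```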

Befunge is hard to compile due to the `p` and `g` commands. With these you can put and get instructions at runtime, i.e. write self-modifying code.

There is no way you can translate that directly to assembly, let alone binary code.

If you embed a Befunge program into the interpreter code and compile that, you are still compiling the interpreter, not the Befunge program...
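A tiny runnable illustration of the problem (Python, a one-row playfield, and the usual Befunge-93 reading where `p` pops y, x, v and stores value v at cell (x, y)): statically, the cell at x=10 holds `@`, but by the time the instruction pointer gets there it has been rewritten to `.` (ASCII 46), so no static translation of the original cell contents can be correct:

```python
SOURCE = "67*4+55+0p@@"      # 6*7+4 = 46 ('.'), x = 5+5, y = 0, then p

def run(src, width=80):
    row = list(src.ljust(width))
    stack, x = [], 0
    pop = lambda: stack.pop() if stack else 0
    while True:
        c = row[x]
        if c.isdigit():
            stack.append(int(c))
        elif c == '+':
            stack.append(pop() + pop())
        elif c == '*':
            stack.append(pop() * pop())
        elif c == '.':
            print(pop(), end=' ')
        elif c == 'p':                  # self-modification happens here
            y, xx, v = pop(), pop(), pop()
            row[xx] = chr(v)            # y is ignored: one row in this toy
        elif c == '@':
            return
        x = (x + 1) % width

run(SOURCE)   # prints 0: the '@' at x=10 was overwritten with '.'
```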

- The proposed compiler still takes Befunge source code as input and outputs equivalent machine code. I would assume that the thing being compiled is by definition the input to the compiler, which in this case is the Befunge source code, not the interpreter. – Prateek Jan 09 '16 at 18:57