There are already several questions about this here, but I'll ask anyway. This is a simple Brainfuck interpreter. I have implemented all the other commands, but I can't figure out how to implement loops. Can anyone help?

package com.lang.bfinterpreter;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;


import com.lang.exceptions.TapeSizeExceededException;


public class Interpreter {

    private Interpreter() {
        super();
    }

    private static String getCode(final String inputFile) throws IOException {
        String code = "";
        
        // store the entire code
        final BufferedReader br = new BufferedReader(new FileReader(inputFile));
        for (String line = br.readLine(); line != null; line = br.readLine()) {
            code += line;
        }
        br.close();

        return code;
    }
    
    public static void interpret(final String inputFile) throws IOException,TapeSizeExceededException,IndexOutOfBoundsException {
        // get the program as a string
        final String code = getCode(inputFile);

        // create the Turing tape (static size)
        Character[] tape = new Character[12000];
        Integer indexPointer = 0;
        for (int i = 0; i != 12000; i++) {
            switch (code.toCharArray()[i]) {
                case ',':
                    tape[indexPointer] = (char) System.in.read();
                    break;
                
                case '.':
                    System.out.println(tape[indexPointer]);
                    break;

                case '+':
                    tape[indexPointer]++;
                    break;

                case '-':
                    tape[indexPointer]--;
                    break;

                case '>':
                    if (indexPointer == 11999) {
                        throw new IndexOutOfBoundsException();
                    }
                    else {
                        indexPointer++;
                    }
                    break;

                case '<':
                    if (indexPointer == 0) {
                        throw new IndexOutOfBoundsException();
                    }
                    else {
                        indexPointer--;
                    }
                    break;
                    
                case '[':
                    // I have a feeling I'll need stack to store nested loops   
                    break;
                    
                case ']':
                    // I have a feeling I'll need stack to store nested loops   
                    break;
                    
                default:
                    break;
            }
        }

    } 
}

I have a feeling that I will need to use a Stack, but I just can't figure out how. I have written expression evaluators before... will this require the same logic?

justanotherguy

2 Answers

The most challenging part, I suppose, is finding the matching brackets. Once you know where the matching bracket is, you can check the value of tape[indexPointer] and, if a jump is needed, set i so that execution continues at the position after the matching bracket, which is rather easy to do.

Given an opening bracket at index i in code, you find its matching close bracket by scanning to the right of i. You start with a stack containing a single [, which is the [ at i. Every time you encounter a new [, you push it onto the stack. Every time you encounter a ], you pop a [ from the stack: the ] you just encountered matches the [ you popped. When you pop the last [ from the stack (i.e. when the stack becomes empty), you have found the close bracket that matches the open bracket at i.

In code, you don't even need a Stack. You can just use an int to encode how many elements are in the stack - increment it when you push, decrement it when you pop.

private static int findMatchingCloseBracketAfterOpenBracket(char[] code, int openBracketIndex) {
    // parameter validations omitted
    int stack = 1;
    for (int i = openBracketIndex + 1; i < code.length ; i++) {
        if (code[i] == '[') {
            stack++;
        } else if (code[i] == ']') {
            stack--;
        }
        if (stack == 0) {
            return i;
        }
    }
    return -1; // brackets not balanced!
}

To find the matching [ of a ], the idea is the same, except that you scan in the other direction and swap the push and pop actions.

private static int findMatchingOpenBracketBeforeCloseBracket(char[] code, int closeBracketIndex) {
    // parameter validations omitted
    int stack = 1;
    for (int i = closeBracketIndex - 1; i >= 0 ; i--) {
        if (code[i] == '[') {
            stack--;
        } else if (code[i] == ']') {
            stack++;
        }
        if (stack == 0) {
            return i;
        }
    }
    return -1; // brackets not balanced!
}

(Refactoring the code duplication here is left as an exercise for the reader)
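
As a rough sketch of how such a finder plugs into the interpreter's main loop, the following merges both scan directions into one helper and uses it from the `[` and `]` cases. The names `ScanningInterpreter`, `findMatch`, and `run` are hypothetical and not from the question; the sketch also assumes 8-bit wrapping cells, a 30000-cell tape, no input command, and output collected into a string instead of being printed.

```java
final class ScanningInterpreter {

    // Hypothetical helper: starting on a bracket at `start`, walk in
    // direction `step` (+1 or -1) and return the index of its partner,
    // or -1 if the brackets are not balanced.
    static int findMatch(char[] code, int start, int step) {
        int depth = 0; // plays the role of the "stack" counter
        for (int i = start; i >= 0 && i < code.length; i += step) {
            if (code[i] == '[') {
                depth++;
            } else if (code[i] == ']') {
                depth--;
            }
            if (depth == 0) {
                return i;
            }
        }
        return -1;
    }

    // Runs a program that uses only + - < > . [ ] and returns its output.
    static String run(String program) {
        char[] code = program.toCharArray(); // converted once, up front
        int[] tape = new int[30000];         // assumption: 30000 cells
        int ptr = 0;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < code.length; i++) {
            switch (code[i]) {
                case '+': tape[ptr] = (tape[ptr] + 1) & 0xFF; break;
                case '-': tape[ptr] = (tape[ptr] - 1) & 0xFF; break;
                case '>': ptr++; break;
                case '<': ptr--; break;
                case '.': out.append((char) tape[ptr]); break;
                case '[':
                    if (tape[ptr] == 0) {
                        int match = findMatch(code, i, 1);
                        if (match < 0) {
                            throw new IllegalStateException("Unmatched '[' at " + i);
                        }
                        i = match; // the for loop's i++ steps past the ']'
                    }
                    break;
                case ']':
                    if (tape[ptr] != 0) {
                        int match = findMatch(code, i, -1);
                        if (match < 0) {
                            throw new IllegalStateException("Unmatched ']' at " + i);
                        }
                        i = match; // the for loop's i++ re-enters the body
                    }
                    break;
                default:
                    break; // every other character is a comment
            }
        }
        return out.toString();
    }
}
```

For example, `ScanningInterpreter.run("++++++++[>++++++++<-]>+.")` builds 8 * 8 + 1 = 65 in the second cell and returns "A". Note that this approach rescans the program text every time a jump is taken.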

Sweeper

Updated: here's example code. Before the main execution loop you scan the whole program for matches and store them in an array:

    Stack<Integer> stack = new Stack<>();
    int[] targets = new int[code.length];
    for (int i = 0, j; i < code.length; i++) {
        if (code[i] == '[') {
            stack.push(i);
        } else if (code[i] == ']') {
            if (stack.empty()) {
                System.err.println("Unmatched ']' at byte " + (i + 1) + ".");
                System.exit(1);
            } else {
                j = stack.pop();
                targets[i]=j;
                targets[j]=i;
            }
        }
    }
    if (!stack.empty()) {
        System.err.println("Unmatched '[' at byte " + (stack.peek() + 1) + ".");
        System.exit(1);
    }

And then inside the main execution loop you just jump to the precomputed location:

            case '[':
                if (tape[indexPointer] == 0) {
                    i = targets[i];
                }
                break;
                
            case ']':
                if (tape[indexPointer] != 0) {
                    i = targets[i];
                }
                break;

(Note, we jump to the matching bracket, but the for loop will still autoincrement i as usual, so the next instruction that gets executed is the one after the matching bracket, as it should be.)

This is much faster than having to scan through a bunch of code looking for the matching bracket every time a bracket gets executed.
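
Putting the two fragments together, a self-contained sketch might look like the following. The class and method names (`JumpTableInterpreter`, `buildTargets`, `run`) are hypothetical. It also makes a few assumptions that are not from the code above: `ArrayDeque` is swapped in for `Stack`, exceptions replace `System.exit`, cells wrap at 8 bits, input comes from a string, and `,` stores 0 at end of input (conventions for end of input vary between implementations).

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class JumpTableInterpreter {

    // One pass over the program: targets[i] holds the index of the
    // bracket that matches the bracket at i.
    static int[] buildTargets(char[] code) {
        int[] targets = new int[code.length];
        Deque<Integer> open = new ArrayDeque<>();
        for (int i = 0; i < code.length; i++) {
            if (code[i] == '[') {
                open.push(i);
            } else if (code[i] == ']') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException(
                            "Unmatched ']' at byte " + (i + 1) + ".");
                }
                int j = open.pop();
                targets[i] = j;
                targets[j] = i;
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException(
                    "Unmatched '[' at byte " + (open.peek() + 1) + ".");
        }
        return targets;
    }

    // Runs `program` with `input` as its input stream; returns the output.
    static String run(String program, String input) {
        char[] code = program.toCharArray(); // converted exactly once
        int[] targets = buildTargets(code);
        int[] tape = new int[30000];
        int ptr = 0;
        int inPos = 0;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < code.length; i++) {
            switch (code[i]) {
                case '+': tape[ptr] = (tape[ptr] + 1) & 0xFF; break;
                case '-': tape[ptr] = (tape[ptr] - 1) & 0xFF; break;
                case '>': ptr++; break;
                case '<': ptr--; break;
                case '.': out.append((char) tape[ptr]); break;
                case ',':
                    // Assumption: end of input is reported as 0.
                    tape[ptr] = inPos < input.length()
                            ? input.charAt(inPos++) & 0xFF
                            : 0;
                    break;
                case '[':
                    if (tape[ptr] == 0) {
                        i = targets[i];
                    }
                    break;
                case ']':
                    if (tape[ptr] != 0) {
                        i = targets[i];
                    }
                    break;
                default:
                    break;
            }
        }
        return out.toString();
    }
}
```

With these assumptions, `JumpTableInterpreter.run(",[.,]", "hi")` echoes its input and returns "hi".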

I also notice: you probably want to convert code into an array once, not once per executed instruction. You probably want to run your "for" loop while i < codelength rather than while i != 12000 (otherwise any program shorter than 12000 characters runs off the end of the array), and you also probably want to compute codelength only once.

Definitely '.' should output only one character, not add a newline as well. Also, a tape of 12000 cells is too small. 30000 is the minimum, and much larger is better.
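
A small sketch of those two points, with hypothetical names (`TapeAndOutput`, `newTape`, `emit`): the tape is a 30000-element primitive `byte[]`, and output goes through `PrintStream.write`, which emits a single byte with no newline. A side benefit of a primitive array is that its cells start at zero, whereas a `Character[]` starts out filled with `null`.

```java
import java.io.PrintStream;

final class TapeAndOutput {

    static final int TAPE_SIZE = 30000; // the minimum suggested above

    // Primitive arrays are zero-initialised, so every cell starts at 0.
    static byte[] newTape() {
        return new byte[TAPE_SIZE];
    }

    // Writes exactly one byte for the '.' command, without a newline.
    static void emit(PrintStream out, byte cell) {
        out.write(cell & 0xFF);
        out.flush();
    }

    public static void main(String[] args) {
        byte[] tape = newTape();
        tape[0] = (byte) 'H';
        tape[1] = (byte) 'i';
        emit(System.out, tape[0]);
        emit(System.out, tape[1]);
    }
}
```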

Good luck!

  • Hmm... kinda new to Java. Is using a Vector or a List instead of the Array a good idea? Also, how do you create variable arrays in java? In C++, I would do `char* array` – justanotherguy Apr 29 '22 at 05:59
  • Also not super versed in Java. But I think arrays are faster, and I think you can just do: char[] code2 = code.toCharArray(); int[] targets = new int[code2.length]; – Daniel Cristofani Apr 29 '22 at 07:54
  • You were already using toCharArray to make code into an array, you were just doing it once per executed instruction and then discarding the resulting arrays. – Daniel Cristofani Apr 29 '22 at 07:57
  • I think the Java compiler, unless it optimizes worse than good ol' LLVM/GCC/MSVC, should cache the result, essentially calling toCharArray only once. – justanotherguy Apr 29 '22 at 08:02
  • I finally got around to testing this, and no, at least for OpenJDK, it converts it repeatedly, producing a bad slowdown. – Daniel Cristofani May 02 '22 at 09:39
  • Typical of OpenJDK. Try using Microsoft HotSpot or Amazon Corretto – justanotherguy May 02 '22 at 15:39
  • Googling "Microsoft Hotspot Java" leads to "Microsoft Build of OpenJDK". Amazon Corretto gives the same huge slowdown. (Running a drawn game of tictactoe.b: run toCharArray once, 573 milliseconds. Run it once per command, computing the length only once: 28136 milliseconds. Run it twice per command (once to get the length for the loop condition and once to get the character): 55228 milliseconds.) – Daniel Cristofani May 04 '22 at 12:28
  • Then, poking around more, I realized it's even better to run it zero times: pull code straight into a byte array with readAllBytes. – Daniel Cristofani May 04 '22 at 12:34
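
The last comment does not say which readAllBytes is meant; one way to do it, assuming `java.nio.file.Files.readAllBytes`, is sketched below. `ProgramLoader` and `load` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ProgramLoader {

    // Reads the whole program file into a byte array in one call, so no
    // String is built and toCharArray is never needed.
    static byte[] load(Path file) throws IOException {
        return Files.readAllBytes(file);
    }
}
```

The interpreter loop can then index the returned array directly, e.g. `byte[] code = ProgramLoader.load(Paths.get(inputFile));` with `java.nio.file.Paths` imported.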