Let's say I'm writing a virtual machine. I read the program data into an array of bytes. Now I need to loop through those bytes (each instruction is two bytes) and instantiate a small class representing each instruction and its arguments.

What would be a fast parsing approach? Here are the two ways I've thought of:

  1. Branching on each bit from left to right until I have narrowed it down to a particular opcode. This would be like a binary search.
  2. Inspecting some programs to come up with a list of opcodes ordered by frequency of use, and then checking for the full opcode in that order.

Note: I will be using bit shifting and masking in C for these checks, not regexes, string comparisons, or anything high-level like that.

Seki
Josh Pearce
  • Often the opcodes have a nice pattern of several small "fields". For example, a field identifying a particular "group" of opcodes that have the same structure of fields, followed by fields following the structure specific to that group (indicating the specific operation and operands etc). There might be sub-groups too. Usually you can use that to make a few nested `switch`es without getting a huge overload of cases. Do you have a link to a description of this VM? – harold Jun 14 '13 at 11:42
  • AVR. Halfway down the page, Izotech lays them out nicely. http://www.avrfreaks.net/index.php?name=PNphpBB2&file=printview&t=30020&start=0 – Josh Pearce Jun 14 '13 at 11:46
  • Ok, that's not too bad, "decoding by fields" should work nicely on that – harold Jun 14 '13 at 11:48
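The "decoding by fields" idea from the comments can be sketched roughly as below. This is a minimal sketch, not a full AVR decoder: the `Op`, `Insn`, `fetch` and `decode` names are hypothetical, only four instructions are handled, and the bit layouts in the comments are assumed from the AVR instruction set manual, so check them against it before relying on them. It also assumes the two bytes of each instruction are stored low byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded-instruction record; all names are illustrative. */
typedef enum { OP_UNKNOWN, OP_NOP, OP_ADD, OP_LDI, OP_RJMP } Op;

typedef struct {
    Op      op;
    uint8_t rd;  /* destination register number */
    uint8_t rr;  /* source register number */
    int16_t k;   /* immediate value or signed jump offset */
} Insn;

/* Combine two program bytes into one 16-bit word (assumed low byte first). */
uint16_t fetch(const uint8_t *code, size_t i)
{
    return (uint16_t)(code[i] | (code[i + 1] << 8));
}

/* Switch on the top 4 bits (the "group"), then on sub-fields in the group. */
Insn decode(uint16_t w)
{
    Insn in = { OP_UNKNOWN, 0, 0, 0 };

    switch (w >> 12) {
    case 0x0:
        if (w == 0x0000) {                      /* 0000 0000 0000 0000 */
            in.op = OP_NOP;
        } else if ((w & 0x0C00) == 0x0C00) {    /* 0000 11rd dddd rrrr */
            in.op = OP_ADD;
            in.rd = (uint8_t)((w >> 4) & 0x1F);
            in.rr = (uint8_t)(((w >> 5) & 0x10) | (w & 0x0F));
        }
        break;
    case 0xC:                                   /* 1100 kkkk kkkk kkkk */
        in.op = OP_RJMP;
        in.k  = (int16_t)(w & 0x0FFF);
        if (in.k & 0x0800)
            in.k -= 0x1000;                     /* sign-extend 12 bits */
        break;
    case 0xE:                                   /* 1110 KKKK dddd KKKK */
        in.op = OP_LDI;
        in.rd = (uint8_t)(16 + ((w >> 4) & 0x0F));
        in.k  = (int16_t)(((w >> 4) & 0xF0) | (w & 0x0F));
        break;
    default:                                    /* groups not handled here */
        break;
    }
    return in;
}
```

Calling `decode(fetch(code, i))` for `i = 0, 2, 4, ...` walks the byte array one instruction at a time. The outer `switch` typically compiles to a jump table, so the cost per instruction stays roughly constant without needing a frequency-ordered chain of comparisons.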

1 Answer

You don't need to parse anything. If this is in C, make a table of function pointers with 256 entries, one for each possible byte value, and call the function at the index given by the first byte. If the second byte is significant, a switch statement inside that function can handle it. This is how the original Visual Basic interpreter (versions 1-6) worked.
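A minimal sketch of such a table, assuming a made-up toy instruction set in which the first byte is the opcode and the second byte is an operand. The `Vm` struct, the handler names, and the opcode values `0x01`, `0x02` and `0xFF` are all hypothetical and are not AVR encodings:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VM state for illustration only. */
typedef struct {
    uint32_t acc;     /* accumulator */
    size_t   pc;      /* index of the next instruction byte */
    int      halted;  /* nonzero once the VM should stop */
} Vm;

typedef void (*Handler)(Vm *vm, uint8_t operand);

static void op_illegal(Vm *vm, uint8_t operand) { (void)operand; vm->halted = 1; }
static void op_load(Vm *vm, uint8_t operand)    { vm->acc = operand; }
static void op_add(Vm *vm, uint8_t operand)     { vm->acc += operand; }
static void op_halt(Vm *vm, uint8_t operand)    { (void)operand; vm->halted = 1; }

/* One entry per possible value of the first byte. */
static Handler table[256];

void init_table(void)
{
    /* Unassigned opcodes fall back to op_illegal, so every slot is callable. */
    for (size_t i = 0; i < 256; i++)
        table[i] = op_illegal;
    table[0x01] = op_load;
    table[0x02] = op_add;
    table[0xFF] = op_halt;
}

/* Dispatch two-byte instructions until halted or out of complete instructions. */
void run(Vm *vm, const uint8_t *code, size_t len)
{
    while (!vm->halted && vm->pc + 1 < len) {
        uint8_t opcode  = code[vm->pc];
        uint8_t operand = code[vm->pc + 1];
        vm->pc += 2;
        table[opcode](vm, operand);  /* indexed call, no comparisons */
    }
}
```

Here the second byte is simply passed to the handler; a handler that needs to distinguish sub-operations could `switch` on it instead. Filling every slot with a default handler avoids calling through a null pointer on an unrecognized opcode.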

Tyler Durden