
I have a problem parsing Lua bytecode generated with LuaJ. Something goes wrong between the instruction count and the constant count: it looks as if a byte is missing. I'm using LuaJ 2.0.3.


Here is a hexdump that shows what I mean: hexdump

The bytecode was generated using:

string.dump(function() return "athin" end)



The constant count shows 250 constants, but there should be only one. If there were one more byte between the constant count and the instruction list, it would work perfectly:

The constant count would be 1, and the type of the first constant 4 (string), the string would have a length of 6, including a null at the end.


Why is that not working? Why is there a byte missing? What do I have to do to fix this?

ardera
  • What is your question? – dualed Jun 01 '14 at 18:13
  • How is it possible that there are 250 constants? If it is like I said, it should be 00 00 00 01 for the constant counter but then, there would be 11 bytes for the instructions, and that doesn't work. How do I get this working? Why is there 1 byte missing? – ardera Jun 02 '14 at 10:38

1 Answer


Note: I posted this on the CC forums here first.

You are, in fact, missing a 0x00 byte. For the "Instructions", you have 00 00 00 01 01 00 00 1E 00 00 1E 00

Looking at A No-Frills Introduction to Lua 5.1 VM Instructions, that translates to:

LOADK 0 0 -- Load constant at index 0 into register number 0.
RETURN 0 2 -- Return 1 value, starting at register number 0.
MOVE 120 0 -- Copy the value of register number 120 into register number 0.

That last one doesn't make any sense. Why would the bytecode generator insert such a ridiculous instruction that will never be executed?

If you add one 0x00 byte to the last instruction, it reads as 00 00 00 01 01 00 00 1E 00 00 00 1E.

That translates to:

LOADK 0 0 -- Load constant at index 0 into register number 0.
RETURN 0 2 -- Return 1 value, starting at register number 0.
RETURN 0 0 -- Return all values from register number 0 to the top of the stack.

If you read the PDF, you will find that the bytecode generator always adds a return statement to the end of the bytecode, even if there's already an explicit return statement in the Lua source. Therefore, this disassembly makes sense.
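The two disassemblies above can be reproduced with a short decoder. This is only a minimal sketch, assuming the standard Lua 5.1 instruction layout (opcode in the low 6 bits, then A in 8 bits, C in 9 bits, B in 9 bits) and big-endian byte order as in the dump. The `decode` function and the `OPNAMES` table are names made up for this illustration, and only the three opcodes seen here are named.

```lua
-- Names for the few Lua 5.1 opcodes that appear in this chunk (illustrative subset).
local OPNAMES = { [0] = "MOVE", [1] = "LOADK", [30] = "RETURN" }

-- Decode one 32-bit Lua 5.1 instruction given as four big-endian bytes.
-- Uses plain arithmetic so it runs on Lua 5.1, which has no bitwise operators.
local function decode(b1, b2, b3, b4)
  local word = ((b1 * 256 + b2) * 256 + b3) * 256 + b4
  local op = word % 64                          -- bits 0-5
  local a  = math.floor(word / 64) % 256        -- bits 6-13
  local c  = math.floor(word / 16384) % 512     -- bits 14-22
  local b  = math.floor(word / 8388608) % 512   -- bits 23-31
  local bx = math.floor(word / 16384)           -- bits 14-31 (overlaps C and B)
  return OPNAMES[op] or ("OP_" .. op), a, b, c, bx
end

-- Third instruction as posted, then with the extra 0x00 byte inserted.
print(decode(0x00, 0x00, 0x1E, 0x00))  -- name, A, B, C, Bx
print(decode(0x00, 0x00, 0x00, 0x1E))
```

For LOADK the relevant operands are A and Bx; for MOVE and RETURN they are A and B.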

Anyway, if you add an extra 0x00 byte there, it shifts the rest of the bytecode over so it makes sense, like you said. It's just that the missing 0x00 byte isn't between "Instructions" and "Number of Constants", it's part of an instruction.
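To illustrate what a correctly aligned constant list looks like to a parser, here is a minimal sketch. The byte string is not taken from the actual dump; it is built from the values described in the question (count 1, type 4, length 6 including the trailing null, string "athin"). It also assumes that both `int` and `size_t` are 4 bytes and big-endian, which depends on the chunk header. `read_u32` and `read_constants` are hypothetical helper names.

```lua
-- Read a big-endian unsigned 32-bit integer from string s at 1-based position pos.
-- Returns the value and the position just past it.
local function read_u32(s, pos)
  local b1, b2, b3, b4 = s:byte(pos, pos + 3)
  return ((b1 * 256 + b2) * 256 + b3) * 256 + b4, pos + 4
end

-- Parse a constant list: a count, then a type byte plus payload per constant.
-- Only string constants (type 4) are handled in this sketch.
local function read_constants(s, pos)
  local count
  count, pos = read_u32(s, pos)
  local constants = {}
  for i = 1, count do
    local ctype = s:byte(pos)
    pos = pos + 1
    assert(ctype == 4, "only string constants are handled here")
    local len
    len, pos = read_u32(s, pos)
    constants[i] = s:sub(pos, pos + len - 2)  -- drop the trailing NUL
    pos = pos + len
  end
  return constants, pos
end

-- Constructed example: count = 1, type = 4, length = 6, "athin" plus NUL.
local data = "\0\0\0\1" .. "\4" .. "\0\0\0\6" .. "athin\0"
local constants, next_pos = read_constants(data, 1)
print(#constants, constants[1], next_pos)
```

If the reader starts even one byte too early or too late, the count field is assembled from the wrong four bytes, which is how an implausible constant count can appear.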

Now, I have no idea how this could be useful to you, since the output is directly from CC (or LuaJ), but that's the problem.

Note: After I modified ChunkSpy to accept big-endian chunks, it errored on the bytecode as you posted it, but worked fine once the bytecode was modified either the way you suggested or the way I suggested.

AgentE382
  • Thanks for the answer, just posted this for some learned users here. Replied you on the CC forums. – ardera Jun 08 '14 at 08:43