I have seen several examples of bytecodes printed out in VSCode. I would like to know why these codes are customarily 5 to 6 bytes long and prepended by 0x. Also, if possible, can you tell me where I can find literature that answers such a question?
-
https://medium.com/dailyjs/understanding-v8s-bytecode-317d46c94775 could be useful. In particular, it links to https://github.com/v8/v8/blob/master/src/interpreter/bytecodes.h which is neat. – CollinD Apr 12 '22 at 20:00
-
What makes you think that "most bytecodes are 5 or 6 bytes long"? Can you give an example of output you've seen that you'd like to understand better? – jmrk Apr 12 '22 at 20:34
-
Why would you expect them to have a different length? Did you find some design document that indicates they shouldn't be 5-6 bytes long? – Bergi Apr 12 '22 at 21:08
-
@Bergi: I would, in fact, expect the vast majority of bytecodes to be shorter than 5 bytes, that's why it would be great to know more about what prompted this question. – jmrk Apr 12 '22 at 22:10
-
@jmrk True. The medium article linked by Collin, which also seems to have been read by the OP, contains some 8-byte hex numbers, but those are the addresses of the bytecodes not the bytecodes themselves (and indicate lengths of 1, 2, 4 bytes). – Bergi Apr 12 '22 at 22:15
-
A similar question is why x86 processors have instructions of different lengths: some instructions don't need arguments and, if used often, can fit in 1 byte, so little space is wasted and more code fits in the cache. – rioV8 Apr 12 '22 at 22:27
-
@rioV8 Off-topic here, but variable-length instructions [are harder to decode](https://stackoverflow.com/q/8204086/1048572). That said, [x86 actually **is** a variable-length ISA](https://en.wikipedia.org/wiki/X86#Basic_properties_of_the_architecture) – Bergi Apr 12 '22 at 22:34
1 Answer
(V8 developer here.)
This appears to be a misunderstanding. Let's look at the example in the article linked in comments:
$ node --print-bytecode incrementX.js
...
[generating bytecode for function: incrementX]
Parameter count 2
Frame size 8
12 E> 0x2ddf8802cf6e @ StackCheck
19 S> 0x2ddf8802cf6f @ LdaSmi [1]
0x2ddf8802cf71 @ Star r0
34 E> 0x2ddf8802cf73 @ LdaNamedProperty a0, [0], [4]
28 E> 0x2ddf8802cf77 @ Add r0, [6]
36 S> 0x2ddf8802cf7a @ Return
Constant pool (size = 1)
0x2ddf8802cf21: [FixedArray] in OldSpace
- map = 0x2ddfb2d02309 <Map(HOLEY_ELEMENTS)>
- length: 1
0: 0x2ddf8db91611 <String[1]: x>
Handler Table (size = 16)
Here, `0x2ddf8802cf6e` is not a bytecode; it's the address in memory where the function's first bytecode (`StackCheck`) is stored. The binary representation of this bytecode is not included in this printout, but we can deduce its size: from the difference to the address of the second bytecode, which ends in `0x...6f`, we can see that `StackCheck` takes one byte in memory (`0x...6f - 0x...6e == 1`). The second bytecode in this example, `LdaSmi`, takes two bytes (`0x...71 - 0x...6f == 2`).
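The same deduction can be applied to every line of the printout. Here is a minimal sketch in Python, assuming the addresses are copied by hand from the output above; `listing` and `sizes` are names made up for this illustration and are not part of V8 or Node.

```python
# Addresses and mnemonics copied by hand from the --print-bytecode output above.
listing = [
    (0x2ddf8802cf6e, "StackCheck"),
    (0x2ddf8802cf6f, "LdaSmi"),
    (0x2ddf8802cf71, "Star"),
    (0x2ddf8802cf73, "LdaNamedProperty"),
    (0x2ddf8802cf77, "Add"),
    (0x2ddf8802cf7a, "Return"),
]


def sizes(entries):
    """Return (name, size_in_bytes) for every bytecode except the last.

    The bytecodes are stored back to back, so the size of one is the
    distance from its address to the address of the next one. The last
    entry has no successor in the listing, so its size is not deduced.
    """
    return [
        (name, next_addr - addr)
        for (addr, name), (next_addr, _) in zip(entries, entries[1:])
    ]


for name, size in sizes(listing):
    print(f"{name}: {size} byte(s)")
```

For this listing, none of the bytecodes whose size can be deduced is longer than 4 bytes.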
The `0x` prefix simply indicates to human readers that the number is presented in hexadecimal format.
You typically see 12 hexadecimal digits in memory addresses because today's so-called "64-bit" platforms (usually) have 48 bits of virtual address space, and each hex digit can represent 4 bits. You can imagine four leading `0` digits for the full 64-bit pointer: `0x00002ddf8802cf6e`.
(Technically the pointers are sign-extended, so if the 48 relevant bits started with a 1-bit, you'd have four leading `f` digits. In my experience so far, this usually doesn't happen in practice.)
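To make the digit counting and the sign extension concrete, here is a small Python sketch. It only does arithmetic on the address from the printout; `sign_extend_48` is a hypothetical helper written for this illustration, and the second address passed to it is an invented value whose bit 47 is set.

```python
addr = 0x2ddf8802cf6e  # first bytecode address from the printout

# Each hex digit encodes 4 bits, so 12 digits cover 48 bits.
digits = len(f"{addr:x}")
print(digits, digits * 4)  # 12 48

# The same value padded to the full 64-bit pointer width (16 hex digits).
print(f"0x{addr:016x}")  # 0x00002ddf8802cf6e


def sign_extend_48(value):
    """Sign-extend a 48-bit value to 64 bits by copying bit 47 into bits 48..63."""
    value &= (1 << 48) - 1
    if value & (1 << 47):
        value |= 0xFFFF << 48
    return value


# Bit 47 of the address above is 0, so the upper 16 bits stay zero.
print(f"0x{sign_extend_48(addr):016x}")  # 0x00002ddf8802cf6e

# An invented 48-bit value with bit 47 set gets four leading f digits.
print(f"0x{sign_extend_48(0x800000000000):016x}")  # 0xffff800000000000
```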

-
@BurtPaulie Pointers are indeed numbers, they identify positions in memory, measured in bytes. So the pointer `1003` is exactly 3 bytes behind the pointer `1000`. When you know that objects are packed densely, and you know that object A starts at 1003 and the next object B starts at 1007, then you also know that A is 1007-1003=4 bytes long. The full details of modern systems are fairly complicated, but the basic idea is that memory is an array of bytes, and pointers (aka addresses) are indexes into that array. – jmrk Apr 13 '22 at 20:56