Understanding the magic number 0x07EFEFEFF used for strlen optimization

Question

I stumbled upon this answer regarding the utilization of the magic number 0x07EFEFEFF used for strlen's optimization, and here is what the top answer says:

Look at the magic bits. Bits number 16, 24 and 31 are 1. 8th bit is 0.

8th bit represents the first byte. If the first byte is not zero, 8th bit becomes 1 at this point. Otherwise it's 0.

16th bit represents the second byte. Same logic.

24th bit represents the third byte.

31th bit represents the fourth byte.

However, if I calculate result = ((a + magic) ^ ~a) & ~magic with a = 0x100, I find that result = 0x81010100, meaning that according to the top answerer, the second byte of a equals 0, which is obviously false.

What am I missing?

Thanks!

I don't believe the result actually tells you which byte is zero. It only tells you if at least one byte is zero or not. — Ross Ridge, Jun 19 '16 at 00:10

score 5 · Accepted Answer · answered Jun 19 '16 at 00:25

The bits only tell you if a byte is zero if the lower bytes are non-zero -- so it can only tell you the FIRST 0 byte, but not about bytes after the first 0.

bit8=1 means first byte is zero. Other bytes, unknown
bit8=0 means first byte is non-zero
bit8=0 & bit16=1 means second byte is zero, higher bytes unknown
bit8=0 & bit16=0 mans first two bytes are non-zero.

Also, the last bit (bit31) only tells you about 7 bits of the last byte (and only if the first 3 bytes are non-zero) -- if it is the only bit set then the last byte is 0 or 128 (and the rest are non-zero).

Understanding the magic number 0x07EFEFEFF used for strlen optimization

1 Answers1