
This has most likely been asked and answered before, but my searches were futile.

The question is about bits, bytes, masks, and checking.

Say one has two "triggers", 0xC4 and 0xC5:

196: 1100 0100  0xc4
197: 1100 0101  0xc5

The simple way of checking if var is either would be:

if (var == 0xc5 || var == 0xc4) {

}

But sometimes one sees this (or the like):

if ( ((var ^ magic) & mask) == 0)  {

}

My question is how to find magic and mask. What methods, procedures, tricks etc. can be used to form these values and to determine whether any exist?


EDIT:

To clarify: yes, in this exact example the former would be better than the latter, but my question is more about generating and checking these kinds of masks in general. Bit twiddling in general. I omitted a lot and tried to keep the question simple. But ...

As an example, I had a look at the source of the OllyDbg decompiler, where one finds:

if (((code ^ pd->code) & pd->mask) == 0) 
    FOUND

Where code holds the first 0 to 3 bytes of the command, copied from the instruction:

unsigned long code = 0;
if (size > 0) *(((char *)&code) + 0) = cmd[0];
if (size > 1) *(((char *)&code) + 1) = cmd[1];
if (size > 2) *(((char *)&code) + 2) = cmd[2];

That is, the mask is applied only to those bytes of cmd.

And pd refers to an entry of this struct:

struct t_cmddata {
    uint32_t mask;          // Mask for first 4 bytes of the command
    uint32_t code;          // Compare masked bytes with this
        ...
}

with the entries stored in a long array like this:

const t_cmddata cmddata[] = {
/*      mask      code  */
  { 0x0000FF, 0x000090, 1,00,  NNN,NNN,NNN, C_CMD+0,        "NOP" },
  { 0x0000FE, 0x00008A, 1,WW,  REG,MRG,NNN, C_CMD+0,        "MOV" },
  { 0x0000F8, 0x000050, 1,00,  RCM,NNN,NNN, C_PSH+0,        "PUSH" },
  { 0x0000FE, 0x000088, 1,WW,  MRG,REG,NNN, C_CMD+0,        "MOV" },
  { 0x0000FF, 0x0000E8, 1,00,  JOW,NNN,NNN, C_CAL+0,        "CALL" },
  { 0x0000FD, 0x000068, 1,SS,  IMM,NNN,NNN, C_PSH+0,        "PUSH" },
  { 0x0000FF, 0x00008D, 1,00,  REG,MMA,NNN, C_CMD+0,        "LEA" },
  { 0x0000FF, 0x000074, 1,CC,  JOB,NNN,NNN, C_JMC+0,        "JE,JZ" },
  { 0x0000F8, 0x000058, 1,00,  RCM,NNN,NNN, C_POP+0,        "POP" },
  { 0x0038FC, 0x000080, 1,WS,  MRG,IMM,NNN, C_CMD+1,        "ADD" },
  { 0x0000FF, 0x000075, 1,CC,  JOB,NNN,NNN, C_JMC+0,        "JNZ,JNE" },
  { 0x0000FF, 0x0000EB, 1,00,  JOB,NNN,NNN, C_JMP+0,        "JMP" },
  { 0x0000FF, 0x0000E9, 1,00,  JOW,NNN,NNN, C_JMP+0,        "JMP" },
  { 0x0000FE, 0x000084, 1,WW,  MRG,REG,NNN, C_CMD+0,        "TEST" },
  { 0x0038FE, 0x0000C6, 1,WW,  MRG,IMM,NNN, C_CMD+1,        "MOV" },
  { 0x0000FE, 0x000032, 1,WW,  REG,MRG,NNN, C_CMD+0,        "XOR" },
  ...

That would be a typical real-world example of usage. So again: what methods exist for this? I have been looking at Karnaugh maps etc., but I thought there might be other methods for the same kind of problem.

Zimzalabim
  • "None" would be my quick answer, considering the huge difference in readability between the two! :) – unwind Jul 05 '13 at 10:51
  • Which bits are you interested in? Just check for those bits, no need to use any xor operation. – Some programmer dude Jul 05 '13 at 10:53
  • I doubt it was XOR what you saw. I rather think it was `&` instead, which is bitwise AND. –  Jul 05 '13 at 10:55
  • I tried to write an answer. But my brain isn't working right right now. :( – luser droog Jul 05 '13 at 11:08
  • @luserdroog: Thanks for the effort, unfortunately I didn't see it ... and for brain not working, I'm right there with you. I'm at a standstill. Whole week. In a period where logic simply does not comply. – Zimzalabim Jul 05 '13 at 11:12
  • I kept thinking I'd got it straight, and then quickly deleting when it didn't triple-check. :) – luser droog Jul 05 '13 at 11:13

2 Answers

I assume your question is: given a set of "triggers", can we find a mask and a magic such that the triggers can be checked by the following code

if ( ((var ^ magic) & mask) == 0)  {
}

which is the same as

if ((var & mask) == (magic & mask))  {
}

An example set of "triggers" would be

196: 1100 0100  0xc4
197: 1100 0101  0xc5
204: 1100 1100  0xcc
205: 1100 1101  0xcd

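For this to be feasible, the bits of the triggers must fall into two classes: "specific bits" and "arbitrary bits". Specific bits have the same value in every trigger; here these are the first 4 bits and the 6th and 7th bits, counting from the left. The remaining bits (here the 5th and 8th) are arbitrary: if you change an arbitrary bit of a trigger, the result is still a trigger.

So there are exactly 2^N triggers, where N denotes the number of arbitrary bits. The mask then has a 1 in every specific bit position, and magic carries the values of the specific bits; for the example above, mask = 0xF6 and magic = 0xC4.

This is my first answer on Stack Overflow. I'm not sure whether I understood your question correctly. Or are you asking about other bit-twiddling methods?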
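The classification into specific and arbitrary bits can be computed mechanically. The following is only a minimal sketch for 8-bit triggers, not a definitive implementation: the function name `find_mask_magic` is made up for illustration, and it assumes the triggers passed in are distinct. It ANDs and ORs all triggers together; the bits where the two results differ are the arbitrary bits, and the 2^N count is used to decide whether a mask/magic pair exists at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: derive mask and magic for a set of distinct
 * 8-bit triggers. Returns true only if ((var ^ magic) & mask) == 0
 * matches exactly the given triggers and nothing else. */
bool find_mask_magic(const uint8_t *triggers, size_t n,
                     uint8_t *mask, uint8_t *magic)
{
    if (n == 0)
        return false;

    uint8_t all_and = 0xFF;  /* bits that are 1 in every trigger */
    uint8_t all_or  = 0x00;  /* bits that are 1 in at least one trigger */
    for (size_t i = 0; i < n; i++) {
        all_and &= triggers[i];
        all_or  |= triggers[i];
    }

    /* A bit is "specific" when all triggers agree on it, i.e. when the
     * AND and the OR of all triggers have the same value there. */
    uint8_t m = (uint8_t)~(all_and ^ all_or);
    *mask  = m;
    *magic = all_and;  /* specific bit values; arbitrary bits are 0 */

    /* Count the arbitrary bits (the zero bits of the mask). */
    unsigned arbitrary = 0;
    for (uint8_t v = (uint8_t)~m; v != 0; v &= (uint8_t)(v - 1))
        arbitrary++;

    /* The mask test accepts 2^arbitrary values; it is exact only if
     * that is also the number of (distinct) triggers supplied. */
    return n == ((size_t)1 << arbitrary);
}
```

Called with the four triggers above it should report success with mask 0xF6 and magic 0xC4; with only three of them it reports failure, because the mask test would also accept the fourth value.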

Vichare
  • Strictly speaking it is equivalent to `(var & mask) == (magic & mask)`, but of course one would choose magic so that `(magic & mask) == magic`. – starblue Jul 05 '13 at 18:16

Given your two values,

196: 1100 0100  0xc4
197: 1100 0101  0xc5

you'd want to mask off the bits that differ, in this case bit 0. So the mask value would be the inverse of 0x01, which is 0xFE.

i.e. (0xC4 & 0xFE) == 0xC4, and (0xC5 & 0xFE) == 0xC4.

That means both values become 0xC4. Then you can check for 0xC4 by xor-ing with the exact bit pattern that should remain.

     1100 0100  0xC4

i.e. (0xC4 ^ 0xC4) == 0.

     1100 0100    1100 0101
   & 1111 1110    1111 1110 
     ---- ----    ---- ----
     1100 0100    1100 0100
   ^ 1100 0100
     ---- ----
     0000 0000

Mask first, or risk utter confusion.
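For byte-sized values, one way to convince yourself that a mask/magic pair is right is simply to try all 256 inputs and compare against the explicit test. This is just a minimal sketch; the function names `matches` and `agrees_with_explicit_test` are made up for illustration and do not come from the OllyDbg source.

```c
#include <stdbool.h>
#include <stdint.h>

/* The masked comparison from the question. */
bool matches(uint8_t var, uint8_t magic, uint8_t mask)
{
    return ((var ^ magic) & mask) == 0;
}

/* Hypothetical brute-force check: does the masked test with
 * magic 0xC4 and mask 0xFE agree with the explicit comparison
 * (var == 0xC4 || var == 0xC5) for every possible byte value? */
bool agrees_with_explicit_test(void)
{
    for (unsigned v = 0; v <= 0xFF; v++) {
        bool masked = matches((uint8_t)v, 0xC4, 0xFE);
        bool direct = (v == 0xC4 || v == 0xC5);
        if (masked != direct)
            return false;
    }
    return true;
}
```

For wider values an exhaustive loop stops being practical, and counting the zero bits of the mask (each one doubles the number of accepted values) is the cheaper sanity check.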


Looking through the actual source file, I kind of think the author is trying to obfuscate it. Many of the functions want factoring.

luser droog
  • Great. This is good. Do you know of any method for validating these kind of expressions (beside truth tables that quickly can become a bit vast) or coding it in a procedure ... which is kind of *"ok lets try this one"*. And any topic I should read up on for more quickly seeing these kind of patterns. (methods, books, math-topics, etc). – Zimzalabim Jul 05 '13 at 11:41
  • Not yet, but I'm inspired to try to find some. I'm taking a look at the decompiler code. The part you quoted that sets `code` is totally non-portable (assumes little-endian) and that's a little distracting. :) – luser droog Jul 05 '13 at 11:47
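
On the little-endian assumption mentioned in the last comment: a byte-order independent way to build `code` would be to shift the bytes into place instead of writing through a `char` pointer. This is only a sketch under assumptions: the name `pack_code` is hypothetical, and `cmd` is taken to be an array of `unsigned char`, which the quoted snippet does not show.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement for the pointer-cast code: packs up to
 * three command bytes so that cmd[0] always lands in the lowest
 * 8 bits, regardless of the host byte order. */
uint32_t pack_code(const unsigned char *cmd, size_t size)
{
    uint32_t code = 0;
    for (size_t i = 0; i < size && i < 3; i++)
        code |= (uint32_t)cmd[i] << (8 * i);
    return code;
}
```

The result has the same layout the table expects, so a check such as `((pack_code(cmd, size) ^ pd->code) & pd->mask) == 0` would behave the same on any host.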