How to simplify Assembly Translation Shift Right by 32 Xor Absolute Number And Value

Question

I don't know the original source, but I doubt it was this complicated, with right shifts and abs.

Here is what the decompiled IDA Pro code looks like after renaming:

char Ship; //Could be 0-7 (8 is reserved for special purpose)
char NewShip = 1; //Could be 0-7  (8 is reserved for special purpose)
short Frequency = 0; //This could be from 0 to 9999
bool NumberToFrequency = true;

Frequency = GetNextFrequencyToJoin(player->MyArena);
if ( NumberToFrequency )
{ //TODO: maybe the below is just Frequency % 7; ?
  NewShip = (((unsigned long)Frequency >> 32) ^ abs(Frequency) & 7) - ((unsigned long)Frequency >> 32);
  Ship = NewShip;
} else {
  Ship = NewShip;
}

Here is an Ideone test: http://ideone.com/Q2bEjU

It seems NewShip = abs(frequency) & 7; is all I really need. I tested every possible value in a loop and it never gives a wrong result.
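
The Ideone snippet itself is not reproduced here. A minimal sketch of that kind of exhaustive check could look like the following. It assumes the `>> 32` in the decompiler output is meant to extract the upper half of a 64-bit sign extension (modelled here with `int64_t`/`uint64_t`), and all function names are made up for illustration. It also counts how often the `% 7` guess from the TODO comment disagrees with `& 7`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of the decompiled expression. The 64-bit casts are an
   assumption: shifting a 32-bit value right by 32 is undefined in C. */
int decompiled_expr(short frequency)
{
    uint32_t hi = (uint32_t)((uint64_t)(int64_t)frequency >> 32); /* 0 or 0xFFFFFFFF */
    uint32_t low = (uint32_t)abs(frequency) & 7u;
    return (int)(int32_t)((hi ^ low) - hi);
}

/* The simplification proposed in the question. */
int simplified(short frequency)
{
    return abs(frequency) & 7;
}

/* Count frequencies in 0..9999 where the two expressions disagree. */
int mismatches_and7(void)
{
    int count = 0;
    for (int f = 0; f <= 9999; ++f)
        if (decompiled_expr((short)f) != simplified((short)f))
            ++count;
    return count;
}

/* Count frequencies in 0..9999 where "% 7" disagrees with "& 7". */
int mismatches_mod7(void)
{
    int count = 0;
    for (int f = 0; f <= 9999; ++f)
        if (f % 7 != (f & 7))
            ++count;
    return count;
}
```

Under these assumptions `mismatches_and7()` returns 0 for the range 0..9999, while `mismatches_mod7()` is nonzero (for example, 7 % 7 is 0 but 7 & 7 is 7).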

Another decompiler gave me this result

 asm("cdq ");
 NewShip = ((Var1 ^ Var2) - Var2 & 7 ^ Var2) - Var2;

This has no right shifts at all, but it still looks alien to me. It probably shows how the absolute value is computed, and I still have no clue where the right shifts by 32 came from.

What NumberToFrequency is supposed to do is make the ship match the frequency. The frequency can go past 7, though, so the remaining values still have to translate to ship values; I assume that is just a modulus (% 7).

But why is the code so complicated? Maybe it means something completely different? I'm just asking what the code means. I've added the assembly below. I can't even find the shift right by 32 in it, although I'm pretty sure this is the same place.

.text:0040DD3A                 mov     ecx, [ebp+1Ch]  ; arena
.text:0040DD3D                 call    GetNextFrequencyToJoin
.text:0040DD42                 mov     ecx, [ebp+1Ch]
.text:0040DD45                 mov     si, ax
.text:0040DD48                 mov     [esp+220h+var_20C], si
.text:0040DD4D                 cmp     [ecx+1ACCEh], ebx
.text:0040DD53                 jz      short loc_40DD98
.text:0040DD55                 movsx   eax, si
.text:0040DD58                 cdq
.text:0040DD59                 xor     eax, edx
.text:0040DD5B                 sub     eax, edx
.text:0040DD5D                 and     eax, 7
.text:0040DD60                 xor     eax, edx
.text:0040DD62                 sub     eax, edx
.text:0040DD64                 mov     [esp+220h+var_20F], al

EDIT: I found the answer on my own. It seems those >> 32 shifts are useless garbage added for some old C compiler support, to make types match 32-bit DWORDs or something like that.

Side note: [ida] is the preferred tag ([ida] and [ida-pro] should be synonyms; see: http://meta.stackoverflow.com/questions/308568/pro-ida-ida-pro) — user1354557, Jun 22 '16 at 13:52

example · Accepted Answer · 2014-04-26T12:11:03.147


The shifts are not useless. They are part of a piece of branchless logic that Hex-Rays did not manage to reproduce in its C decompilation.

.text:0040DD55                 movsx   eax, si
.text:0040DD58                 cdq
.text:0040DD59                 xor     eax, edx
.text:0040DD5B                 sub     eax, edx
.text:0040DD5D                 and     eax, 7
.text:0040DD60                 xor     eax, edx
.text:0040DD62                 sub     eax, edx

This is the significant code. After movsx and cdq, EDX:EAX holds the sign-extended value of SI, so EDX is either 0 or -1. Each xor either leaves eax untouched or inverts it, and each sub either leaves it untouched or adds one. In total:

if (si < 0) {
    eax = ~si;
    eax += 1;
    eax &= 0x7;
    eax = ~eax;
    eax += 1;
} else {
    eax = si & 0x7;
}

The first branch can still be simplified, but I leave that to you...
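
The xor/sub pair can be sketched in C as a conditional negation driven by a sign mask. This is only an illustration: the function names are invented, `mask` stands in for EDX after cdq, and the arithmetic is done on unsigned values so that wraparound is well defined.

```c
#include <stdint.h>

/* Stand-in for "cdq": 0 for non-negative x, 0xFFFFFFFF for negative x. */
uint32_t sign_mask(int32_t x)
{
    return 0u - (uint32_t)(x < 0);
}

/* Stand-in for "xor eax, edx / sub eax, edx".
   mask == 0          -> value unchanged
   mask == 0xFFFFFFFF -> (~value) + 1, i.e. two's-complement negation */
uint32_t xor_sub(uint32_t value, uint32_t mask)
{
    return (value ^ mask) - mask;
}

/* Branchless absolute value built from the two helpers above. */
uint32_t branchless_abs(int32_t x)
{
    return xor_sub((uint32_t)x, sign_mask(x));
}
```

With these helpers, `branchless_abs(-5)` gives 5 and `branchless_abs(7)` gives 7. Applying `xor_sub` a second time with the same mask negates the value again, which is how the sequence restores the sign after the `and`.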

update

The fact that the branches differ only for si < 0 already hints at what is happening. The sequence eax = ~si; eax += 1; is two's-complement negation, so substituting that in we get

if (si < 0) {
    eax = -1 * si;
    eax &= 0x7;
    eax *= -1;
} else {
    eax = si & 0x7;
}

or in short

eax = (abs(si) & 0x7) * sign(si);

Or with the signed modulus operator

al = si % 8;
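
As a sanity check, the seven-instruction sequence can be modelled in C and compared against `si % 8` for every 16-bit input. This is a sketch under assumptions: the helper names are hypothetical, unsigned arithmetic emulates the 32-bit registers, and `%` is assumed to truncate toward zero as it does in C99 and later.

```c
#include <stdint.h>

/* Emulates: movsx eax, si / cdq / xor / sub / and 7 / xor / sub.
   The final conversion back to int32_t assumes two's complement. */
int32_t emulate_sequence(int16_t si)
{
    uint32_t eax = (uint32_t)(int32_t)si;    /* movsx eax, si        */
    uint32_t edx = 0u - (uint32_t)(si < 0);  /* cdq: 0 or 0xFFFFFFFF */
    eax = (eax ^ edx) - edx;                 /* eax = abs(si)        */
    eax &= 7u;                               /* keep the low 3 bits  */
    eax = (eax ^ edx) - edx;                 /* put the sign back    */
    return (int32_t)eax;
}

/* Returns how many of the 65536 possible inputs disagree with si % 8. */
int count_mod8_mismatches(void)
{
    int mismatches = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v)
        if (emulate_sequence((int16_t)v) != v % 8)
            ++mismatches;
    return mismatches;
}
```

Under these assumptions `count_mod8_mismatches()` returns 0. For example, `emulate_sequence(-13)` yields -5, the same as `-13 % 8`.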

edited Apr 26 '14 at 12:11

answered Apr 26 '14 at 11:55


Haha, if you can, could you please simplify it further below? To be honest I have no clue what's going on, and I would appreciate it if it doesn't take too much of your time. I doubt a branch was even used here; could this be the source of the abs function itself? – user3435580 Apr 26 '14 at 11:58
So does that mean it actually is `modulus` and not `&` getting used here? – user3435580 Apr 26 '14 at 12:06
@user3435580 sorry, I didn't even see that. Yes, it is in fact a modulus: `al = si % 8`, using the signed `modulus` operator. – example Apr 26 '14 at 12:10
So is that 0x1007 a bitwise mask or something? I've never seen one like that before. What do you think the original looked like, as clean as it could be? `NewShip = abs(frequency) & 7;` seems to give exactly the same results, but then again that is based on what you say Hex-Rays decompiled wrong. – user3435580 Apr 26 '14 at 12:10
@user3435580 sorry, I was confused for a moment (with the 0x1007 mask). It should be correct now. – example Apr 26 '14 at 12:12
Modulus 8 makes sense in a way, since there are 8 possible ships. Then again, it shouldn't really be possible to hit the 9th ship value, since that ship means you are not playing, and joining a frequency in this game when you are not playing doesn't make sense. You say signed modulus; it looks like the same operator, so do you mean `% -8`? – user3435580 Apr 26 '14 at 12:13
@user3435580 by signed modulus I mean that `si % 8` preserves the sign of `si` (fairly likely, as the standard C and C++ `%` operators preserve the sign of the dividend since C99 and C++11). – example Apr 26 '14 at 12:14
ah so the `SI` is negative and the modulus will never actually evaluate to 8? – user3435580 Apr 26 '14 at 12:16
@user3435580 values in `al` are in principle -7,-6,...,6,7. Of course it is possible that `si` is always positive, in which case only 0 through 7 can be reached. – example Apr 26 '14 at 12:18
So I can't fix it, since I don't have access to `SI` in the C code and have to keep using `abs(val) & 7`. Not that I want access to it anyway; I want clean code for the most part. Also, as I note in my answer, I found other places that use abs and look similar even without the `& 7`. I'll actually test this feature inside the game to find out whether it was a `% 7` or a `& 7`, and let you know. Thanks for the help, man. – user3435580 Apr 26 '14 at 12:21
It's neither :( Comparing `& 7` and `% 7` with value `5555` gives `MOD=4 AND=3`, and inside the game I was ship `3`, so AND was right so far. Then I tested `9999`, which should be ship `0`, but it gave me `AND=7 MOD=3`. I was thinking maybe 7 should have overflowed or something. I'll try it again. – user3435580 Apr 26 '14 at 12:41
Okay I can't test all 9999 possibilities in the game would take forever. But it seems only the last freq 9999 I assume breaks the rules which is probably somewhere else in the code. http://pastebin.com/J6dEJMHX – user3435580 Apr 26 '14 at 12:51
Okay, false alarm: you can't be on freq 9999, it already gets reverted back to 0 because it's reserved for spectators. Thanks, all is known now. Even though it's `SI % 8`, it's really `Ship And 7`; `Abs` is most likely not needed at all but used as a sanity check. – user3435580 Apr 26 '14 at 13:18

user3435580 · Answer 2 · 2014-04-26T11:53:50.130

I think I figured it out. The decompiler I used, IDA Pro, seems to generate these >> 32 shifts all over the place, and in every case where I see them an abs() call is involved. It just looks like a useless wrapper around the absolute value function.

Some examples I found.

//1
((((unsigned long)i >> 32) ^ abs(i)) - ((unsigned long)i >> 32))
//2
(((unsigned long)encryption->field_25E >> 32) ^ abs(encryption->field_25E)) - ((unsigned long)encryption->field_25E >> 32);
//3
((((unsigned long)i >> 32) ^ abs(i)) - ((unsigned long)i >> 32))
//4
(((unsigned long)(v104->field_A8 + 1) >> 32) ^ abs(*((unsigned char*)&(v104->field_A8)) + 1) & 7) - ((unsigned long)(v104->field_A8 + 1) >> 32);
//5
(((unsigned long)v11 >> 32) ^ abs(v11)) - ((unsigned long)v11 >> 32);
//6
(((unsigned long)v4->field_262 >> 32) ^ abs(v4->field_262)) - ((unsigned long)v4->field_262 >> 32)
//7
(((unsigned long)v18 >> 32) ^ abs(v18)) - ((unsigned long)v18 >> 32);
//8 (not refactored yet).
((((unsigned long)*(unsigned int *)(v1 + 610) >> 32) ^ abs(*(unsigned int *)(v1 + 610))) - ((unsigned long)*(unsigned int *)(v1 + 610) >> 32))
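
One way to read these expressions, offered as an interpretation rather than something the decompiler documents: `(unsigned long)x >> 32` stands for the upper 32 bits of the sign-extended 64-bit value, i.e. what cdq leaves in EDX. The sketch below models that with `int64_t`; the helper names are hypothetical. Taken at face value, the xor/sub pair around `abs(x)` puts the sign back, and with `& 7` in the middle the result matches `x % 8`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Upper 32 bits of x sign-extended to 64 bits: 0 or 0xFFFFFFFF. */
uint32_t high_dword(int32_t x)
{
    return (uint32_t)((uint64_t)(int64_t)x >> 32);
}

/* Model of ((x >> 32) ^ abs(x)) - (x >> 32).
   x must not be INT32_MIN, because abs() is undefined for it. */
int32_t wrapper_plain(int32_t x)
{
    uint32_t hi = high_dword(x);
    return (int32_t)((hi ^ (uint32_t)abs(x)) - hi);
}

/* Model of ((x >> 32) ^ abs(x) & 7) - (x >> 32).
   In C, & binds tighter than ^, so the mask applies to abs(x) only. */
int32_t wrapper_and7(int32_t x)
{
    uint32_t hi = high_dword(x);
    return (int32_t)((hi ^ ((uint32_t)abs(x) & 7u)) - hi);
}
```

In this model `wrapper_plain(-5)` returns -5 again, so the wrapper undoes the abs() rather than adding anything to it, and `wrapper_and7(-13)` returns -5, the same as `-13 % 8`.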

You may also see these >> 32 shifts in one more place, which I already know from research is just optimized division; it looks quite different.

Something crazy like this (I cleaned this one up with my regular expression tool):

(signed int)((unsigned int)v130 + ((unsigned long)(18446744071705233545i64 * (signed int)v130) >> 32)) >> 5;

//Originally it looked something like this
LODWORD(v202) = (signed int)((_DWORD)v202 + (0xFFFFFFFF88888889ui64 * (signed int)v202 >> 32)) >> 5;

//Or without the hexadecimal values
LODWORD(v202) = (signed int)((_DWORD)v202 + ((unsigned __int64)(18446744071705233545i64 * (signed int)v202) >> 32)) >> 5;

//You will see it getting used like this
(signed int)(((unsigned int)v202 >> 31) + v202)

But what it really means is
v202 / 60

The equations used to convert it back to / 60 are discussed at http://www.hexblog.com/?p=17
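
The multiply-and-shift pattern can be checked against plain division with a short sketch. The function names are hypothetical, the constant 0x88888889 and the shift by 5 are taken from the decompiler output above, and the code assumes that `>>` on a negative signed value is an arithmetic shift (true for mainstream compilers, but implementation-defined in C).

```c
#include <stdint.h>

/* Emulates: imul by 0x88888889, add the dividend to the high dword,
   sar 5, then add the sign bit so the result rounds toward zero. */
int32_t div60_magic(int32_t v)
{
    int64_t product = (int64_t)(int32_t)0x88888889u * (int64_t)v;
    uint32_t high = (uint32_t)((uint64_t)product >> 32);
    int32_t t = (int32_t)((uint32_t)v + high) >> 5;
    return t + (int32_t)((uint32_t)t >> 31);
}

/* Counts the values in [lo, hi] where the emulation differs from v / 60. */
int count_div60_mismatches(int32_t lo, int32_t hi)
{
    int mismatches = 0;
    for (int64_t v = lo; v <= hi; ++v)
        if (div60_magic((int32_t)v) != (int32_t)v / 60)
            ++mismatches;
    return mismatches;
}
```

For instance, `div60_magic(3600)` gives 60 and `div60_magic(-120)` gives -2, matching `3600 / 60` and `-120 / 60`.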
