
I'm developing a taint analysis tool using PIN.

My question is: how do I get the operands of a lea instruction?

For a lea instruction such as lea rdx, ptr [rip+0x2244aa], I can get the first operand with INS_OperandReg(ins, 0).

However, I also want to check whether the second operand is tainted, and I cannot find a way to get it. I couldn't find any function that returns the value of rip+0x2244aa.

Is there a function that returns the memory address (the second operand) of a lea?

xiawi
blbi
  • I haven't tested this (so I can't guarantee it's correct), but my best guess would be to use `INS_IsLea()` to check whether it's a `LEA`, then test for RIP-relative addressing using `INS_OperandMemoryBaseReg()` (a result of `REG_INST_PTR` indicates RIP-relative addressing) and `INS_OperandMemoryDisplacement` to get the Disp32; once you have that, take the RIP of the current instruction, add its size, and add the displacement to get the "real" value that would be loaded into the destination register. – Neitsa May 08 '20 at 07:08
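The arithmetic suggested in the comment above can be sketched in plain C++, independent of the Pin headers so it compiles on its own. The function name `rip_relative_target` is hypothetical; in a Pintool the three inputs would presumably come from `INS_Address`, `INS_Size`, and `INS_OperandMemoryDisplacement`. The instruction address `0x400000` and the 7-byte length used in the usage note are assumptions for illustration, not values from the question.

```cpp
#include <cstdint>

// Hypothetical helper: resolves a RIP-relative operand to an absolute address.
// The displacement is relative to the address of the NEXT instruction,
// i.e. the current instruction's address plus its encoded length.
constexpr std::uint64_t rip_relative_target(std::uint64_t ins_addr,
                                            std::uint32_t ins_size,
                                            std::int32_t disp32) {
    // Sign-extend the 32-bit displacement before adding (it may be negative).
    return ins_addr + ins_size +
           static_cast<std::uint64_t>(static_cast<std::int64_t>(disp32));
}
```

For example, if `lea rdx, [rip+0x2244aa]` were located at the assumed address `0x400000` and were 7 bytes long, `rip_relative_target(0x400000, 7, 0x2244aa)` would give `0x6244b1`.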

1 Answer


The memory operand of a typical lea instruction is an effective address of the form:

EffectiveAddress = base + (index * scale) + displacement

So you have to read each component individually:

// Operand 1 is the memory operand of the lea (operand 0 is the destination).
if (INS_OperandMemoryIndexReg(ins, 1) != REG_INVALID_)
  { /* get index register */
   UINT32 scale = INS_OperandMemoryScale(ins, 1); // value of scale: 1, 2, 4, or 8
  }
if (INS_OperandMemoryBaseReg(ins, 1) != REG_INVALID_)
  { /* get base register */ }
if (INS_OperandMemoryDisplacement(ins, 1) != 0)
  { /* get displacement value */ }

With these you know which registers the operand uses, and you can also compute the actual effective address by combining their values (if you read the register values with PIN_GetContextRegval(...)).
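As a rough, Pin-independent sketch of how those pieces combine: the `MemOperand` struct and both function names below are hypothetical, and integer register ids stand in for Pin's `REG` values. In a real Pintool the fields would be filled from the `INS_OperandMemory*` calls above and the register values read from the context. Treating the address as tainted when its base or index register is tainted is one possible policy, assumed here; it is not something Pin prescribes.

```cpp
#include <cstdint>
#include <set>

// Hypothetical model of the memory operand of an LEA.
// A register id of 0 means "no such register" in this sketch.
struct MemOperand {
    int base_reg;         // id of the base register, 0 if absent
    int index_reg;        // id of the index register, 0 if absent
    std::uint32_t scale;  // 1, 2, 4, or 8
    std::int64_t disp;    // signed displacement
};

// base + (index * scale) + displacement, with 64-bit wraparound.
std::uint64_t effective_address(const MemOperand& op,
                                std::uint64_t base_val,
                                std::uint64_t index_val) {
    std::uint64_t ea = static_cast<std::uint64_t>(op.disp);
    if (op.base_reg != 0) ea += base_val;
    if (op.index_reg != 0) ea += index_val * op.scale;
    return ea;
}

// Assumed policy: the computed address is tainted if any register
// feeding the calculation is tainted.
bool address_is_tainted(const MemOperand& op, const std::set<int>& tainted) {
    return (op.base_reg != 0 && tainted.count(op.base_reg) > 0) ||
           (op.index_reg != 0 && tainted.count(op.index_reg) > 0);
}
```

Keeping the taint check separate from the address calculation means the check needs only the register ids, which are available at instrumentation time, while the address itself needs run-time register values.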

Mos Moh
  • The segment_base is not part of the "effective address"; LEA just does the offset part of the calculation. (Also the segment register *value* isn't the same as the base. But anyway that doesn't matter because LEA doesn't touch segmentation at all, in any mode.) – Peter Cordes Feb 22 '21 at 07:45
  • @PeterCordes I modified it. Thanks. – Mos Moh Feb 22 '21 at 10:04
  • There's a corner case that will never come up in normal code because it's inefficient and pointless, but is possible if someone is trying to fool a PIN tool: `lea rax, [edi + edx]` or similar (REX.W and 32-bit address size overrides). The effective-address calculation will be truncated to 32-bit even though the destination register is 64. So if you want to know the operand values, you might also need to check the address size to get the correct address (even if PIN register numbers already distinguish 32 vs. 64-bit registers.) – Peter Cordes Feb 22 '21 at 10:55
  • @PeterCordes Can this case be produced in binaries by ordinary compilers and then fed to the PIN tool for instrumentation, or be generated by the Intel Pin JIT? – Mos Moh Feb 22 '21 at 14:35
  • No sane compiler will ever generate that, and hopefully even novice humans would at worst do `lea eax, [edi+edx]` (instead of the optimal `lea eax, [rdi+rdx]`). If your PIN tool is trying to check / enforce something about code running under it, 32-bit address-size overrides might be a way for malicious code to fool it, but it's not something you normally have to worry about. (For non-LEA instruction, `gcc -mx32` will often use 32-bit address size for loads / stores to make sure address math doesn't overflow the low 32 bits, e.g. with a negative int that's not zero-extended to 64-bit.) – Peter Cordes Feb 22 '21 at 23:59