Is there any static tool to analyze assembly/objdump and detect instructions that access the same memory location?

For instance, consider the following C code, where the functions main and f access the same heap object, pointed to by heapobj.

#include <stdlib.h>

void f(int *p) {
    *p = *p + 1;
}

int main() {
    int *heapobj;
    heapobj = (int *)malloc(sizeof(int));
    *heapobj = 666;

    f(heapobj);

    return 0;
}

The object dump for this code is shown below. Instruction #401171 in main writes to the heap location pointed to by heapobj, and instructions #40113c and #401145 in f read from and write to that same heap location, respectively.

I need a static tool that can look at the objdump/assembly and tell me:

"Hey! Instructions #401171, #40113c, and #401145 access the same memory location!"

Any suggestion is greatly appreciated, including a possible tool that works only for heap/stack objects.

0000000000401130 <f>:
401130: 55                      push   %rbp
401131: 48 89 e5                mov    %rsp,%rbp
401134: 48 89 7d f8             mov    %rdi,-0x8(%rbp)
401138: 48 8b 45 f8             mov    -0x8(%rbp),%rax
40113c: 8b 08                   mov    (%rax),%ecx     <<<<<
40113e: 83 c1 01                add    $0x1,%ecx
401141: 48 8b 45 f8             mov    -0x8(%rbp),%rax
401145: 89 08                   mov    %ecx,(%rax)     <<<<<
401147: 5d                      pop    %rbp
401148: c3                      retq   
401149: 0f 1f 80 00 00 00 00    nopl   0x0(%rax)

0000000000401150 <main>:
401150: 55                      push   %rbp
401151: 48 89 e5                mov    %rsp,%rbp
401154: 48 83 ec 10             sub    $0x10,%rsp
401158: c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
40115f: bf 04 00 00 00          mov    $0x4,%edi
401164: e8 c7 fe ff ff          callq  401030 <malloc@plt>
401169: 48 89 45 f0             mov    %rax,-0x10(%rbp)
40116d: 48 8b 45 f0             mov    -0x10(%rbp),%rax
401171: c7 00 9a 02 00 00       movl   $0x29a,(%rax)       <<<<<
401177: 48 8b 7d f0             mov    -0x10(%rbp),%rdi
40117b: e8 b0 ff ff ff          callq  401130 <f>
401180: 31 c0                   xor    %eax,%eax
401182: 48 83 c4 10             add    $0x10,%rsp
401186: 5d                      pop    %rbp
401187: c3                      retq   
401188: 0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
40118f: 00

Farzam
  • 4
    Static analyzer can't know it in the general case. – Eugene Sh. Apr 13 '22 at 21:35
  • 1
    @EugeneSh. What would be the limitation on doing such an analysis statically? I thought it would be possible with a thorough analysis. And do you know of any dynamic tool that can do so? – Farzam Apr 13 '22 at 21:45
  • 3
    In the general case, a static tool cannot determine the actual execution path (the call chain and the flow through any given call), in part because some of that is determined by input to the program, to which a static tool has no access. Since it cannot fully analyze the flow of the program, it cannot analyze all memory accesses for such [aliasing](https://en.wikipedia.org/wiki/Aliasing_(computing)). Also see type-based alias analysis. – Erik Eidt Apr 13 '22 at 23:20
  • You could use a debugger or trainer that logs all reads and writes to a specific memory location. – rcgldr Apr 13 '22 at 23:43
  • In general, this is not possible as stated. You would have to analyze the binary through all code execution paths and, more importantly, for all inputs: user input, files, and so on. Take an image manipulation program: are you going to analyze every image that has been or ever will be created to ensure every code path is covered? You can find some accesses, but expect not to find all of them. The same goes for the just-run-it approach. Then there is the build: optimization levels and debug versus release builds produce different binaries that touch different addresses for the same high-level code. – old_timer Apr 14 '22 at 00:01
  • So you have to manage your expectations, and if that is adequate, then sure. – old_timer Apr 14 '22 at 00:02
  • Further, the static analysis tool may be able to use alias analysis to identify potential accesses to the same memory location, but in the general case, static analysis can result in many false positives that don't actually happen at runtime. Compilers routinely do this kind of conservative static analysis and use that to inform potential applicability/inapplicability for the various optimizations they perform. – Erik Eidt Apr 14 '22 at 18:36
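
To make the point in the comments above concrete, here is a minimal sketch (not from the question) of a variant where whether the accesses in f touch the same location as an earlier store depends on a run-time value. The helper run and its parameter choice are hypothetical names introduced only for illustration; f is the function from the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Same f as in the question: one load and one store through p. */
void f(int *p) {
    *p = *p + 1;
}

/*
 * Hypothetical helper, not part of the question. `choice` stands in for
 * any value known only at run time (user input, file contents, ...).
 * Returns the final value of the first heap object.
 */
int run(int choice) {
    int *a = malloc(sizeof *a);
    int *b = malloc(sizeof *b);
    assert(a != NULL && b != NULL);

    *a = 666;                 /* store to object a */
    *b = 0;                   /* store to object b */

    int *p = choice ? a : b;  /* which object p points to is input-dependent */
    f(p);                     /* touches a only when choice != 0 */

    int result = *a;
    free(a);
    free(b);
    return result;
}
```

Called as run(1), the load and store inside f hit the same location as the store to a; called as run(0), they do not. A sound static tool looking only at the binary would have to report that the instructions in f may alias both stores, which is the kind of conservative answer (with possible false positives) described in the comments.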

0 Answers