I think what you need is automatic instrumentation and/or profiling. GCC can do profile-guided optimization for you, as well as other kinds of instrumentation; the documentation even mentions a hook for implementing your own custom instrumentation.
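As a rough illustration of that custom hook: when you compile with `-finstrument-functions`, GCC calls `__cyg_profile_func_enter` and `__cyg_profile_func_exit` around every instrumented function. Here is a minimal sketch that only counts calls per function address. The table, `MAX_FUNCS`, `calls_to` and `dump_counts` are names I made up for the example, and it assumes a single-threaded program.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_FUNCS 64  /* hypothetical table size, chosen for the example */

struct call_count {
    void *fn;              /* address of the instrumented function */
    unsigned long calls;   /* number of times it was entered */
};

static struct call_count table[MAX_FUNCS];
static int table_len;

/* Hook signatures as documented by GCC for -finstrument-functions.
   The hooks must not be instrumented themselves, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
unsigned long calls_to(void *fn) __attribute__((no_instrument_function));
void dump_counts(void) __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    assert(table_len <= MAX_FUNCS);
    for (int i = 0; i < table_len; i++) {
        if (table[i].fn == this_fn) {
            table[i].calls++;
            return;
        }
    }
    if (table_len < MAX_FUNCS) {   /* silently drop functions past the limit */
        table[table_len].fn = this_fn;
        table[table_len].calls = 1;
        table_len++;
    }
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;   /* nothing to do on exit in this sketch */
}

/* Hypothetical helper: how many times was the function at `fn` entered? */
unsigned long calls_to(void *fn)
{
    for (int i = 0; i < table_len; i++)
        if (table[i].fn == fn)
            return table[i].calls;
    return 0;
}

/* Hypothetical helper: print raw addresses, to be matched against the map file. */
void dump_counts(void)
{
    for (int i = 0; i < table_len; i++)
        printf("%p %lu\n", table[i].fn, table[i].calls);
}
```

You would link this into the program, build everything with `-finstrument-functions`, and call `dump_counts()` before exit. Note that this counts function entries, not accesses to individual variables, so it only narrows down where to look.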
There are several performance analysis tools out there, such as the perf and gprof profilers.
Also, execution inside a virtual machine could (at least in theory) do what you are after. valgrind comes to mind. I think valgrind actually knows about all memory accesses. I'd look for ways to obtain this information (and then correlate it with the map files).
I don't know if any of the above tools solves exactly your problem, but you could definitely use, say, perf (if it's available for your platform) to see in which areas of the code significant time is spent. Those areas probably contain either a lot of expensive memory accesses or just intensive computation; you can figure out which is the case by staring at the code.
Note that the compiler already allocates frequently accessed variables to registers, so the kind of information you are after won't give you an accurate picture. That is, while some variable might be accessed a lot, cache-allocating it might not improve things much if its value already lives in a register most of the time.
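To make the register point concrete, here is a small sketch (function names are made up for the example). Both functions touch the accumulator once per iteration at the source level, but with optimization enabled the compiler will typically keep the plain one in a register, while `volatile` forces a real load and store each time.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary local accumulator: an optimizing compiler will usually keep it
   in a register for the whole loop, so it generates no memory traffic. */
long sum_plain(const int *data, size_t n)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += data[i];
    return acc;
}

/* volatile accumulator: every iteration must read and write it in memory. */
long sum_volatile(const int *data, size_t n)
{
    volatile long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += data[i];
    return acc;
}

/* Both versions compute the same value; only the memory accesses differ. */
void check_same_result(const int *data, size_t n)
{
    assert(sum_plain(data, n) == sum_volatile(data, n));
}
```

Comparing the output of `gcc -O2 -S` for the two functions should show the difference. A per-variable access count taken at the source level would treat them the same.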
Also consider that optimization greatly affects your program at the assembly level, so any performance statistics such as memory access counters will differ with and without optimization, and the optimized case is the one that should interest you. On the other hand, recovering which location corresponds to which variable is harder in the optimized program, if it is feasible at all.