I have a program whose basic structure is as follows:
#include <stdio.h>   /* plus any other C language headers */
#include <stdlib.h>

int main(void) {
    /* some malloc() allocations and file reads into these buffers */
    /* call to an assembly-language routine that needs to be optimized to the maximum */
    /* write the output back to files and free() the buffers */
    exit(EXIT_SUCCESS);
}
The assembly-language routine essentially computes the checksum of the data in the buffer, and my intention is to optimize it as far as possible. It makes no system calls and no library function calls.
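To make the structure concrete, here is a minimal sketch in C. The names `checksum_ref` and `run_once` are hypothetical, and the plain additive checksum is only a stand-in: the real routine is written in assembly and its algorithm is not shown in this post. The buffer is filled in memory instead of being read from a file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the assembly routine: a plain 32-bit
   additive checksum. The real algorithm is not specified here. */
uint32_t checksum_ref(const uint8_t *buf, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += buf[i];
    return sum;
}

/* Mirrors the shape of main(): allocate, fill, call the hot routine, free.
   Returns 0 on success and -1 if the allocation fails. */
int run_once(size_t len, uint8_t fill, uint32_t *out) {
    assert(out != NULL);
    uint8_t *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, fill, len);          /* stands in for the file reads */
    *out = checksum_ref(buf, len);   /* the only call I want profiled */
    free(buf);
    return 0;
}
```

In this sketch, everything in `run_once` other than the `checksum_ref` call corresponds to the preparatory C code whose profile data I would like to exclude.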
I have just installed the Intel VTune Amplifier XE suite into Visual Studio 2015.
How do I tell VTune to focus strictly on the assembly-language routine and skip all analysis of the preparatory C code? At the moment I seem to get all the data accumulated over the whole program, such as instruction count and CPI. Is it possible to get the data only for the loops and branches within the assembly-language subroutine? If so, please advise how I could do that.
Thanks