I want to use the VTune Profiler APIs to profile code running on a Xeon Phi (Linux, using offload execution) and collect metrics such as the number of instructions executed and the number of L1 cache misses. However, I can't find any documentation explaining how to use this library.
Where are the library files and include files located on Linux? And how do I write code to profile a short piece of code running on the Xeon Phi?
I would expect something like this:
// this code will be executed on the host processor
Read_counters();
Code_to_run_on_Xeon_Phi();
Stop_counters();
Print_results();
Thanks