Use this tag for questions about Intel® VTune™ Profiler, a performance profiler for finding and optimizing bottlenecks across CPU, GPU, and FPGA systems.
Questions tagged [intel-vtune]
182 questions
2
votes
0 answers
Intel VTune takes a long time to collect information
When I use VTune to collect information about a process, I only need the results for one particular DLL (let's say X.dll). But when the process has finished running and VTune is in the information-collection stage, one DLL (let's say Y.dll) will…

amilamad
- 470
- 6
- 9
2
votes
0 answers
Big difference between Elapsed Time and CPU Time
VTune hotspots analysis reports that my program's execution time (elapsed time) was 60 seconds, of which only 10 seconds are reported as "CPU Time". I'm trying to find out where the remaining 50 seconds were spent. Using Windows Process Monitor's File System…

DigitalEye
- 1,456
- 3
- 17
- 26
2
votes
2 answers
How to disassemble compiler-generated code?
I would like to see the disassembled code in the same order that the compiler generates after instruction rescheduling. By the way, I am using GDB, and when I run the command disas /m FunctionName it gives me disassembled code in the order of source…

quantumshiv
- 97
- 10
2
votes
1 answer
Is it possible to do multiple runs in Intel VTune Amplifier XE?
Is there a way to run the same test (for example, Lightweight Hotspots) multiple times in Intel VTune Amplifier XE? It is annoying to click through multiple steps to perform a single test. I have looked through the documentation, but found nothing.
Thanks!

newprint
- 6,936
- 13
- 67
- 109
2
votes
1 answer
Understanding VTune report
This is a follow-up to an existing thread (http://stackoverflow.com/questions/12724887/caching-in-a-high-performance-financial-application). I found that it's not the cache that hinders my application. To cut a long story short, I have an…

Daniel Bencik
- 959
- 1
- 8
- 32
2
votes
2 answers
Is it possible to use vtune on certain code snippets in a binary and not an entire binary?
I am adding usage of a small library to a large existing piece of software and would like to analyze (in finer detail than just in-and-out rdtsc() or gettimeofday() calls) the overhead of the small library and its attribution. Using things like rdtsc()…

Palace Chan
- 8,845
- 11
- 41
- 93
2
votes
1 answer
Is there any way to make the Time Profiler Instrument more effective with large functions?
I'm currently using Xcode's Time Profiler Instrument for iOS. One function is extremely large. Yes, splitting it up into much smaller inline ones would be far more intelligent. However, is there a way to fake stack levels or get the instrument to…

Mike Weir
- 3,094
- 1
- 30
- 46
1
vote
1 answer
Does memory get allocated with Fortran read() with no I/O list?
I am reading an ASCII file with Intel Fortran, opened as:
open(10, file=trim(file_name), status='old', action='read', iostat=ierr, iomsg=msg)
To skip some file lines which I do not want to store I am using read() with no I/O list:
read(10, *)
VTune…

Vitaliy
- 75
- 1
- 8
1
vote
0 answers
Getting executable to run in a separate terminal while profiling in Ubuntu Linux
I have the following code:
#include <stdio.h>
int main() {
    for (size_t i = 0; i < 100000; i++)
    {
        printf("%zu ", i);
        if (i == 10)
            getchar();
    }
}
When this executable is built and profiled in Windows with…

Tryer
- 3,580
- 1
- 26
- 49
1
vote
1 answer
VTune H/W Event-Based Sampling: no call stack information
Same program, same environment. When I use User-Mode Sampling, I get this result with call stack information.
But when I use H/W Event-Based Sampling, I get a result like this.
The VTune Binary/Symbol Search setting is the same in both modes.
Is default H/W…

charlesJKing
- 21
- 3
1
vote
1 answer
VTune 2022.2 fails to load PDB symbols for .NET 6 console application
I am attempting to use Intel VTune to profile a .NET 6 console application. I am following the example from the Intel website.
You can find the repo here.
I have VTune 2022.2 installed and I am running on Windows 10 Pro Version 21H2 Build 19044.1706.
I…

Matthew Crews
- 4,105
- 7
- 33
- 57
1
vote
1 answer
Intel oneAPI setvars.sh not set permanently (Ubuntu)
I am struggling with the usage of Intel oneAPI, specifically the compiler (DPC++/C++) and VTune Profiler.
I've installed everything successfully, ran source setvars.sh in the installation directory, and everything worked fine until I closed the…

Dr. Ske
- 69
- 6
1
vote
1 answer
Effective total time for a callee function is higher than that of caller function in intel-vtune
I have a multithreaded application, and when I run VTune Profiler on it, under the Caller/Callee tab, I see that the callee function's CPU Time: Total - Effective Time is larger than the caller function's CPU Time: Total - Effective Time.
E.g.
caller…

yashC
- 887
- 7
- 20
1
vote
1 answer
VTune did not launch application correctly
When profiling remotely, Intel VTune seems unable to start the application correctly.
I configured a .sh script as the target in my VTune launch application configuration.
And amplex-python shows that the script launched successfully, but not the app. Why? amplex-python…

TCJ
- 11
- 1
1
vote
0 answers
Perf metric / event to estimate row-buffer locality
I'm aware that there are hardware cache events, and that there are also some events that give the number of high-latency memory requests (greater than 16, 32, 64, or 128 cycles).
I wanted to know if it makes sense to use any such metric to estimate/guess the…

Harsh Kumar
- 97
- 6