Questions tagged [intel-vtune]

Use this tag for questions about Intel® VTune™ Profiler, a performance profiler for finding and fixing performance bottlenecks across CPU, GPU, and FPGA systems.

182 questions
2
votes
1 answer

Intel VTune Profiler shows __mulq is a computationally expensive function in Fortran code

I'm trying to perform an audit on a rather complicated multi-physics model I'm working on and have been using Intel VTune Profiler to identify expensive subroutines. The most expensive function is a function called __mulq which is not something…
2
votes
1 answer

Intel VTune Profiler: Remote Profiling with Sudo

I'm using Intel VTune to profile a remote application that requires sudo access on another machine. I have previously profiled remote applications on that machine that do not require sudo access, but Intel VTune is not working for…
MUAS
  • 519
  • 1
  • 7
  • 20
2
votes
2 answers

amplxe-sepreg.exe missing from VTune

When I try to use hardware event-based profiling in VTune (Profiler 2020), I get the error message Cannot enable Hardware Event-based Sampling due to a problem with the driver (sep*/sepdrv*). Check that the driver is running and the driver group is…
me'
  • 494
  • 3
  • 14
2
votes
0 answers

Why is 'add' taking so long in my application?

I'm profiling an application using Intel VTune, and there is one particular hotspot where I'm copying a __m128i member variable in the copy constructor of a C++ class. VTune gives this breakdown: Instruction CPU Time: Total …
Thomas Johnson
  • 10,776
  • 18
  • 60
  • 98
2
votes
3 answers

How do I generate symbol information to use with the Linux version of Intel's VTune Amplifier?

I am using Intel VTune Amplifier XE 2011 to analyze the performance of my program. I want to be able to view the source code in the analysis results, and the documentation says I need to provide the symbol information. Unfortunately, it does not…
Dylan Klomparens
  • 2,853
  • 7
  • 35
  • 52
2
votes
1 answer

OpenCV build in debug mode with optimizations?

I'm trying to profile OpenCV using Intel VTune Amplifier. On this page, there is a list of compiler options suggested to obtain the best analysis. As you can see, it's a mix of debug flags (e.g. -g) and optimization flags (e.g. -O2 or higher), so we…
justHelloWorld
  • 6,478
  • 8
  • 58
  • 138
2
votes
2 answers

Interpreting Intel VTune's Memory Bound Metric

I see the following when I run Intel VTune on my workload: Memory Bound: 50.8%. I read the Intel doc, which says: Memory Bound measures a fraction of slots where pipeline could be stalled due to demand load…
Frank
  • 4,341
  • 8
  • 41
  • 57
2
votes
1 answer

Profile a C++ program based on wall-clock time with Intel VTune Amplifier

I've just started using Intel VTune Amplifier XE, and it looks like it only measures CPU time by default. Is it possible to set up VTune to get results based on wall-clock time (real time)? My goal is to find hotspots from disk I/O operations.
Narek Atayan
  • 1,479
  • 13
  • 27
2
votes
1 answer

How to restrict VTune analysis to a specific function

I have a program whose basic structure is as follows: main() { some malloc() allocations and file reads into these buffers call to an assembly language routine that needs to be optimized to the maximum write back the…
quasar66
  • 555
  • 4
  • 14
2
votes
1 answer

Difficulty understanding memory addresses in Intel's VTune tool

In the image above, I used the VTune tool to see the process's flow. I also dumped memory for WinDbg. I wanted to check the disassembled section at Engine.dll+840c1 in WinDbg, but the result seems to be different. Can you tell me what I'm doing wrong?
백경훈
  • 101
  • 4
2
votes
3 answers

VTune - no symbols available

I have used VTune several times in the past, usually without too much trouble. Unfortunately the gaps between each use are often so long that I forget some aspects of how to use it each time. I know that the line number and symbol information needs…
Mick
  • 8,284
  • 22
  • 81
  • 173
2
votes
1 answer

VTune total time in MKL function

I am working on a university project that asks me to give a breakdown of some tridiagonal eigensolvers implemented in MKL (11.1). So I implemented a testbed for that, and now I am trying to profile it in VTune (Intel VTune Amplifier XE 2013…
yomar
  • 21
  • 2
2
votes
1 answer

Performance Measurement - Get average call time per function in Intel VTune Amplifier

I'm simply trying to get the average time it takes each function to run. That is, I want: "Total time inside the function" / "Number of calls to the function". I'm getting all sorts of information when I run an analysis from within VTune. These…
ZivS
  • 2,094
  • 2
  • 27
  • 48
2
votes
1 answer

FLOP measurement

I'm trying to estimate FLOPS for my application using Intel VTune Amplifier, and I'm using this post as a guideline: https://software.intel.com/en-us/articles/estimating-flops-using-event-based-sampling-ebs/ The problem is that I can't find the…
M_rr113
  • 29
  • 3
2
votes
1 answer

start_thread clone taking most of the time in parallel program - bad parallelization or wrong report?

I'm currently working on parallelizing a C++ program in order to improve its performance on multi-core systems. Using OpenMP and considering the challenges (thread synchronization, data accesses, etc.) we finally found a way to make the entire…
leosh
  • 878
  • 9
  • 22