Questions tagged [intel-vtune]

Use this tag for questions about Intel® VTune™ Profiler, an advanced performance profiler for finding and fixing performance bottlenecks across CPU, GPU, and FPGA systems.

182 questions
0 votes, 1 answer

How does thread waiting affect the execution time of the program?

In my C++ program, I am using boost libraries for parallel programming. Several threads are made to join() on other threads in a part of the program. The program runs pretty slow for some inputs... In an attempt to improve my program, I tried…
progammer
0 votes, 1 answer

How to collect hardware events of ArangoDB with profiling tool

On an Ubuntu 14.04 server (4.4.0-62-generic) with an Intel Xeon CPU E5-2698 v4, I am trying to collect hardware event counts for ArangoDB with Intel VTune. But as soon as I start collecting, the server dies. I think the reason is that ArangoDB is…
0 votes, 1 answer

Can I still profile my code when the load exceeds the cores?

Sometimes I need to profile an application while simultaneously needing to fire off a large number of unrelated calculations. Often I will launch off multiple jobs so that the load exceeds the number of cores so that I can just come back sometime…
EMiller
0 votes, 0 answers

Intel VTune CPU OpenCL Command Queue

I can view the Intel HD Graphics Command Queue with VTune, but I cannot view the CPU Command Queue. Why? Is it the expected behavior to capture only GPU "events" but not those from the CPU that are independent of the GPU? The same OpenCL program (a…
user3819881
0 votes, 0 answers

MPI4py profiling with VTune

I have an MPI Python application that I am trying to profile using VTune. Since I am running my application on an HPC system, I have to use a terminal. I tried several times and keep getting the following error: amplxe: Error: Failed to attach to the…
neiron21
0 votes, 1 answer

How should I interpret these VTune results?

I'm trying to parallelize this code using OpenMP. OpenCV (built using IPP for best efficiency) is used as an external library. I'm having problems with unbalanced CPU usage in parallel fors, but it seems that there is no load imbalance. As you will see,…
justHelloWorld
0 votes, 2 answers

Difficulties in understanding the assembly code of '__atomic_compare_exchange'

I program in C++ and use CAS operations for thread synchronization. I profiled my program with VTune and found that a huge portion of time was spent on the CAS operation. I took a look at the assembly code. The profiling result shows that the…
syko
0 votes, 0 answers

Cannot locate debugging symbols and a lot of idle CPU usage

I'm new to VTune Amplifier and I'm trying to profile OpenCV with a very basic application. Following this guide on recommended compiler options, I compiled OpenCV via CMake with CMAKE_BUILD_TYPE=RelWithDebInfo and -DWITH_OPENMP=ON so both -O2 and -g…
justHelloWorld
0 votes, 1 answer

Error when comparing two Intel VTune Amplifier analyses?

I'm following this video tutorial (on Linux) about VTune Amplifier and I've followed every step, but when he compares the two basic analyses, I get this error: How can I solve this?
0 votes, 0 answers

Intel VTune Results Understanding - Naive Questions

The application I want to speed up performs element-wise processing of a large array (about 1e8 elements). The processing procedure for each element is very simple, and I suspect that the bottleneck could be DRAM bandwidth rather than the CPU. So I decided to…
0 votes, 1 answer

Multi-threaded performance issues

I have a multi-threaded program that uses our own implementation of a thread pool. First, the workload is large enough: compared to a single thread, the program with two threads is faster. When we increase the number of threads beyond 2,…
ballontt
0 votes, 1 answer

Profiling OpenCL application on Windows with NVIDIA GPU

Can you help me? I'm developing an OpenCL application on Windows 7 x64. The hardware is an Intel Core i5 and an NVIDIA GTX 770. OpenCL uses the NVIDIA GPU for acceleration. When I try to use Intel VTune Amplifier XE 2015, my application hangs at the end of profiling and…
Mike
0 votes, 2 answers

System profiling: usage information for shared libraries

Is there any way to know which library files are being used by which process (or by how many processes) over some period of time? Can VTune, perf, or OProfile be used for this?
Arjun Bora
0 votes, 1 answer

How to measure Windows API code coverage of app level benchmarks

My job involves system-level performance testing with third party tools that I do not have sources for. I'm also testing Windows, and can use debugging symbols but not Windows source code. I'd like a quantitative way to describe the areas of the…
Aaron Altman
0 votes, 1 answer

Profiling with Intel VTune Amplifier

I have created a filter DLL using some static libs; the DLL is used in Graph Studio and runs fine. But I need to profile my DLL, so I started Graph Studio and then VTune. In the VTune project properties I attached it to the process…
Mohan