
I am new to VTune Amplifier and am developing an Android application for an Intel Atom processor. I am using Intel VTune Amplifier 2014 for Android Systems. When I try to run profiling, I get the following error:

amplxe: Error: Cannot enable Hardware Event-based Sampling: problem with the driver (sep*/sepdrv*). Check that the driver is running and the driver group is in the current user group list. See "Building and Managing the Sampling Driver" help topic for further details.

I have tried to follow the steps given in this thread, in particular the comments by "Peter Wang (Intel)": https://software.intel.com/en-us/forums/topic/372533#comment-1791207

I am running VTune Amplifier with Eclipse on Windows.

I was not able to interpret Mr. Wang's comments in the linked thread about reinstalling the drivers.

Can somebody please explain how to resolve the issue?

NOTE: I am more of a Windows person, so Windows-friendly steps would be a great help to me.

Thanks in advance for any help in resolving this issue.

Harrisson

1 Answer


The error message indicates that you are missing the drivers required for an Advanced Hotspots analysis. These drivers are needed because the analysis uses dedicated hardware inside the CPU called the Performance Monitoring Unit (PMU). Production devices that you can buy in a store are unlikely to have these drivers preinstalled.

You have the following options to work around this:

  1. Use the Basic Hotspots analysis. This works without special drivers on any Intel-based Android device. If your device is not rooted, you also need to configure the application in debug mode. There is an article available: Using Intel® VTune™ Amplifier 2014 for Systems on non-rooted Android* devices.
  2. If you are using a "Dell Venue 8", you can turn this device into a developer device. This basically means you flash it with a special firmware that contains the drivers for VTune and is also rooted by default. With such a configuration you can also do an Advanced Hotspots analysis. Instructions can be found here: How to use Intel® VTune™ Amplifier 2014 for Systems on a Dell Venue 8
  3. Rebuild the drivers required for the Advanced Hotspots analysis yourself. This is only possible if you have a device with an open boot loader and the sources to rebuild the kernel. In general I wouldn't recommend this option unless you are working for a device manufacturer. If you are interested in this option, let me know and I will add an explanation of it.

Out of curiosity: which device are you using, and what kind of application are you trying to profile?

Alexander Weggerle
  • Thanks for the detailed reply. I am working on a video codec application. Moving it from C to SSE gives a gain of approximately 200% on my Intel i7. I have made a few changes and am trying to run the same code on Intel Atom processors. The problem is that I see a serious performance degradation when I switch from C to SSE, both on the x86 emulator and on Lenovo and ASUS phones with Atom processors. To find the cause, I am trying to profile my SSE code with VTune Amplifier. Your comments are a great help to us. – Harrisson Jun 16 '14 at 15:35
  • At present I don't have any hardware. Is it possible to profile the code on the x86 emulator itself (I noticed the performance degradation from C to SSE on the emulator as well)? You seem to be very well versed in Intel architecture. Can you suggest some papers or links on optimizing video applications for the Intel Saltwell architecture used in Atom processors, for example cache optimization or intrinsics optimization for Atom? – Harrisson Jun 16 '14 at 15:36
  • I would not trust measurements done in an emulator. If you use `HAXM`, you will measure the execution speed of your host CPU. Without HAXM, you measure how well the QEMU emulator translates the code. I usually use the `Intel® Architecture Code Analyzer` for such measurements. Unfortunately, it doesn't support Atom processors. – Alexander Weggerle Jun 17 '14 at 08:01
  • How did you write your SSE code: with intrinsics or inline assembler? If you used inline assembler, I guess the in-order architecture is the issue. Rewriting it with intrinsics would be a good idea, because the compiler can then apply optimizations such as reordering the code. – Alexander Weggerle Jun 17 '14 at 08:04
  • I have written my entire code in intrinsics. Do you have any articles or documents on changes that are bound to give a performance gain on the Intel Atom architecture? I know about the tools offered by Intel, but are there any specific optimizations that are known to offer a performance boost? – Harrisson Jun 21 '14 at 05:32
  • There is the [optimization manual](https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf), which has a dedicated section about Atom. This is rather low level. There are also two good blog posts from the VirtualDub team about optimization for Atom that I would recommend reading: http://virtualdub.org/blog/pivot/entry.php?id=286 , http://virtualdub.org/blog/pivot/entry.php?id=287 – Alexander Weggerle Jun 23 '14 at 07:11
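
For readers following the intrinsics-versus-inline-assembler point in the comments, here is a minimal sketch of a kernel written with SSE2 intrinsics next to its plain C reference. It is not the asker's codec code: the sum-of-absolute-differences kernel and the function names `sad_scalar` and `sad_sse2` are assumptions chosen for illustration only. Because the kernel is expressed in intrinsics, the compiler remains free to schedule and reorder the instructions for the target CPU. Whether it actually outperforms the scalar version on a given Atom device is something to measure on real hardware.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Plain C reference: sum of absolute differences over n bytes. */
uint32_t sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        int d = (int)a[i] - (int)b[i];
        sum += (uint32_t)(d < 0 ? -d : d);
    }
    return sum;
}

/* Illustrative SSE2 version (hypothetical name, not from the thread).
 * Processes 16 bytes per iteration and finishes the tail in scalar code.
 * Unaligned loads are used for simplicity; this is an assumption of the
 * sketch, not a tuning recommendation. */
uint32_t sad_sse2(const uint8_t *a, const uint8_t *b, size_t n)
{
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;

    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* _mm_sad_epu8 yields two partial sums, one per 64-bit lane. */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }

    /* Add the low and high 64-bit lanes (truncated to 32 bits). */
    uint32_t sum = (uint32_t)_mm_cvtsi128_si32(acc)
                 + (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(acc, 8));

    /* Scalar tail for the remaining n % 16 bytes. */
    for (; i < n; ++i) {
        int d = (int)a[i] - (int)b[i];
        sum += (uint32_t)(d < 0 ? -d : d);
    }
    return sum;
}
```

Keeping the scalar reference beside the vector version makes it possible to check correctness first and then profile both variants on the device, for example with the Basic Hotspots analysis mentioned in the answer.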