I intend to profile the Community Earth System Model (CESM) on a cluster of 8 nodes. I can profile the application successfully with HPCToolkit, but I get only two metrics: CPU Time (I) and CPU Time (E). I am interested in other metrics, such as the number of function calls and wall clock time. How do I extract such metrics using HPCToolkit?
Here is the other relevant information:
1) System Information
OS/Architecture:
[nitin@master ~]$ uname -a
Linux master.ipoc.org 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Compiler: I am using the Intel family of compilers.
[nitin@master ~]$ icc -V
Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.2.164 Build 20150121
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
PAPI: I have not installed PAPI, primarily because I hit an error during the make step of the installation. I am guessing it is caused by the libpfm issue described in http://icl.cs.utk.edu/papi/faq/#264
Java:
[nitin@master ~]$ java -version
java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (rhel-2.3.4.1.el6_3-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)
2) HPCToolkit: I appear to be using HPCToolkit version 5.3.2 [2012.09.21], which is the latest revision listed in README.Releasenotes.
3) Profiled Application: The application is the Community Earth System Model (CESM), a complex code with several components spread over many source files, written primarily in Fortran. I am not using hpclink; I invoke hpcrun directly in the mpirun command, so the application appears to be dynamically linked. I run the code on a cluster of 8 nodes, each with 16 cores.