
I intend to profile the Community Earth System Model (CESM) on a cluster of 8 nodes. I am able to profile the application successfully using HPCToolkit.

However, I get only two metrics: CPU Time (I) and CPU Time (E). I am interested in metrics such as the number of function calls and wall-clock time. How do I extract such metrics using HPCToolkit?

This is the other information required:

1) System Information

OS/Architecture:

[nitin@master ~]$ uname -a
Linux master.ipoc.org 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Compiler: I am using the Intel family of compilers.

[nitin@master ~]$ icc -V
Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.2.164 Build 20150121
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.

PAPI: I have not installed PAPI, primarily because I ran into an error during the make step of the installation. I am guessing it is caused by the libpfm issue described at http://icl.cs.utk.edu/papi/faq/#264

Java:

[nitin@master ~]$ java -version
java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (rhel-2.3.4.1.el6_3-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)

2) HPCToolkit: It looks like I am using HPCToolkit version 5.3.2 [2012.09.21]. This is the latest revision listed in README.Releasenotes.

3) Profiled Application: The application is a complex one called the Community Earth System Model (CESM). It has several components spread over many source files, and the code is primarily in Fortran. I run it on a cluster of 8 nodes (each with 16 cores). I am not using hpclink; I invoke hpcrun directly in the mpirun command, so it looks like the application is dynamically linked.

1 Answer


HPCToolkit will not give you the number of function calls. It is a sampling-based profiler, not a "log everything" profiler.

If you need exact function call counts, you will have to instrument the code or use a tool built to answer that question, such as gprof or callgrind (although doing that for HPC applications is neither easy nor fast). For a code like CESM you would probably not gain anything from using those.
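To illustrate what "instrumenting the code" means, here is a minimal sketch in Python (not Fortran, and not how gprof or callgrind are implemented). The names `count_calls`, `flux`, and `step` are hypothetical and exist only for this example. Every call passes through a wrapper that increments a counter, which is why the counts are exact and also why the overhead grows with the number of calls. A sampling profiler only inspects the program at intervals, so it never observes individual calls.

```python
import functools
from collections import Counter

# Exact per-function call counts, filled in by the wrapper below.
call_counts = Counter()

def count_calls(func):
    """Wrap func so that every invocation is recorded (instrumentation)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__name__] += 1  # extra work paid on every call
        return func(*args, **kwargs)
    return wrapper

@count_calls
def flux(x):
    # Hypothetical stand-in for a small, frequently called routine.
    return 0.5 * x

@count_calls
def step(n):
    # Hypothetical driver that calls flux once per element.
    return sum(flux(i) for i in range(n))

total = step(1000)
print(call_counts["step"], call_counts["flux"])
```

After the run, `call_counts` holds one entry for `step` and 1000 for `flux`. Paying that bookkeeping cost on every call across a large MPI code is what makes this approach slow at scale.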

As for wall-clock time, I expect HPCToolkit can provide it, so I suggest you wait for an answer from the HPCToolkit forum.

David