Difference between revisions of "Guide:POINTTutorial"

From Tau Wiki

Revision as of 22:00, 19 May 2009

A short demo of TAU for use with the POINT Live DVD version 2, built 5/18/09.

Instrumenting NPB

Open a terminal window
cd workshop-point/NPB3.1
vi config/make.def

Notice that the MPIF77 variable is set to tau_f90.sh. This will enable TAU's automatic instrumentation with PDT.

Setting the TAU makefile

Close vi.
setenv TAU_MAKEFILE $TAU/Makefile-tau-mpi-pdt

This tells TAU to perform a basic instrumentation using PDT and the TAU MPI wrapper library. Now build the BT example program:

make bt CLASS=S NPROCS=1

Running the NPB example

cd bin
mpirun -np 1 ./bt.S.1

TAU profiles will automatically be generated in the current directory, one profile file per thread.

ls 
bt.S.1  profile.0.0.0
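
The name profile.0.0.0 follows TAU's profile.<node>.<context>.<thread> naming, so a single-process, single-thread run produces exactly one file. As a minimal illustration, the Python sketch below splits such a name into its three ids. The helper parse_profile_name and the ProfileId type are hypothetical names chosen for this example; they are not part of TAU.

```python
import re
from typing import NamedTuple


class ProfileId(NamedTuple):
    node: int
    context: int
    thread: int


def parse_profile_name(filename: str) -> ProfileId:
    """Split a name such as 'profile.0.0.0' into node, context and thread ids."""
    match = re.fullmatch(r"profile\.(\d+)\.(\d+)\.(\d+)", filename)
    if match is None:
        raise ValueError(f"not a TAU profile file name: {filename!r}")
    node, context, thread = (int(part) for part in match.groups())
    return ProfileId(node, context, thread)


if __name__ == "__main__":
    # The file produced by the single-process run above.
    print(parse_profile_name("profile.0.0.0"))
```

A run with more MPI processes or threads would be expected to leave one such file per node/context/thread combination.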


Viewing TAU profiles

To get a simple summary of the TAU profiles, type:

pprof

This gives you a basic idea of how much time was spent in different routines in the application.

ParaProf

Let's view this profile in TAU's ParaProf profile viewer:

paraprof

ParaProf will load the profile and show a single bar representing Node 0. Each colored subsection represents a different routine in the BT program. The length of a subsection is proportional to the exclusive time spent in that routine.

Right click on the "Node 0" label
Select "Show Thread Bar Chart"

A new window will pop up ordering each routine by the amount of exclusive time.

Click "Options" -> "Select Metric..." -> "Inclusive"

Now the bars are ordered by inclusive time.
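
Exclusive time counts only the time spent in a routine's own code, while inclusive time also counts the time spent in the routines it calls. The Python sketch below illustrates that relationship on a tiny call tree. The routine names and timings are invented for illustration and are not measurements from BT.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Routine:
    name: str
    exclusive_us: float  # time spent in the routine's own code, in microseconds
    children: List["Routine"] = field(default_factory=list)

    def inclusive_us(self) -> float:
        """Own time plus the inclusive time of every callee."""
        return self.exclusive_us + sum(c.inclusive_us() for c in self.children)


# Hypothetical call tree: main calls setup and solver.
solver = Routine("solver", 700.0)
setup = Routine("setup", 200.0)
main = Routine("main", 100.0, [setup, solver])

# Ordering by exclusive time puts solver first; ordering by inclusive time
# would put main first, because main's inclusive time covers its callees.
for r in sorted([main, setup, solver], key=lambda r: r.exclusive_us, reverse=True):
    print(f"{r.name:8s} exclusive={r.exclusive_us:7.1f} inclusive={r.inclusive_us():7.1f}")
```

This is why the bar order changes when the metric is switched: a top-level routine with little code of its own can have a small exclusive time and a large inclusive time.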

Selective Instrumentation

Click on the "TAU: ParaProf Manager" window
Right click on the trial name: "bin/NPB3.1/...."
Select "Create Selective Instrumentation File"

A window will pop up showing a number of routines. These routines have been flagged by TAU as lightweight routines. Lightweight routines are defined as routines that take less than 10 microseconds per call and are called more than 100,000 times (these parameters can be changed; see the form above). Excluding these routines from instrumentation will help lower the instrumentation overhead.

Click "Save"
Close ParaProf
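
The lightweight-routine rule described above can be sketched in a few lines of Python. This is only an illustration of the two thresholds, not ParaProf's actual implementation. The function names, the sample data, and the choice of total time divided by call count as "time per call" are assumptions. The BEGIN_EXCLUDE_LIST / END_EXCLUDE_LIST markers are the ones used by TAU selective instrumentation files, but the exact routine-name syntax TAU expects may differ from the plain names shown here.

```python
from typing import Dict, List, Tuple

# Defaults quoted in the tutorial text; both can be changed in the ParaProf form.
MAX_USEC_PER_CALL = 10.0
MIN_CALLS = 100_000


def lightweight_routines(
    stats: Dict[str, Tuple[int, float]],
    max_usec_per_call: float = MAX_USEC_PER_CALL,
    min_calls: int = MIN_CALLS,
) -> List[str]:
    """Return routines called more than min_calls times whose average time
    per call is below max_usec_per_call microseconds.

    stats maps routine name -> (number of calls, total time in microseconds).
    """
    selected = []
    for name, (calls, total_usec) in sorted(stats.items()):
        if calls > 0 and calls > min_calls and total_usec / calls < max_usec_per_call:
            selected.append(name)
    return selected


def exclude_list(routines: List[str]) -> str:
    """Format routine names as a selective-instrumentation exclude list."""
    lines = ["BEGIN_EXCLUDE_LIST", *routines, "END_EXCLUDE_LIST"]
    return "\n".join(lines) + "\n"


# Hypothetical measurements, invented for illustration.
sample = {
    "TINY_HELPER": (2_000_000, 4_000_000.0),  # 2 us per call, called very often
    "BIG_SOLVER": (200, 9_000_000.0),         # few calls, 45 ms each
    "RARE_HELPER": (50, 100.0),               # cheap, but rarely called
}

print(exclude_list(lightweight_routines(sample)), end="")
```

Only a routine that meets both conditions is excluded: a cheap routine that is rarely called adds little overhead, and an expensive routine is worth measuring however often it runs.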

Tell TAU to use this newly created selective instrumentation file:

setenv TAU_OPTIONS -optTauSelectFile=`pwd`/select.tau

Now rebuild the BT program.

cd ..
make clean bt CLASS=S NPROCS=1

This time let's run the program with callpath profiling enabled:

setenv TAU_CALLPATH 1
cd bin
mpirun -np 1 ./bt.S.1

Open ParaProf:

paraprof
Right click on "Node 0"
Select "Show Thread Statistics Table"
Click on the "MPBT" routine

Here we can see that the MPBT routine calls INITIALIZE twice for a total of a few hundred milliseconds.

Comparing Trials

ParaProf is also very useful for comparing different runs of the same application. So far we have been compiling the NPB applications with a high level of optimization (-O3). Can we quantify the performance benefit of doing so?

Close ParaProf
cd ..
vi config/make.def
Comment out line 50, "FFLAGS = -O3"
Write the file and close vi
make clean bt CLASS=S NPROCS=1

Before running the experiment let's package the performance data we have already gathered:

cd bin
paraprof --pack bt-O3.ppk

mpirun -np 1 ./bt.S.1
paraprof --pack bt.ppk

Now open both profiles in ParaProf:

paraprof *.ppk
Click on 'TAU: ParaProf Manager'
Right click on the 'bt.ppk' trial
Select 'Add Mean to Comparison Window'
Right click on the 'bt-O3.ppk' trial
Select 'Add Mean to Comparison Window'

In the Comparison Window we can see a comparison between these two runs. Each bar shows the exclusive time for each routine. Some routines show little variation (MPI_Init), while others show a large speedup when compiled with -O3 (X_BACKSUBSTITUTE).
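
To put a number on what the Comparison Window shows, the per-routine speedup can be computed as the unoptimized time divided by the -O3 time. The Python sketch below assumes the exclusive times have already been read into two dictionaries. The function name, routine names, and timings are hypothetical and are not results from the BT runs.

```python
from typing import Dict


def speedups(baseline_us: Dict[str, float], optimized_us: Dict[str, float]) -> Dict[str, float]:
    """Per-routine speedup (baseline time / optimized time) for routines in both trials."""
    result = {}
    for name, base in baseline_us.items():
        opt = optimized_us.get(name)
        if opt:  # skip routines that are missing from the optimized trial or have zero time
            result[name] = base / opt
    return result


# Hypothetical exclusive times in microseconds, invented for illustration.
no_opt = {"ROUTINE_A": 1000.0, "ROUTINE_B": 8000.0}
with_o3 = {"ROUTINE_A": 1000.0, "ROUTINE_B": 2000.0}

# List the routines that benefit most from optimization first.
for name, factor in sorted(speedups(no_opt, with_o3).items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {factor:.2f}x")
```

A factor near 1 corresponds to a routine whose bars are about the same length in both trials, while a larger factor corresponds to a routine whose bar shrinks noticeably under -O3.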