Cray

From Tau Wiki

Latest revision as of 18:21, 25 October 2012

Cray XK6/XK7

GPU performance tracking

1. Configuring TAU:

module load cudatoolkit
./configure -arch=craycnl -cuda="$CRAY_CUDATOOLKIT_DIR" -cudalibrary="$CRAY_CUDATOOLKIT_POST_LINK_OPTS" -bfd=download

Set up your environment:

export PATH=<path to tau2>/craycnl/bin:$PATH
export LD_LIBRARY_PATH=<path to tau2>/craycnl/lib:$LD_LIBRARY_PATH
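
The two exports above can be wrapped in a small helper, which also makes it easy to confirm that the TAU tools are visible before submitting a job. This is a minimal sketch: the function names tau_env and tau_check are hypothetical, and the argument stands in for <path to tau2>.

```shell
# Hypothetical helper: $1 stands in for "<path to tau2>".
tau_env() {
    tau_root="$1"
    export PATH="$tau_root/craycnl/bin:$PATH"
    # Avoid a trailing colon when LD_LIBRARY_PATH was previously empty.
    export LD_LIBRARY_PATH="$tau_root/craycnl/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Hypothetical helper: report whether each TAU tool used on this page is on PATH.
tau_check() {
    for tau_tool in tau_exec pprof paraprof; do
        if command -v "$tau_tool" >/dev/null 2>&1; then
            echo "$tau_tool: found"
        else
            echo "$tau_tool: missing"
        fi
    done
}
```

Call tau_env with your TAU installation directory, then run tau_check; any tool reported as missing suggests the path or the configure step needs another look.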

2. CUDA

Build as you normally would, and modify your run command to be:

aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult

3. OpenACC

Remember to load the Cray accelerator module:

module load craype-accel-nvidia20

Both the PGI and Cray compilers use the CUDA driver API to interact with the GPU, so set up TAU to collect those calls:

export TAU_CUPTI_API=driver

Compile as you normally would and run with tau_exec:

aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult
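
Because the OpenACC steps depend on TAU_CUPTI_API being set in the shell that launches the job, a job script can guard against a forgotten export. A minimal sketch, assuming a hypothetical function name require_driver_api:

```shell
# Hypothetical guard: fail early unless TAU_CUPTI_API=driver is set,
# as the OpenACC instructions on this page require.
require_driver_api() {
    if [ "${TAU_CUPTI_API:-}" != "driver" ]; then
        echo "TAU_CUPTI_API is '${TAU_CUPTI_API:-unset}', expected 'driver'" >&2
        return 1
    fi
}
```

In a job script this could be chained in front of the launch line, for example: require_driver_api && aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult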

4. Viewing profiles

TAU profiles are written to disk as profile.* (you may have several files). You can view TAU profiles either through pprof (text-based) or paraprof (GUI).
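
Before opening a viewer it can help to confirm that the run actually produced profile files. A minimal sketch, assuming a hypothetical helper name count_profiles and that the profiles sit directly in the given directory:

```shell
# Hypothetical helper: count the profile.* files in a directory
# (defaults to the current directory).
count_profiles() {
    profile_dir="${1:-.}"
    find "$profile_dir" -maxdepth 1 -type f -name 'profile.*' | wc -l | tr -d '[:space:]'
}
```

If count_profiles prints 0 in the directory where the job ran, check that the application was launched through tau_exec before starting pprof or paraprof.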

5. Tracing

Traces can be captured by setting:

export TAU_TRACE=1

before running your application. The traces also need to be post-processed; issue these commands:

aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult
tau_multimerge
tau2slog2 tau.trc tau.edf -o matmult.slog2
jumpshot matmult.slog2

Jumpshot is a common trace visualizer bundled with TAU.
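
The post-processing commands above can be chained so that a failure in one step stops the rest. A minimal sketch: the names tau_step and postprocess_traces and the DRY_RUN variable are hypothetical, and the output file is named after the application as in the matmult example.

```shell
# Hypothetical wrapper: run a command, or just print it when DRY_RUN=1.
tau_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: merge the traces, convert them to SLOG2, and open Jumpshot.
# $1 is the application name, used only to name the .slog2 file.
postprocess_traces() {
    trace_app="$1"
    tau_step tau_multimerge &&
    tau_step tau2slog2 tau.trc tau.edf -o "$trace_app.slog2" &&
    tau_step jumpshot "$trace_app.slog2"
}
```

Running it with DRY_RUN=1 first prints the three commands without executing them, which is a cheap way to check the file names before a long merge.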