Difference between revisions of "Keeneland"

From Tau Wiki

Revision as of 05:50, 28 January 2011

Guide for using TAU on Keeneland

Slides about TAU

TAU overview slides


Setting up environment

A module for TAU will be available shortly. For now, set up your environment this way:

   %> export PATH=/nics/c/home/biersdor/tau2/x86_64/bin/:$PATH
   %> export LD_LIBRARY_PATH=/nics/c/home/biersdor/tau2/x86_64/lib/:$LD_LIBRARY_PATH
   %> export TAU_MAKEFILE=/nics/c/home/biersdor/tau2/x86_64/lib/Makefile.tau-pdt
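
Before building, it can help to confirm that the settings above took effect. The following is a minimal sketch under stated assumptions: the function name check_tau_env is hypothetical and not part of TAU, and it only checks that the TAU wrapper scripts resolve on PATH and that TAU_MAKEFILE points at a readable file.

```shell
# check_tau_env: hypothetical helper, not shipped with TAU.
# Returns 0 when the TAU tools are on PATH and TAU_MAKEFILE is readable.
check_tau_env() {
    rc=0
    for tool in tau_exec tau_cc.sh tau_cxx.sh; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing on PATH: $tool" >&2
            rc=1
        fi
    done
    if [ -z "${TAU_MAKEFILE:-}" ] || [ ! -r "$TAU_MAKEFILE" ]; then
        echo "TAU_MAKEFILE is unset or not readable: ${TAU_MAKEFILE:-<unset>}" >&2
        rc=1
    fi
    return "$rc"
}
```

After sourcing the function, run check_tau_env; a non-zero exit status means at least one of the settings is missing, and the messages on standard error say which.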

Compiling SHOC 1.0.1 with TAU

After configuring SHOC, edit config/common.mk so that it reads:


   # === Basics ===
   CC       = tau_cc.sh
   CXX      = tau_cxx.sh
   LD       = tau_cxx.sh
   AR       = /usr/bin/ar
   RANLIB   = ranlib
  
   CPPFLAGS += -I$(SHOC_ROOT)/src/common -I${SHOC_ROOT}/config
   CFLAGS   += -m64 -g -O2
   CXXFLAGS += -m64 -g -O2
   ARFLAGS  = rcv
   LDFLAGS  =
   LIBS     = -L$(SHOC_ROOT)/lib  -lrt -L/sw/keeneland/cuda/3.2RC/lib64/ -lcudart
  
   USE_MPI         = no
  
   OCL_CPPFLAGS    += -I${SHOC_ROOT}/src/opencl/common
   OCL_LIBS        =
  
   NVCC            = /sw/keeneland/cuda/3.2/bin/nvcc
   CUDA_CXX        =  tau_cxx.sh
   CUDA_INC        = -I/sw/keeneland/cuda/3.2/include
   CUDA_CPPFLAGS   += -gencode=arch=compute_10,code=sm_10 \
   -gencode=arch=compute_11,code=sm_11  -gencode=arch=compute_13,code=sm_13 \
   -gencode=arch=compute_20,code=sm_20  -gencode=arch=compute_20,code=compute_20 \
   -I${SHOC_ROOT}/src/cuda/include $(TAU_LIBS)
  

Then make/install as you normally would.

More info at: TAU's user guide (http://www.cs.uoregon.edu/research/tau/docs/newguide/bk01ch01s02.html)


Running CUDA applications

Both CUDA and OpenCL are instrumented dynamically through library preloading. Use the tau_exec script to run the CUDA application:

   %> tau_exec -T serial -cuda ./Stencil2D

The -T serial option specifies which TAU configuration to use. For MPI applications, change it to -T mpi and run:

   %> mpirun -np 4 tau_exec -T mpi -cuda ./SGEMM

This works with executables built with or without TAU.

Traces

To record traces, set TAU_TRACE=1 before the run, then merge and convert the trace files and open the result in Jumpshot:

   %> export TAU_TRACE=1
   %> tau_exec -T serial -cuda ./Stencil2D
   %> tau_multimerge
   %> tau2slog2 tau.trc tau.edf -o stencil2d.slog2
   %> jumpshot
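
The trace steps above can be collected into one shell function so that a failed step stops the sequence. This is a sketch only: the name trace_and_convert is hypothetical, it hard-codes the -T serial -cuda options from the example, and it assumes the merged event file is named tau.edf, which is what tau_multimerge writes next to tau.trc.

```shell
# trace_and_convert: hypothetical wrapper around the trace commands.
# $1 = application binary, $2 = name of the .slog2 file to write.
trace_and_convert() {
    app=$1
    out=$2
    # Record trace files for this run only (TAU_TRACE is not left exported).
    TAU_TRACE=1 tau_exec -T serial -cuda "$app" || return 1
    # Merge the per-thread traces into tau.trc and tau.edf.
    tau_multimerge || return 1
    # Convert the merged trace to the SLOG2 format that Jumpshot reads.
    tau2slog2 tau.trc tau.edf -o "$out" || return 1
    echo "wrote $out"
}
```

For example, trace_and_convert ./Stencil2D stencil2d.slog2 runs the same three steps as above, after which the file can be opened in jumpshot.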

Trouble-shooting

  • CPU side looks fine but no GPU profile/trace generated.

This is likely because there is no cudaThreadExit() call at the end of the application. Placing one there signals to TAU that the application's CUDA-accelerated section is finished, so it can go ahead and write out the profile/trace.

Fix: Place cudaThreadExit() at the end of the application.

  • Receiving "Error calculating kernel event [start|stop], error #: 33." during execution.

This means that CUDA could not retrieve the event object at synchronization. Try placing the synchronization call right after the kernel launch. In some cases no arrangement of kernel launches and synchronization points will help; that one kernel will not be tracked, but any other kernels in the application should still be tracked correctly.

Fix: Try placing a synchronization call right after the kernel launch.
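
For the first item above, a quick text search can show which source files define main() without ever mentioning cudaThreadExit. This is a rough sketch, not a TAU feature: the function name find_missing_thread_exit is hypothetical, and because it is only a text match it will miss a call that lives in another file or behind a macro.

```shell
# find_missing_thread_exit: hypothetical helper, not part of TAU or CUDA.
# Prints each .cu, .cpp, or .c file under directory $1 that contains
# "main(" but no mention of cudaThreadExit.
find_missing_thread_exit() {
    find "$1" -type f \( -name '*.cu' -o -name '*.cpp' -o -name '*.c' \) |
    while IFS= read -r f; do
        if grep -q 'main[[:space:]]*(' "$f" && ! grep -q 'cudaThreadExit' "$f"; then
            echo "$f"
        fi
    done
}
```

Running find_missing_thread_exit on the application's source directory lists candidate files to inspect by hand; an empty result does not prove the call is reached at run time.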


Running OpenCL applications

Use tau_exec as well:

   %> tau_exec -T serial -opencl ./SGEMM 

CUPTI and PAPI

Coming soon...