Keeneland
Guide for using TAU on Keeneland
Slide about TAU
Setting up environment
Set up your environment this way:
module load tau
export TAU_MAKEFILE=$TAUROOT/lib/Makefile.tau-cupti-pdt
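Before building, it can help to confirm that the makefile path actually resolves. The following is a minimal sketch: check_tau_makefile is a hypothetical helper, not part of TAU, and it assumes only that TAU_MAKEFILE has been exported as shown above.

```shell
# Hypothetical helper (not part of TAU): verify that TAU_MAKEFILE
# is set and points to a readable file before invoking tau_cc.sh.
check_tau_makefile() {
    if [ -z "${TAU_MAKEFILE:-}" ]; then
        echo "TAU_MAKEFILE is not set" >&2
        return 1
    fi
    if [ ! -r "$TAU_MAKEFILE" ]; then
        echo "TAU_MAKEFILE is not a readable file: $TAU_MAKEFILE" >&2
        return 1
    fi
    echo "Using TAU makefile: $TAU_MAKEFILE"
}
```

Calling check_tau_makefile after the export returns a non-zero status if the module did not define TAUROOT or if the cupti-pdt configuration is not installed.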
Compiling SHOC 1.0.1 with TAU
After configuring SHOC, edit config/common.mk to read:
# === Basics ===
CC = tau_cc.sh
CXX = tau_cxx.sh
LD = tau_cxx.sh
AR = /usr/bin/ar
RANLIB = ranlib
CPPFLAGS += -I$(SHOC_ROOT)/src/common -I${SHOC_ROOT}/config
CFLAGS += -m64 -g -O2
CXXFLAGS += -m64 -g -O2
ARFLAGS = rcv
LDFLAGS =
LIBS = -L$(SHOC_ROOT)/lib -lrt -L/sw/keeneland/cuda/3.2RC/lib64/ -lcudart
USE_MPI = no
OCL_CPPFLAGS += -I${SHOC_ROOT}/src/opencl/common
OCL_LIBS =
NVCC = /sw/keeneland/cuda/3.2/bin/nvcc
CUDA_CXX = tau_cxx.sh
CUDA_INC = -I/sw/keeneland/cuda/3.2/include
CUDA_CPPFLAGS += -gencode=arch=compute_10,code=sm_10 \
-gencode=arch=compute_11,code=sm_11 -gencode=arch=compute_13,code=sm_13 \
-gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_20,code=compute_20 \
-I${SHOC_ROOT}/src/cuda/include $(TAU_LIBS)
Then make/install as you normally would.
More information is available in the TAU user guide.
Building SHOC with VampirTrace
In this case, edit config/common.mk to read:
# === Basics ===
CC = vtcc --vt:cc mpicc
CXX = vtcxx --vt:cxx mpicxx
LD = vtcxx --vt:cxx mpicxx
AR = /usr/bin/ar
RANLIB = ranlib
CPPFLAGS += -I$(SHOC_ROOT)/src/common -I${SHOC_ROOT}/config
CFLAGS += -m64 -g -O2
CXXFLAGS += -m64 -g -O2
ARFLAGS = rcv
LDFLAGS =
LIBS = -L$(SHOC_ROOT)/lib -lrt -L/sw/keeneland/cuda/3.2RC/lib64/ -lcudart
USE_MPI = no
OCL_CPPFLAGS += -I${SHOC_ROOT}/src/opencl/common
OCL_LIBS =
NVCC = vtnvcc
CUDA_CXX = vtnvcc
CUDA_INC = -I/sw/keeneland/cuda/3.2/include
CUDA_CPPFLAGS += -gencode=arch=compute_10,code=sm_10 \
-gencode=arch=compute_11,code=sm_11 -gencode=arch=compute_13,code=sm_13 \
-gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_20,code=compute_20 \
-I${SHOC_ROOT}/src/cuda/include $(TAU_LIBS)
Running CUDA applications
Both CUDA and OpenCL are instrumented dynamically through library preloading. Use the tau_exec script to run the CUDA application:
%> tau_exec -T serial,cupti -cupti ./Stencil2D
The -T serial option specifies which TAU configuration to use. For MPI applications, change it to mpi and launch through mpirun:
%> mpirun -np 4 tau_exec -T mpi,cupti -cupti ./SGEMM
This works with executables built with or without TAU.
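The two invocations above differ only in the -T configuration and the mpirun prefix. As a small sketch of that pattern, the hypothetical helper below (tau_cuda_cmd is not part of TAU) prints the command line for either case, so it can be inspected before being run.

```shell
# Hypothetical helper (not part of TAU): print the tau_exec command line
# for a CUDA application.
#   $1  TAU configuration: "serial" or "mpi"
#   $2  number of MPI ranks (ignored for serial)
#   $3  path to the executable
tau_cuda_cmd() {
    config=$1; ranks=$2; exe=$3
    case "$config" in
        serial) echo "tau_exec -T serial,cupti -cupti $exe" ;;
        mpi)    echo "mpirun -np $ranks tau_exec -T mpi,cupti -cupti $exe" ;;
        *)      echo "unknown TAU configuration: $config" >&2; return 1 ;;
    esac
}
```

For example, tau_cuda_cmd mpi 4 ./SGEMM prints the MPI command shown above, and tau_cuda_cmd serial 1 ./Stencil2D prints the serial one.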
Traces
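To record traces, first set TAU_TRACE=1, then run the application, merge the per-process trace files, and convert the result for viewing in Jumpshot:
%> export TAU_TRACE=1
%> tau_exec -T serial,cupti -cupti ./Stencil2D
%> tau_multimerge
%> tau2slog2 tau.trc tau.edf -o stencil2d.slog2
%> jumpshot stencil2d.slog2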
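Because each step of the trace workflow depends on the previous one, it may be convenient to chain them so that a failure stops the sequence. The sketch below wraps the same commands in a hypothetical function, tau_trace_cuda, which is not part of TAU. It assumes the serial,cupti configuration and leaves launching jumpshot to the user.

```shell
# Hypothetical wrapper (not part of TAU): record a trace of a CUDA
# application and convert it for Jumpshot, stopping at the first failure.
#   $1  path to the executable
#   $2  base name for the output .slog2 file (without extension)
tau_trace_cuda() {
    exe=$1; out=$2
    export TAU_TRACE=1                      # enable tracing for this run
    tau_exec -T serial,cupti -cupti "$exe" || return 1
    tau_multimerge || return 1              # merge into tau.trc / tau.edf
    tau2slog2 tau.trc tau.edf -o "$out.slog2" || return 1
    echo "Trace written to $out.slog2; open it with: jumpshot $out.slog2"
}
```

A call such as tau_trace_cuda ./Stencil2D stencil2d reproduces the sequence above and reports the name of the file to open.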
Running OpenCL applications
Use tau_exec as well:
%> tau_exec -T serial -opencl ./SGEMM
Performance Data
Some example performance data from S3D: