MPAS-Ocean

From Tau Wiki
Revision as of 21:52, 2 November 2012 by Khuck (talk | contribs) (Detailed analysis of 128 process callpath profile)


Overview

This page documents TAU profiling of MPAS-Ocean.

MPAS-Ocean Sourceforge Page

The MPAS-Ocean code has been modified to use TAU timers in place of its internal timers. This enables both MPI performance measurement and the collection of PAPI counters.

The MPAS-Ocean developers have collected profiles on Hopper, with 192 to 16800 processes, using MPI only (no OpenMP yet). In addition, full callpath and communication matrix profiles with 128 processes on Hopper have been collected.

Those profiles are available here: ParaProf, PerfExplorer. The client applications can connect to the performance database only from specific domains and with authenticated access. Please contact the TAU team to request access to the raw data.

Below is a brief analysis of the application performance.

Performance Analysis

Scaling behavior

As noted in the overview, the application was executed with 192 through 16800 processes in a strong scaling study (the total problem size did not change).
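Because the problem size is fixed, each run can be compared against the smallest one in terms of speedup and parallel efficiency. The following is a minimal sketch of that calculation, not part of the MPAS-Ocean or TAU tooling. The function name `strong_scaling` and the timings in `example_times` are hypothetical placeholders, not measured results from Hopper.

```python
def strong_scaling(times_by_procs):
    """Return (procs, speedup, efficiency) tuples relative to the smallest run.

    times_by_procs maps a process count to total wall-clock time in seconds.
    """
    base_procs = min(times_by_procs)
    base_time = times_by_procs[base_procs]
    rows = []
    for procs in sorted(times_by_procs):
        # Speedup relative to the baseline run.
        speedup = base_time / times_by_procs[procs]
        # Ideal speedup for strong scaling is the ratio of process counts.
        ideal = procs / base_procs
        rows.append((procs, speedup, speedup / ideal))
    return rows


# Hypothetical wall-clock times in seconds, NOT measured MPAS-Ocean data.
example_times = {192: 100.0, 384: 55.0, 768: 32.0}

for procs, speedup, efficiency in strong_scaling(example_times):
    print(f"{procs:6d} procs  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency near 1.0 indicates near-ideal strong scaling; values that fall as the process count grows suggest that communication or waiting time is becoming a larger share of the runtime.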

insert scaling figure here

Broken down by timed regions, the scaling behavior is this:

insert scaling figure here

MPI_Wait clearly dominates the runtime and, as the per-trial analysis shows, the time spent in it varies considerably across processes.
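One common way to quantify that variation is to compare the maximum per-process time in a region against the mean. The sketch below assumes the per-process MPI_Wait times have already been exported from the profile as a plain list of seconds. The function name `imbalance` and the sample values are hypothetical and are not taken from the Hopper profiles.

```python
from statistics import mean


def imbalance(per_process_seconds):
    """Summarize how unevenly a timed region is spread across processes.

    The imbalance value is max/mean - 1, so 0.0 means perfectly balanced.
    """
    avg = mean(per_process_seconds)
    peak = max(per_process_seconds)
    return {
        "min": min(per_process_seconds),
        "mean": avg,
        "max": peak,
        "imbalance": peak / avg - 1.0,
    }


# Hypothetical per-process MPI_Wait times in seconds, NOT measured data.
example_wait_times = [1.0, 1.0, 2.0, 4.0]

stats = imbalance(example_wait_times)
print(f"min {stats['min']:.2f}s  mean {stats['mean']:.2f}s  "
      f"max {stats['max']:.2f}s  imbalance {stats['imbalance']:.2f}")
```

A process with a large MPI_Wait time is often one that finished its own work early and is waiting on slower peers, so a high imbalance here may point to uneven computational load rather than to the communication itself.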

Detailed analysis of 128 process callpath profile

Treetable.png

Profile-imbalance.png

Detailed analysis of 128 process flat profile with communication matrix