2.6. Performance Profiling

New in version 6.0.0.

SUNDIALS includes a lightweight performance profiling layer that can be enabled at compile-time. Optionally, this profiling layer can leverage Caliper [17] for more advanced instrumentation and profiling. By default, only SUNDIALS library code is profiled. However, a public profiling API allows user code regions to be timed with the SUNDIALS profiler as well (see §2.6.2).

2.6.1. Enabling Profiling

To enable profiling, SUNDIALS must be built with the CMake option SUNDIALS_BUILD_WITH_PROFILING set to ON. To use Caliper support, the CMake option ENABLE_CALIPER must also be set to ON. More details on configuring SUNDIALS with CMake can be found in §2.1.

When SUNDIALS is built with profiling enabled and without Caliper, the environment variable SUNPROFILER_PRINT controls whether profiler information is printed. Setting SUNPROFILER_PRINT=1 causes the profiling information to be printed to stdout when the SUNDIALS simulation context is freed. Setting SUNPROFILER_PRINT=0 results in no profiling information being printed unless the SUNProfiler_Print() function is called explicitly. By default, SUNPROFILER_PRINT is assumed to be 0. SUNPROFILER_PRINT can also be set to a file path, in which case the output is written to that file.
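The three documented settings of SUNPROFILER_PRINT can be summarized with a small sketch. This is not the SUNDIALS implementation; the names DemoPrintTarget, demo_print_target and demo_print_target_from_env are hypothetical and exist only for illustration. Treating an empty value like an unset variable is an assumption, since the behavior in that case is not described here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical classification of a SUNPROFILER_PRINT value.
   It follows only the behavior documented above and is not SUNDIALS code. */
typedef enum
{
  DEMO_PRINT_OFF,    /* nothing printed unless SUNProfiler_Print() is called */
  DEMO_PRINT_STDOUT, /* summary printed to stdout when the context is freed */
  DEMO_PRINT_FILE    /* summary written to the file named by the value */
} DemoPrintTarget;

DemoPrintTarget demo_print_target(const char* value)
{
  /* Unset is documented as equivalent to 0; empty is treated the same way
     here as an assumption. */
  if (value == NULL || value[0] == '\0' || strcmp(value, "0") == 0)
  {
    return DEMO_PRINT_OFF;
  }
  if (strcmp(value, "1") == 0) { return DEMO_PRINT_STDOUT; }
  return DEMO_PRINT_FILE; /* any other value is read as a file path */
}

/* Convenience wrapper that reads the variable from the environment. */
DemoPrintTarget demo_print_target_from_env(void)
{
  return demo_print_target(getenv("SUNPROFILER_PRINT"));
}
```

For example, launching a program with SUNPROFILER_PRINT=profile.txt in its environment would fall into the DEMO_PRINT_FILE case of this sketch.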

If Caliper is enabled, then users should refer to the Caliper documentation for information on getting profiler output. In most cases, this involves setting the CALI_CONFIG environment variable.

Note

The SUNDIALS profiler requires POSIX timers or the Windows profileapi.h timers.

Warning

While the SUNDIALS profiling scheme is relatively lightweight, enabling profiling can still negatively impact performance. As such, it is recommended that profiling be enabled only when timing data is needed.

2.6.2. Profiler API

The primary way of interacting with the SUNDIALS profiler is through the following macros:

SUNDIALS_MARK_FUNCTION_BEGIN(profobj)
SUNDIALS_MARK_FUNCTION_END(profobj)
SUNDIALS_WRAP_STATEMENT(profobj, name, stmt)
SUNDIALS_MARK_BEGIN(profobj, name)
SUNDIALS_MARK_END(profobj, name)

Additionally, in C++ applications, the following macro is available:

SUNDIALS_CXX_MARK_FUNCTION(profobj)

These macros can be used to time specific functions or code regions. When using the *_BEGIN macros, it is important that a matching *_END macro is placed at all exit points for the scope/function. The SUNDIALS_CXX_MARK_FUNCTION macro only needs to be placed at the beginning of a function, and leverages RAII to implicitly end the region.

The profobj argument to the macros should be a SUNProfiler object, i.e., an instance of the type

typedef struct SUNProfiler_ *SUNProfiler

When SUNDIALS is built with profiling, a default profiling object is stored in the SUNContext object and can be accessed with a call to SUNContext_GetProfiler().

The name argument should be a unique string indicating the name of the region/function. It is important that the name given to the *_BEGIN macros matches the name given to the *_END macros.
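The rule that every exit point needs a matching end marker with the same name can be illustrated with a stand-in. The sketch below does not use SUNDIALS at all: DemoProfiler, DEMO_MARK_BEGIN, DEMO_MARK_END and sum_values are hypothetical names, and the stand-in only tracks which regions are open rather than timing anything. In real code the same placement would apply to SUNDIALS_MARK_BEGIN and SUNDIALS_MARK_END with a SUNProfiler object.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_DEPTH 8

/* Hypothetical stand-in for a profiler: it records only open region names. */
typedef struct
{
  const char* open[DEMO_MAX_DEPTH]; /* names of regions currently open */
  int depth;                        /* number of open regions */
  int mismatches;                   /* END calls that did not match a BEGIN */
} DemoProfiler;

void demo_begin(DemoProfiler* p, const char* name)
{
  if (p->depth < DEMO_MAX_DEPTH) { p->open[p->depth++] = name; }
}

void demo_end(DemoProfiler* p, const char* name)
{
  /* The name passed to END must match the most recently opened region. */
  if (p->depth == 0 || strcmp(p->open[p->depth - 1], name) != 0)
  {
    p->mismatches++;
    return;
  }
  p->depth--;
}

#define DEMO_MARK_BEGIN(p, name) demo_begin((p), (name))
#define DEMO_MARK_END(p, name)   demo_end((p), (name))

/* Sums n values; returns 0 on success and -1 on invalid input. */
int sum_values(DemoProfiler* prof, const double* x, int n, double* sum)
{
  DEMO_MARK_BEGIN(prof, "sum_values");
  if (x == NULL || n <= 0)
  {
    DEMO_MARK_END(prof, "sum_values"); /* early exit still closes the region */
    return -1;
  }
  *sum = 0.0;
  for (int i = 0; i < n; i++) { *sum += x[i]; }
  DEMO_MARK_END(prof, "sum_values"); /* normal exit */
  return 0;
}
```

If the end marker on the early-return path were omitted, the region would remain open in this stand-in after an invalid call, which is the situation the matching rule is meant to prevent.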

In addition to the macros, the following methods of the SUNProfiler class are available.

int SUNProfiler_Create(SUNComm comm, const char *title, SUNProfiler *p)

Creates a new SUNProfiler object.

Arguments:
  • comm – the MPI communicator to use, if MPI is enabled, otherwise can be SUN_COMM_NULL.

  • title – a title or description of the profiler

  • p – [in,out] On input, this is a pointer to a SUNProfiler; on output, it points to a new SUNProfiler instance

Returns:
  • Returns zero if successful, or non-zero if an error occurred

int SUNProfiler_Free(SUNProfiler *p)

Frees a SUNProfiler object.

Arguments:
  • p – [in,out] On input, this is a pointer to a SUNProfiler; on output, it is NULL

Returns:
  • Returns zero if successful, or non-zero if an error occurred

int SUNProfiler_Begin(SUNProfiler p, const char *name)

Starts timing the region indicated by the name.

Arguments:
  • p – a SUNProfiler object

  • name – a name for the profiling region

Returns:
  • Returns zero if successful, or non-zero if an error occurred

int SUNProfiler_End(SUNProfiler p, const char *name)

Ends the timing of a region indicated by the name.

Arguments:
  • p – a SUNProfiler object

  • name – a name for the profiling region

Returns:
  • Returns zero if successful, or non-zero if an error occurred

int SUNProfiler_GetElapsedTime(SUNProfiler p, const char *name, double *time)

Gets the elapsed time, in seconds, for the timer “name”.

Arguments:
  • p – a SUNProfiler object

  • name – the name for the profiling region of interest

  • time – upon return, the elapsed time for the timer

Returns:
  • Returns zero if successful, or non-zero if an error occurred

int SUNProfiler_GetTimerResolution(SUNProfiler p, double *resolution)

Gets the timer resolution in seconds.

Arguments:
  • p – a SUNProfiler object

  • resolution – upon return, the resolution for the timer

Returns:
  • Returns zero if successful, or non-zero if an error occurred
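To make the notions of elapsed time and timer resolution in seconds concrete, here is a minimal sketch built directly on POSIX timers. It assumes a POSIX system and uses CLOCK_MONOTONIC; which clock SUNDIALS uses internally is not stated here, so that choice is an assumption. The helper names demo_seconds, demo_timer_resolution and demo_now are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Convert a timespec to seconds as a double. */
double demo_seconds(const struct timespec* ts)
{
  return (double)ts->tv_sec + 1.0e-9 * (double)ts->tv_nsec;
}

/* Resolution of the monotonic clock in seconds, or -1.0 on failure. */
double demo_timer_resolution(void)
{
  struct timespec res;
  if (clock_getres(CLOCK_MONOTONIC, &res) != 0) { return -1.0; }
  return demo_seconds(&res);
}

/* Current monotonic clock reading in seconds, or -1.0 on failure. */
double demo_now(void)
{
  struct timespec now;
  if (clock_gettime(CLOCK_MONOTONIC, &now) != 0) { return -1.0; }
  return demo_seconds(&now);
}
```

An elapsed time is then the difference of two demo_now() readings taken around the region of interest. Differences close to the value returned by demo_timer_resolution() are too small to be meaningful, which is one reason to query the resolution before interpreting very short timings.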

int SUNProfiler_Print(SUNProfiler p, FILE *fp)

Prints out a profiling summary. When the profiler is constructed with an MPI communicator, the summary includes the average and maximum time per rank (in seconds) spent in each marked-up region.

Arguments:
  • p – a SUNProfiler object

  • fp – the file handle to print to

Returns:
  • Returns zero if successful, or non-zero if an error occurred

int SUNProfiler_Reset(SUNProfiler p)

Resets the region timings and counters to zero.

Arguments:
  • p – a SUNProfiler object

Returns:
  • Returns zero if successful, or non-zero if an error occurred

2.6.3. Example Usage

The following is an excerpt from the CVODE example code examples/cvode/serial/cvAdvDiff_bnd.c. It is applicable to any of the SUNDIALS solver packages.

SUNContext ctx;
SUNProfiler profobj;

/* Create the SUNDIALS context */
retval = SUNContext_Create(SUN_COMM_NULL, &ctx);

/* Get a reference to the profiler */
retval = SUNContext_GetProfiler(ctx, &profobj);

/* ... */

SUNDIALS_MARK_BEGIN(profobj, "Integration loop");
umax = N_VMaxNorm(u);
PrintHeader(reltol, abstol, umax);
for(iout=1, tout=T1; iout <= NOUT; iout++, tout += DTOUT) {
   retval = CVode(cvode_mem, tout, u, &t, CV_NORMAL);
   umax = N_VMaxNorm(u);
   retval = CVodeGetNumSteps(cvode_mem, &nst);
   PrintOutput(t, umax, nst);
}
SUNDIALS_MARK_END(profobj, "Integration loop");
PrintFinalStats(cvode_mem);  /* Print some final statistics   */

2.6.4. Other Considerations

If many regions are being timed, it may be necessary to increase the maximum number of profiler entries (the default is 2560). This can be done by setting the environment variable SUNPROFILER_MAX_ENTRIES.