2.4.6. Preconditioner modules
The efficiency of Krylov iterative methods for the solution of linear systems can be greatly enhanced through preconditioning. For problems in which the user cannot define a more effective, problemspecific preconditioner, ARKODE provides two internal preconditioner modules: a banded preconditioner for serial and threaded problems (ARKBANDPRE) and a bandblockdiagonal preconditioner for parallel problems (ARKBBDPRE).
2.4.6.1. A serial banded preconditioner module
This preconditioner provides a band matrix preconditioner for use with iterative SUNLINSOL modules in a serial or threaded setting. It requires that the problem be set up using either the NVECTOR_SERIAL, NVECTOR_OPENMP or NVECTOR_PTHREADS module, due to data access patterns. It also currently requires that the problem involve an identity mass matrix, i.e., \(M = I\).
This module uses difference quotients of the ODE righthand
side function \(f^I\) to generate a band matrix of bandwidth
ml + mu + 1
, where the number of superdiagonals (mu
, the
upper halfbandwidth) and subdiagonals (ml
, the lower
halfbandwidth) are specified by the user. This band matrix is used
to to form a preconditioner the Krylov linear solver. Although this
matrix is intended to approximate the Jacobian
\(J = \dfrac{\partial f^I}{\partial y}\), it may be a very crude
approximation, since the true Jacobian may not be banded, or its true
bandwidth may be larger than ml + mu + 1
. However, as long as the
banded approximation generated for the preconditioner is sufficiently
accurate, it may speed convergence of the Krylov iteration.
2.4.6.1.1. ARKBANDPRE usage
In order to use the ARKBANDPRE module, the user need not define
any additional functions. In addition to the header files required
for the integration of the ODE problem (see
§2.4.1), to use the ARKBANDPRE module, the user’s
program must include the header file arkode_bandpre.h
which
declares the needed function prototypes. The following is a summary
of the usage of this module. Steps that are unchanged from the
skeleton program presented in §2.4.2 are
italicized.
Initialize multithreaded environment (if appropriate)
Create the SUNDIALS simulation context object.
Set problem dimensions
Set vector of initial values
Create ARKODE object
Specify integration tolerances
Create iterative linear solver object
When creating the iterative linear solver object, specify the type of preconditioning (
SUN_PREC_LEFT
orSUN_PREC_RIGHT
) to use.Set linear solver optional inputs
Attach linear solver module
Initialize the ARKBANDPRE preconditioner module
Specify the upper and lower halfbandwidths (
mu
andml
, respectively) and callier = ARKBandPrecInit(arkode_mem, N, mu, ml);
to allocate memory and initialize the internal preconditioner data.
Create nonlinear solver object
Attach nonlinear solver module
Set nonlinear solver optional inputs
Set optional inputs
Note that the user should not call
ARKodeSetPreconditioner()
as it will overwrite the preconditioner setup and solve functions.Specify rootfinding problem
Advance solution in time
Get optional outputs
Additional optional outputs associated with ARKBANDPRE are available by way of the two routines described below,
ARKBandPrecGetWorkSpace()
andARKBandPrecGetNumRhsEvals()
.Deallocate memory for solution vector
Free solver memory
Free linear solver memory
Free nonlinear solver memory
2.4.6.1.2. ARKBANDPRE usercallable functions
The ARKBANDPRE preconditioner module is initialized and attached by calling the following function:

int ARKBandPrecInit(void *arkode_mem, sunindextype N, sunindextype mu, sunindextype ml)
Initializes the ARKBANDPRE preconditioner and allocates required (internal) memory for it.
 Parameters:
arkode_mem – pointer to the ARKODE memory block.
N – problem dimension (size of ODE system).
mu – upper halfbandwidth of the Jacobian approximation.
ml – lower halfbandwidth of the Jacobian approximation.
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_ILL_INPUT – an input had an illegal value.
ARKLS_MEM_FAIL – a memory allocation request failed.
Note
The banded approximate Jacobian will have nonzero elements only in locations \((i,j)\) with ml \(\le ji \le\) mu.
The following two optional output functions are available for use with the ARKBANDPRE module:

int ARKBandPrecGetWorkSpace(void *arkode_mem, long int *lenrwLS, long int *leniwLS)
Returns the sizes of the ARKBANDPRE real and integer workspaces.
 Parameters:
arkode_mem – pointer to the ARKODE memory block.
lenrwLS – the number of
sunrealtype
values in the ARKBANDPRE workspace.leniwLS – the number of integer values in the ARKBANDPRE workspace.
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_PMEM_NULL – the preconditioner memory was
NULL
.
Note
The workspace requirements reported by this routine correspond only to memory allocated within the ARKBANDPRE module (the banded matrix approximation, banded
SUNLinearSolver
object, and temporary vectors).The workspaces referred to here exist in addition to those given by the corresponding function
ARKodeGetLinWorkSpace()
.

int ARKBandPrecGetNumRhsEvals(void *arkode_mem, long int *nfevalsBP)
Returns the number of calls made to the usersupplied righthand side function \(f^I\) for constructing the finitedifference banded Jacobian approximation used within the preconditioner setup function.
 Parameters:
arkode_mem – pointer to the ARKODE memory block.
nfevalsBP – number of calls to \(f^I\).
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_PMEM_NULL – the preconditioner memory was
NULL
.
Note
The counter nfevalsBP is distinct from the counter nfevalsLS returned by the corresponding function
ARKodeGetNumLinRhsEvals()
and also from the number of evaluations returned by the timestepping module (e.g., nfi_evals returned byARKStepGetNumRhsEvals()
). The total number of righthand side function evaluations is the sum of all three of these counters.
2.4.6.2. A parallel bandblockdiagonal preconditioner module
A principal reason for using a parallel ODE solver (such as ARKODE) lies in the solution of partial differential equations (PDEs). Moreover, Krylov iterative methods are used on many such problems due to the nature of the underlying linear system of equations that needs to solved at each time step. For many PDEs, the linear algebraic system is large, sparse and structured. However, if a Krylov iterative method is to be effective in this setting, then a nontrivial preconditioner is required. Otherwise, the rate of convergence of the Krylov iterative method is usually slow, and degrades as the PDE mesh is refined. Typically, an effective preconditioner must be problemspecific.
However, we have developed one type of preconditioner that treats a rather broad class of PDEbased problems. It has been successfully used with CVODE for several realistic, largescale problems [71], and is included in a software module within the ARKODE package. This preconditioning module works with the parallel vector module NVECTOR_PARALLEL and is usable with any of the Krylov iterative linear solvers through the ARKLS interface. It generates a preconditioner that is a blockdiagonal matrix with each block being a band matrix. The blocks need not have the same number of super and subdiagonals and these numbers may vary from block to block. This BandBlockDiagonal Preconditioner module is called ARKBBDPRE.
One way to envision these preconditioners is to think of the computational PDE domain as being subdivided into \(Q\) nonoverlapping subdomains, where each subdomain is assigned to one of the \(Q\) MPI tasks used to solve the ODE system. The basic idea is to isolate the preconditioning so that it is local to each process, and also to use a (possibly cheaper) approximate righthand side function for construction of this preconditioning matrix. This requires the definition of a new function \(g(t,y) \approx f^I(t,y)\) that will be used to construct the BBD preconditioner matrix. At present, we assume that the ODE be written in explicit form as
where \(f^I\) corresponds to the ODE components to be treated implicitly, i.e. this preconditioning module does not support problems with nonidentity mass matrices. The user may set \(g = f^I\), if no less expensive approximation is desired.
Corresponding to the domain decomposition, there is a decomposition of the solution vector \(y\) into \(Q\) disjoint blocks \(y_q\), and a decomposition of \(g\) into blocks \(g_q\). The block \(g_q\) depends both on \(y_p\) and on components of blocks \(y_{q'}\) associated with neighboring subdomains (socalled ghostcell data). If we let \(\bar{y}_q\) denote \(y_q\) augmented with those other components on which \(g_q\) depends, then we have
and each of the blocks \(g_q(t,\bar{y}_q)\) is decoupled from one another.
The preconditioner associated with this decomposition has the form
where
and where \(J_q\) is a difference quotient approximation to \(\dfrac{\partial g_q}{\partial \bar{y}_q}\). This matrix is taken to be banded, with upper and lower halfbandwidths mudq and mldq defined as the number of nonzero diagonals above and below the main diagonal, respectively. The difference quotient approximation is computed using mudq + mldq + 2 evaluations of \(g_m\), but only a matrix of bandwidth mukeep + mlkeep + 1 is retained. Neither pair of parameters need be the true halfbandwidths of the Jacobian of the local block of \(g\), if smaller values provide a more efficient preconditioner. The solution of the complete linear system
reduces to solving each of the distinct equations
and this is done by banded LU factorization of \(P_q\) followed by a banded backsolve.
Similar blockdiagonal preconditioners could be considered with different treatments of the blocks \(P_q\). For example, incomplete LU factorization or an iterative method could be used instead of banded LU factorization.
2.4.6.2.1. ARKBBDPRE usersupplied functions
The ARKBBDPRE module calls two userprovided functions to construct
\(P\): a required function gloc (of type ARKLocalFn()
)
which approximates the righthand side function \(g(t,y) \approx
f^I(t,y)\) and which is computed locally, and an optional function
cfn (of type ARKCommFn()
) which performs all interprocess
communication necessary to evaluate the approximate righthand side
\(g\). These are in addition to the usersupplied righthand side
function \(f^I\). Both functions take as input the same pointer
user_data that is passed by the user to
ARKodeSetUserData()
and that was passed to the user’s
function \(f^I\). The user is responsible for providing space
(presumably within user_data) for components of \(y\) that are
communicated between processes by cfn, and that are then used by
gloc, which should not do any communication.

typedef int (*ARKLocalFn)(sunindextype Nlocal, sunrealtype t, N_Vector y, N_Vector glocal, void *user_data)
This gloc function computes \(g(t,y)\). It fills the vector glocal as a function of t and y.
 Param Nlocal:
the local vector length.
 Param t:
the value of the independent variable.
 Param y:
the value of the dependent variable vector on this process.
 Param glocal:
the output vector of \(g(t,y)\) on this process.
 Param user_data:
a pointer to user data, the same as the user_data parameter passed to
ARKodeSetUserData()
. Return:
An ARKLocalFn should return 0 if successful, a positive value if a recoverable error occurred (in which case ARKODE will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and
ARKodeEvolve()
will return ARK_LSETUP_FAIL).
Note
This function should assume that all interprocess communication of data needed to calculate glocal has already been done, and that this data is accessible within user data.
The case where \(g\) is mathematically identical to \(f^I\) is allowed.

typedef int (*ARKCommFn)(sunindextype Nlocal, sunrealtype t, N_Vector y, void *user_data)
This cfn function performs all interprocess communication necessary for the execution of the gloc function above, using the input vector y.
 Param Nlocal:
the local vector length.
 Param t:
the value of the independent variable.
 Param y:
the value of the dependent variable vector on this process.
 Param user_data:
a pointer to user data, the same as the user_data parameter passed to
ARKodeSetUserData()
. Return:
An ARKCommFn should return 0 if successful, a positive value if a recoverable error occurred (in which case ARKODE will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and
ARKodeEvolve()
will return ARK_LSETUP_FAIL).
Note
The cfn function is expected to save communicated data in space defined within the data structure user_data.
Each call to the cfn function is preceded by a call to the righthand side function \(f^I\) with the same \((t,y)\) arguments. Thus, cfn can omit any communication done by \(f^I\) if relevant to the evaluation of glocal. If all necessary communication was done in \(f^I\), then cfn =
NULL
can be passed in the call toARKBBDPrecInit()
(see below).
2.4.6.2.2. ARKBBDPRE usage
In addition to the header files required for the integration of the
ODE problem (see §2.4.1), to use the
ARKBBDPRE module, the user’s program must include the header file
arkode_bbdpre.h
which declares the needed function prototypes.
The following is a summary of the proper usage of this module. Steps that are unchanged from the skeleton program presented in §2.4.2 are italicized.
Initialize MPI
Create the SUNDIALS simulation context object
Set problem dimensions
Set vector of initial values
Create ARKODE object
Specify integration tolerances
Create iterative linear solver object
When creating the iterative linear solver object, specify the type of preconditioning (
SUN_PREC_LEFT
orSUN_PREC_RIGHT
) to use.Set linear solver optional inputs
Attach linear solver module
Initialize the ARKBBDPRE preconditioner module
Specify the upper and lower halfbandwidths for computation
mudq
andmldq
, the upper and lower halfbandwidths for storagemukeep
andmlkeep
, and callier = ARKBBDPrecInit(arkode_mem, Nlocal, mudq, mldq, mukeep, mlkeep, dqrely, gloc, cfn);
to allocate memory and initialize the internal preconditioner data. The last two arguments of
ARKBBDPrecInit()
are the two usersupplied functions of typeARKLocalFn()
andARKCommFn()
described above, respectively.Create nonlinear solver object
Attach nonlinear solver module
Set nonlinear solver optional inputs
Set optional inputs
Note that the user should not call
ARKodeSetPreconditioner()
as it will overwrite the preconditioner setup and solve functions.Specify rootfinding problem
Advance solution in time
Get optional outputs
Additional optional outputs associated with ARKBBDPRE are available through the routines
ARKBBDPrecGetWorkSpace()
andARKBBDPrecGetNumGfnEvals()
.Deallocate memory for solution vector
Free solver memory
Free linear solver memory
Free nonlinear solver memory
Finalize MPI
2.4.6.2.3. ARKBBDPRE usercallable functions
The ARKBBDPRE preconditioner module is initialized (or reinitialized) and attached to the integrator by calling the following functions:

int ARKBBDPrecInit(void *arkode_mem, sunindextype Nlocal, sunindextype mudq, sunindextype mldq, sunindextype mukeep, sunindextype mlkeep, sunrealtype dqrely, ARKLocalFn gloc, ARKCommFn cfn)
Initializes and allocates (internal) memory for the ARKBBDPRE preconditioner.
 Parameters:
arkode_mem – pointer to the ARKODE memory block.
Nlocal – local vector length.
mudq – upper halfbandwidth to be used in the difference quotient Jacobian approximation.
mldq – lower halfbandwidth to be used in the difference quotient Jacobian approximation.
mukeep – upper halfbandwidth of the retained banded approximate Jacobian block.
mlkeep – lower halfbandwidth of the retained banded approximate Jacobian block.
dqrely – the relative increment in components of y used in the difference quotient approximations. The default is dqrely = \(\sqrt{\text{unit roundoff}}\), which can be specified by passing dqrely = 0.0.
gloc – the name of the C function (of type
ARKLocalFn()
) which computes the approximation \(g(t,y) \approx f^I(t,y)\).cfn – the name of the C function (of type
ARKCommFn()
) which performs all interprocess communication required for the computation of \(g(t,y)\).
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_ILL_INPUT – an input had an illegal value.
ARKLS_MEM_FAIL – a memory allocation request failed.
Note
If one of the halfbandwidths mudq or mldq to be used in the difference quotient calculation of the approximate Jacobian is negative or exceeds the value Nlocal1, it is replaced by 0 or Nlocal1 accordingly.
The halfbandwidths mudq and mldq need not be the true halfbandwidths of the Jacobian of the local block of \(g\) when smaller values may provide a greater efficiency.
Also, the halfbandwidths mukeep and mlkeep of the retained banded approximate Jacobian block may be even smaller than mudq and mldq, to reduce storage and computational costs further.
For all four halfbandwidths, the values need not be the same on every processor.
The ARKBBDPRE module also provides a reinitialization function to
allow solving a sequence of problems of the same size, with the same
linear solver choice, provided there is no change in Nlocal,
mukeep, or mlkeep. After solving one problem, and after
calling *StepReInit
to reinitialize ARKODE for a
subsequent problem, a call to ARKBBDPrecReInit()
can be made
to change any of the following: the halfbandwidths mudq and
mldq used in the differencequotient Jacobian approximations, the
relative increment dqrely, or one of the usersupplied functions
gloc and cfn. If there is a change in any of the linear solver
inputs, an additional call to the “Set” routines provided by the
SUNLINSOL module, and/or one or more of the corresponding
ARKodeSet***
functions, must also be made (in the proper order).

int ARKBBDPrecReInit(void *arkode_mem, sunindextype mudq, sunindextype mldq, sunrealtype dqrely)
Reinitializes the ARKBBDPRE preconditioner module.
 Parameters:
arkode_mem – pointer to the ARKODE memory block.
mudq – upper halfbandwidth to be used in the difference quotient Jacobian approximation.
mldq – lower halfbandwidth to be used in the difference quotient Jacobian approximation.
dqrely – the relative increment in components of y used in the difference quotient approximations. The default is dqrely = \(\sqrt{\text{unit roundoff}}\), which can be specified by passing dqrely = 0.0.
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_PMEM_NULL – the preconditioner memory was
NULL
.
Note
If one of the halfbandwidths mudq or mldq is negative or exceeds the value Nlocal1, it is replaced by 0 or Nlocal1 accordingly.
The following two optional output functions are available for use with the ARKBBDPRE module:

int ARKBBDPrecGetWorkSpace(void *arkode_mem, long int *lenrwBBDP, long int *leniwBBDP)
Returns the processorlocal ARKBBDPRE real and integer workspace sizes.
 Parameters:
arkode_mem – pointer to the ARKODE memory block.
lenrwBBDP – the number of
sunrealtype
values in the ARKBBDPRE workspace.leniwBBDP – the number of integer values in the ARKBBDPRE workspace.
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_PMEM_NULL – the preconditioner memory was
NULL
.
Note
The workspace requirements reported by this routine correspond only to memory allocated within the ARKBBDPRE module (the banded matrix approximation, banded
SUNLinearSolver
object, temporary vectors). These values are local to each process.The workspaces referred to here exist in addition to those given by the corresponding function
ARKodeGetLinWorkSpace()
.

int ARKBBDPrecGetNumGfnEvals(void *arkode_mem, long int *ngevalsBBDP)
Returns the number of calls made to the usersupplied gloc function (of type
ARKLocalFn()
) due to the finite difference approximation of the Jacobian blocks used within the preconditioner setup function. Parameters:
arkode_mem – pointer to the ARKODE memory block.
ngevalsBBDP – the number of calls made to the usersupplied gloc function.
 Return values:
ARKLS_SUCCESS – the function exited successfully.
ARKLS_MEM_NULL –
arkode_mem
wasNULL
.ARKLS_LMEM_NULL – the linear solver memory was
NULL
.ARKLS_PMEM_NULL – the preconditioner memory was
NULL
.
In addition to the ngevalsBBDP gloc evaluations, the costs associated with ARKBBDPRE also include nlinsetups LU factorizations, nlinsetups calls to cfn, npsolves banded backsolve calls, and nfevalsLS righthand side function evaluations, where nlinsetups is an optional ARKODE output and npsolves and nfevalsLS are linear solver optional outputs (see the table §2.4.3.10.4).