# 9.1. Description of the NVECTOR Modules

The SUNDIALS solvers are written in a data-independent manner. They all operate on generic vectors (of type N_Vector) through a set of operations defined by, and specific to, the particular NVECTOR implementation. Users can provide a custom implementation of the NVECTOR module or use one provided within SUNDIALS. The generic operations are described below. In the sections following, the implementations provided with SUNDIALS are described.

The generic N_Vector type is a pointer to a structure that has an implementation-dependent content field containing the description and actual data of the vector, and an ops field pointing to a structure with generic vector operations. The type N_Vector is defined as

typedef struct _generic_N_Vector *N_Vector

and the generic structure is defined as

struct _generic_N_Vector {
void *content;
struct _generic_N_Vector_Ops *ops;
};


Here, the _generic_N_Vector_Op structure is essentially a list of function pointers to the various actual vector operations, and is defined as

struct _generic_N_Vector_Ops {
N_Vector_ID  (*nvgetvectorid)(N_Vector);
N_Vector     (*nvclone)(N_Vector);
N_Vector     (*nvcloneempty)(N_Vector);
void         (*nvdestroy)(N_Vector);
void         (*nvspace)(N_Vector, sunindextype *, sunindextype *);
realtype*    (*nvgetarraypointer)(N_Vector);
realtype*    (*nvgetdevicearraypointer)(N_Vector);
void         (*nvsetarraypointer)(realtype *, N_Vector);
void*        (*nvgetcommunicator)(N_Vector);
sunindextype (*nvgetlength)(N_Vector);
sunindextype (*nvgetlocallength)(N_Vector);
void         (*nvlinearsum)(realtype, N_Vector, realtype, N_Vector, N_Vector);
void         (*nvconst)(realtype, N_Vector);
void         (*nvprod)(N_Vector, N_Vector, N_Vector);
void         (*nvdiv)(N_Vector, N_Vector, N_Vector);
void         (*nvscale)(realtype, N_Vector, N_Vector);
void         (*nvabs)(N_Vector, N_Vector);
void         (*nvinv)(N_Vector, N_Vector);
realtype     (*nvdotprod)(N_Vector, N_Vector);
realtype     (*nvmaxnorm)(N_Vector);
realtype     (*nvwrmsnorm)(N_Vector, N_Vector);
realtype     (*nvmin)(N_Vector);
realtype     (*nvwl2norm)(N_Vector, N_Vector);
realtype     (*nvl1norm)(N_Vector);
void         (*nvcompare)(realtype, N_Vector, N_Vector);
booleantype  (*nvinvtest)(N_Vector, N_Vector);
realtype     (*nvminquotient)(N_Vector, N_Vector);
int          (*nvlinearcombination)(int, realtype *, N_Vector *, N_Vector);
int          (*nvscaleaddmulti)(int, realtype *, N_Vector, N_Vector *, N_Vector *);
int          (*nvdotprodmulti)(int, N_Vector, N_Vector *,  realtype *);
int          (*nvlinearsumvectorarray)(int, realtype, N_Vector *, realtype,
N_Vector *, N_Vector *);
int          (*nvscalevectorarray)(int, realtype *,  N_Vector *, N_Vector *);
int          (*nvconstvectorarray)(int, realtype, N_Vector *);
int          (*nvwrmsnomrvectorarray)(int, N_Vector *, N_Vector *, realtype *);
int          (*nvwrmsnomrmaskvectorarray)(int, N_Vector *, N_Vector *, N_Vector,
realtype *);
int          (*nvscaleaddmultivectorarray)(int, int, realtype *, N_Vector *,
N_Vector **, N_Vector **);
int          (*nvlinearcombinationvectorarray)(int, int, realtype *, N_Vector **,
N_Vector *);
realtype     (*nvdotprodlocal)(N_Vector, N_Vector);
realtype     (*nvmaxnormlocal)(N_Vector);
realtype     (*nvminlocal)(N_Vector);
realtype     (*nvl1normlocal)(N_Vector);
booleantype  (*nvinvtestlocal)(N_Vector, N_Vector);
realtype     (*nvminquotientlocal)(N_Vector, N_Vector);
realtype     (*nvwsqrsumlocal)(N_Vector, N_Vector);
int          (*nvdotprodmultilocal)(int, N_Vector, N_Vector *, realtype *);
int          (*nvdotprodmultiallreduce)(int, N_Vector, realtype *);
int          (*nvbufsize)(N_Vector, sunindextype *);
int          (*nvbufpack)(N_Vector, void*);
int          (*nvbufunpack)(N_Vector, void*);
};


The generic NVECTOR module defines and implements the vector operations acting on a N_Vector. These routines are nothing but wrappers for the vector operations defined by a particular NVECTOR implementation, which are accessed through the ops field of the N_Vector structure. To illustrate this point we show below the implementation of a typical vector operation from the generic NVECTOR module, namely N_VScale, which performs the operation $$z\gets cx$$ for vectors $$x$$ and $$z$$ and a scalar $$c$$:

void N_VScale(realtype c, N_Vector x, N_Vector z) {
z->ops->nvscale(c, x, z);
}


§9.2 contains a complete list of all standard vector operations defined by the generic NVECTOR module. §9.2.2, §9.2.3, §9.2.4, §9.2.5, and §9.2.6 list optional fused, vector array, local reduction, single buffer reduction, and exchange operations, respectively.

Fused and vector array operations (see §9.2.2 and §9.2.3) are intended to increase data reuse, reduce parallel communication on distributed memory systems, and lower the number of kernel launches on systems with accelerators. If a particular NVECTOR implementation defines a fused or vector array operation as NULL, the generic NVECTOR module will automatically call standard vector operations as necessary to complete the desired operation. In all SUNDIALS-provided NVECTOR implementations, all fused and vector array operations are disabled by default. However, these implementations provide additional user-callable functions to enable/disable any or all of the fused and vector array operations. See the following sections for the implementation specific functions to enable/disable operations.

Local reduction operations (see §9.2.4) are similarly intended to reduce parallel communication on distributed memory systems, particularly when NVECTOR objects are combined together within an NVECTOR_MANYVECTOR object (see §9.22). If a particular NVECTOR implementation defines a local reduction operation as NULL, the NVECTOR_MANYVECTOR module will automatically call standard vector reduction operations as necessary to complete the desired operation. All SUNDIALS-provided NVECTOR implementations include these local reduction operations, which may be used as templates for user-defined implementations.

The single buffer reduction operations (§9.2.5) are used in low-synchronization methods to combine separate reductions into one MPI_Allreduce call.

The exchange operations (see §9.2.6) are intended only for use with the XBraid library for parallel-in-time integration (accessible from ARKODE) and are otherwise unused by SUNDIALS packages.

## 9.1.1. NVECTOR Utility Functions

The generic NVECTOR module also defines several utility functions to aid in creation and management of arrays of N_Vector objects – these functions are particularly useful for Fortran users to utilize the NVECTOR_MANYVECTOR or SUNDIALS’ sensitivity-enabled packages CVODES and IDAS.

The functions N_VCloneVectorArray() and N_VCloneVectorArrayEmpty() create (by cloning) an array of count variables of type N_Vector, each of the same type as an existing N_Vector input:

N_Vector *N_VCloneVectorArray(int count, N_Vector w)

Clones an array of count N_Vector objects, allocating their data arrays (similar to N_VClone()).

Arguments:
• count – number of N_Vector objects to create.

• w – template N_Vector to clone.

Return value:
• pointer to a new N_Vector array on success.

• NULL pointer on failure.

N_Vector *N_VCloneVectorArrayEmpty(int count, N_Vector w)

Clones an array of count N_Vector objects, leaving their data arrays unallocated (similar to N_VCloneEmpty()).

Arguments:
• count – number of N_Vector objects to create.

• w – template N_Vector to clone.

Return value:
• pointer to a new N_Vector array on success.

• NULL pointer on failure.

An array of variables of type N_Vector can be destroyed by calling N_VDestroyVectorArray():

void N_VDestroyVectorArray(N_Vector *vs, int count)

Destroys an array of count N_Vector objects.

Arguments:
• vsN_Vector array to destroy.

• count – number of N_Vector objects in vs array.

Notes:

This routine will internally call the N_Vector implementation-specific N_VDestroy() operation.

If vs was allocated using N_VCloneVectorArray() then the data arrays for each N_Vector object will be freed; if vs was allocated using N_VCloneVectorArrayEmpty() then it is the user’s responsibility to free the data for each N_Vector object.

Finally, we note that users of the Fortran 2003 interface may be interested in the additional utility functions N_VNewVectorArray(), N_VGetVecAtIndexVectorArray(), and N_VSetVecAtIndexVectorArray(), that are wrapped as FN_NewVectorArray, FN_VGetVecAtIndexVectorArray, and FN_VSetVecAtIndexVectorArray, respectively. These functions allow a Fortran 2003 user to create an empty vector array, access a vector from this array, and set a vector within this array:

N_Vector *N_VNewVectorArray(int count)

Creates an array of count N_Vector objects, the pointers to each are initialized as NULL.

Arguments:
• count – length of desired N_Vector array.

Return value:
• pointer to a new N_Vector array on success.

• NULL pointer on failure.

N_Vector *N_VGetVecAtIndexVectorArray(N_Vector *vs, int index)

Accesses the N_Vector at the location index within the N_Vector array vs.

Arguments:
• vsN_Vector array.

• index – desired N_Vector to access from within vs.

Return value:
• pointer to the indexed N_Vector on success.

• NULL pointer on failure (index < 0 or vs == NULL).

Notes:

This routine does not verify that index is within the extent of vs, since vs is a simple N_Vector array that does not internally store its allocated length.

void N_VSetVecAtIndexVectorArray(N_Vector *vs, int index, N_Vector w)

Sets a pointer to w at the location index within the vector array vs.

Arguments:
• vsN_Vector array.

• index – desired location to place the pointer to w within vs.

• wN_Vector to set within vs.

Notes:

This routine does not verify that index is within the extent of vs, since vs is a simple N_Vector array that does not internally store its allocated length.

## 9.1.2. Implementing a custom NVECTOR

A particular implementation of the NVECTOR module must:

• Specify the content field of the N_Vector structure.

• Define and implement the vector operations. Note that the names of these routines should be unique to that implementation in order to permit using more than one NVECTOR module (each with different N_Vector internal data representations) in the same code.

• Define and implement user-callable constructor and destructor routines to create and free an N_Vector with the new content field and with ops pointing to the new vector operations.

• Optionally, define and implement additional user-callable routines acting on the newly-defined N_Vector (e.g., a routine to print the content for debugging purposes).

• Optionally, provide accessor macros as needed for that particular implementation to be used to access different parts in the content field of the newly-defined N_Vector.

To aid in the creation of custom NVECTOR modules, the generic NVECTOR module provides two utility functions N_VNewEmpty() and N_VCopyOps(). When used in custom NVECTOR constructors and clone routines these functions will ease the introduction of any new optional vector operations to the NVECTOR API by ensuring that only required operations need to be set, and that all operations are copied when cloning a vector.

N_Vector N_VNewEmpty()

This allocates a new generic N_Vector object and initializes its content pointer and the function pointers in the operations structure to NULL.

Return value: If successful, this function returns an N_Vector object. If an error occurs when allocating the object, then this routine will return NULL.

void N_VFreeEmpty(N_Vector v)

This routine frees the generic N_Vector object, under the assumption that any implementation-specific data that was allocated within the underlying content structure has already been freed. It will additionally test whether the ops pointer is NULL, and, if it is not, it will free it as well.

Arguments:
• v – an N_Vector object

int N_VCopyOps(N_Vector w, N_Vector v)

This function copies the function pointers in the ops structure of w into the ops structure of v.

Arguments:
• w – the vector to copy operations from

• v – the vector to copy operations to

Return value: If successful, this function returns 0. If either of the inputs are NULL or the ops structure of either input is NULL, then is function returns a non-zero value.

Each NVECTOR implementation included in SUNDIALS has a unique identifier specified in enumeration and shown in Table 9.1. It is recommended that a user supplied NVECTOR implementation use the SUNDIALS_NVEC_CUSTOM identifier.

Table 9.1 Vector Identifications associated with vector kernels supplied with SUNDIALS

Vector ID

Vector type

ID Value

SUNDIALS_NVEC_SERIAL

Serial

0

SUNDIALS_NVEC_PARALLEL

Distributed memory parallel (MPI)

1

SUNDIALS_NVEC_OPENMP

OpenMP shared memory parallel

2

3

SUNDIALS_NVEC_PARHYP

hypre ParHyp parallel vector

4

SUNDIALS_NVEC_PETSC

PETSc parallel vector

5

SUNDIALS_NVEC_CUDA

CUDA vector

6

SUNDIALS_NVEC_HIP

HIP vector

7

SUNDIALS_NVEC_SYCL

SYCL vector

8

SUNDIALS_NVEC_RAJA

RAJA vector

9

SUNDIALS_NVEC_OPENMPDEV

10

SUNDIALS_NVEC_TRILINOS

Trilinos Tpetra vector

11

SUNDIALS_NVEC_MANYVECTOR

“ManyVector” vector

12

SUNDIALS_NVEC_MPIMANYVECTOR

MPI-enabled “ManyVector” vector

13

SUNDIALS_NVEC_MPIPLUSX

MPI+X vector

14

SUNDIALS_NVEC_CUSTOM

User-provided custom vector

15

## 9.1.3. Support for complex-valued vectors

While SUNDIALS itself is written under an assumption of real-valued data, it does provide limited support for complex-valued problems. However, since none of the built-in NVECTOR modules supports complex-valued data, users must provide a custom NVECTOR implementation for this task. Many of the NVECTOR routines described in the subsection §9.2 naturally extend to complex-valued vectors; however, some do not. To this end, we provide the following guidance:

While many SUNDIALS solver modules may be utilized on complex-valued data, others cannot. Specifically, although each package’s linear solver interface (e.g., ARKLS or CVLS) may be used on complex-valued problems, none of the built-in SUNMatrix or SUNLinearSolver modules will work (all of the direct linear solvers must store complex-valued data, and all of the iterative linear solvers require N_VDotProd()). Hence a complex-valued user must provide custom linear solver modules for their problem. At a minimum this will consist of a custom SUNLinearSolver implementation (see §11.1.8), and optionally a custom SUNMatrix as well. The user should then attach these modules as normal to the package’s linear solver interface.

Similarly, although both the SUNNonlinearSolver_Newton and SUNNonlinearSolver_FixedPoint modules may be used with any of the IVP solvers (CVODE(S), IDA(S) and ARKODE) for complex-valued problems, the Anderson-acceleration option with SUNNonlinearSolver_FixedPoint cannot be used due to its reliance on N_VDotProd(). By this same logic, the Anderson acceleration feature within KINSOL will also not work with complex-valued vectors.

Finally, constraint-handling features of each package cannot be used for complex-valued data, due to the issue of ordering in the complex plane discussed above with N_VCompare(), N_VConstrMask(), N_VMinQuotient(), N_VConstrMaskLocal() and N_VMinQuotientLocal().

We provide a simple example of a complex-valued example problem, including a custom complex-valued Fortran 2003 NVECTOR module, in the files examples/arkode/F2003_custom/ark_analytic_complex_f2003.f90, examples/arkode/F2003_custom/fnvector_complex_mod.f90, and examples/arkode/F2003_custom/test_fnvector_complex_mod.f90.

# 9.2. Description of the NVECTOR operations

## 9.2.1. Standard vector operations

The standard vector operations defined by the generic N_Vector module are defined as follows. For each of these operations, we give the name, usage of the function, and a description of its mathematical operations below.

N_Vector_ID N_VGetVectorID(N_Vector w)

Returns the vector type identifier for the vector w. It is used to determine the vector implementation type (e.g. serial, parallel, …) from the abstract N_Vector interface. Returned values are given in Table 9.1.

Usage:

id = N_VGetVectorID(w);

N_Vector N_VClone(N_Vector w)

Creates a new N_Vector of the same type as an existing vector w and sets the ops field. It does not copy the vector, but rather allocates storage for the new vector.

Usage:

v = N_VClone(w);

N_Vector N_VCloneEmpty(N_Vector w)

Creates a new N_Vector of the same type as an existing vector w and sets the ops field. It does not allocate storage for the new vector’s data.

Usage:

v = N VCloneEmpty(w);

void N_VDestroy(N_Vector v)

Destroys the N_Vector v and frees memory allocated for its internal data.

Usage:

N_VDestroy(v);

void N_VSpace(N_Vector v, sunindextype *lrw, sunindextype *liw)

Returns storage requirements for the N_Vector v:

• lrw contains the number of realtype words

• liw contains the number of integer words.

This function is advisory only, for use in determining a user’s total space requirements; it could be a dummy function in a user-supplied NVECTOR module if that information is not of interest.

Usage:

N_VSpace(nvSpec, &lrw, &liw);

realtype *N_VGetArrayPointer(N_Vector v)

Returns a pointer to a realtype array from the N_Vector v. Note that this assumes that the internal data in the N_Vector is a contiguous array of realtype and is accesible from the CPU.

This routine is only used in the solver-specific interfaces to the dense and banded (serial) linear solvers, and in the interfaces to the banded (serial) and band-block-diagonal (parallel) preconditioner modules provided with SUNDIALS.

Usage:

vdata = N_VGetArrayPointer(v);

realtype *N_VGetDeviceArrayPointer(N_Vector v)

Returns a device pointer to a realtype array from the N_Vector v. Note that this assumes that the internal data in N_Vector is a contiguous array of realtype and is accessible from the device (e.g., GPU).

This operation is optional except when using the GPU-enabled direct linear solvers.

Usage:

vdata = N_VGetArrayPointer(v);

void N_VSetArrayPointer(realtype *vdata, N_Vector v)

Replaces the data array pointer in an N_Vector with a given array of realtype. Note that this assumes that the internal data in the N_Vector is a contiguous array of realtype. This routine is only used in the interfaces to the dense (serial) linear solver, hence need not exist in a user-supplied NVECTOR module.

Usage:

N_VSetArrayPointer(vdata,v);

void *N_VGetCommunicator(N_Vector v)

Returns a pointer to the MPI_Comm object associated with the vector (if applicable). For MPI-unaware vector implementations, this should return NULL.

Usage:

commptr = N_VGetCommunicator(v);

sunindextype N_VGetLength(N_Vector v)

Returns the global length (number of “active” entries) in the NVECTOR v. This value should be cumulative across all processes if the vector is used in a parallel environment. If v contains additional storage, e.g., for parallel communication, those entries should not be included.

Usage:

global_length = N_VGetLength(v);

sunindextype N_VGetLocalLength(N_Vector v)

Returns the local length (number of “active” entries) in the NVECTOR v. This value should be the length of the array returned by N_VGetArrayPointer() or N_VGetDeviceArrayPointer().

Usage:

local_length = N_VGetLocalLength(v);

void N_VLinearSum(realtype a, N_Vector x, realtype b, N_Vector y, N_Vector z)

Performs the operation z = ax + by, where a and b are realtype scalars and x and y are of type N_Vector:

$z_i = a x_i + b y_i, \quad i=0,\ldots,n-1.$

The output vector z can be the same as either of the input vectors (x or y).

Usage:

N_VLinearSum(a, x, b, y, z);

void N_VConst(realtype c, N_Vector z)

Sets all components of the N_Vector z to realtype c:

$z_i = c, \quad i=0,\ldots,n-1.$

Usage:

N_VConst(c, z);

void N_VProd(N_Vector x, N_Vector y, N_Vector z)

Sets the N_Vector z to be the component-wise product of the N_Vector inputs x and y:

$z_i = x_i y_i, \quad i=0,\ldots,n-1.$

Usage:

N_VProd(x, y, z);

void N_VDiv(N_Vector x, N_Vector y, N_Vector z)

Sets the N_Vector z to be the component-wise ratio of the N_Vector inputs x and y:

$z_i = \frac{x_i}{y_i}, \quad i=0,\ldots,n-1.$

The $$y_i$$ may not be tested for 0 values. It should only be called with a y that is guaranteed to have all nonzero components.

Usage:

N_VDiv(x, y, z);

void N_VScale(realtype c, N_Vector x, N_Vector z)

Scales the N_Vector x by the realtype scalar c and returns the result in z:

$z_i = c x_i, \quad i=0,\ldots,n-1.$

Usage:

N_VScale(c, x, z);

void N_VAbs(N_Vector x, N_Vector z)

Sets the components of the N_Vector z to be the absolute values of the components of the N_Vector x:

$z_i = |x_i|, \quad i=0,\ldots,n-1.$

Usage:

N_VAbs(x, z);

void N_VInv(N_Vector x, N_Vector z)

Sets the components of the N_Vector z to be the inverses of the components of the N_Vector x:

$z_i = \frac{1}{x_i}, \quad i=0,\ldots,n-1.$

This routine may not check for division by 0. It should be called only with an x which is guaranteed to have all nonzero components.

Usage:

N_VInv(x, z);

void N_VAddConst(N_Vector x, realtype b, N_Vector z)

Adds the realtype scalar b to all components of x and returns the result in the N_Vector z:

$z_i = x_i+b, \quad i=0,\ldots,n-1.$

Usage:

N_VAddConst(x, b, z);

realtype N_VDotProd(N_Vector x, N_Vector z)

Returns the value of the dot-product of the N_Vectors x and y:

$d = \sum_{i=0}^{n-1} x_i y_i.$

Usage:

d = N_VDotProd(x, y);

realtype N_VMaxNorm(N_Vector x)

Returns the value of the $$l_{\infty}$$ norm of the N_Vector x:

$m = \max_{0\le i< n} |x_i|.$

Usage:

m = N_VMaxNorm(x);

realtype N_VWrmsNorm(N_Vector x, N_Vector w)

Returns the weighted root-mean-square norm of the N_Vector x with (positive) realtype weight vector w:

$m = \sqrt{\left( \sum_{i=0}^{n-1} (x_i w_i)^2 \right) / n}$

Usage:

m = N_VWrmsNorm(x, w);

realtype N_VWrmsNormMask(N_Vector x, N_Vector w, N_Vector id)

Returns the weighted root mean square norm of the N_Vector x with realtype weight vector w built using only the elements of x corresponding to positive elements of the N_Vector id:

$m = \sqrt{\left( \sum_{i=0}^{n-1} (x_i w_i H(id_i))^2 \right) / n},$

where $$H(\alpha)=\begin{cases} 1 & \alpha>0\\ 0 & \alpha \leq 0\end{cases}$$.

Usage:

m = N_VWrmsNormMask(x, w, id);

realtype N_VMin(N_Vector x)

Returns the smallest element of the N_Vector x:

$m = \min_{0\le i< n} x_i.$

Usage:

m = N_VMin(x);

realtype N_VWl2Norm(N_Vector x, N_Vector w)

Returns the weighted Euclidean $$l_2$$ norm of the N_Vector x with realtype weight vector w:

$m = \sqrt{\sum_{i=0}^{n-1}\left(x_i w_i\right)^2}.$

Usage:

m = N_VWL2Norm(x, w);

realtype N_VL1Norm(N_Vector x)

Returns the $$l_1$$ norm of the N_Vector x:

$m = \sum_{i=0}^{n-1} |x_i|.$

Usage:

m = N_VL1Norm(x);

void N_VCompare(realtype c, N_Vector x, N_Vector z)

Compares the components of the N_Vector x to the realtype scalar c and returns an N_Vector z such that for all $$0\le i< n$$,

$\begin{split}z_i = \begin{cases} 1.0 &\quad\text{if}\; |x_i| \ge c,\\ 0.0 &\quad\text{otherwise}\end{cases}.\end{split}$

Usage:

N_VCompare(c, x, z);

booleantype N_VInvTest(N_Vector x, N_Vector z)

Sets the components of the N_Vector z to be the inverses of the components of the N_Vector x, with prior testing for zero values:

$z_i = \frac{1}{x_i}, \quad i=0,\ldots,n-1.$

This routine returns a boolean assigned to SUNTRUE if all components of x are nonzero (successful inversion) and returns SUNFALSE otherwise.

Usage:

t = N_VInvTest(x, z);

booleantype N_VConstrMask(N_Vector c, N_Vector x, N_Vector m)

Performs the following constraint tests based on the values in $$c_i$$:

$\begin{split}\begin{array}{rllll} x_i &>& 0 \;&\text{if}\; &c_i = 2, \\ x_i &\ge& 0 \;&\text{if}\; &c_i = 1, \\ x_i &<& 0 \;&\text{if}\; &c_i = -2, \\ x_i &\le& 0 \;&\text{if}\; &c_i = -1. \end{array}\end{split}$

There is no constraint on $$x_i$$ if $$c_i = 0$$. This routine returns a boolean assigned to SUNFALSE if any element failed the constraint test and assigned to SUNTRUE if all passed. It also sets a mask vector m, with elements equal to 1.0 where the constraint test failed, and 0.0 where the test passed. This routine is used only for constraint checking.

Usage:

t = N_VConstrMask(c, x, m);

realtype N_VMinQuotient(N_Vector num, N_Vector denom)

This routine returns the minimum of the quotients obtained by termwise dividing the elements of n by the elements in d:

$\min_{0\le i< n} \frac{\text{num}_i}{\text{denom}_i}.$

A zero element in denom will be skipped. If no such quotients are found, then the large value BIG_REAL (defined in the header file sundials_types.h) is returned.

Usage:

minq = N_VMinQuotient(num, denom);


## 9.2.2. Fused operations

The following fused vector operations are optional. These operations are intended to increase data reuse, reduce parallel communication on distributed memory systems, and lower the number of kernel launches on systems with accelerators. If a particular NVECTOR implementation defines one of the fused vector operations as NULL, the NVECTOR interface will call one of the above standard vector operations as necessary. As above, for each operation, we give the name, usage of the function, and a description of its mathematical operations below.

int N_VLinearCombination(int nv, realtype *c, N_Vector *X, N_Vector z)

This routine computes the linear combination of nv vectors with $$n$$ elements:

$z_i = \sum_{j=0}^{nv-1} c_j x_{j,i}, \quad i=0,\ldots,n-1,$

where $$c$$ is an array of $$nv$$ scalars, $$x_j$$ is a vector in the vector array X, and z is the output vector. If the output vector z is one of the vectors in X, then it must be the first vector in the vector array. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VLinearCombination(nv, c, X, z);

int N_VScaleAddMulti(int nv, realtype *c, N_Vector x, N_Vector *Y, N_Vector *Z)

This routine scales and adds one vector to nv vectors with $$n$$ elements:

$z_{j,i} = c_j x_i + y_{j,i}, \quad j=0,\ldots,nv-1 \quad i=0,\ldots,n-1,$

where c is an array of scalars, x is a vector, $$y_j$$ is a vector in the vector array Y, and $$z_j$$ is an output vector in the vector array Z. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VScaleAddMulti(nv, c, x, Y, Z);

int N_VDotProdMulti(int nv, N_Vector x, N_Vector *Y, realtype *d)

This routine computes the dot product of a vector with nv vectors having $$n$$ elements:

$d_j = \sum_{i=0}^{n-1} x_i y_{j,i}, \quad j=0,\ldots,nv-1,$

where d is an array of scalars containing the computed dot products, x is a vector, and $$y_j$$ is a vector the vector array Y. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VDotProdMulti(nv, x, Y, d);


## 9.2.3. Vector array operations

The following vector array operations are also optional. As with the fused vector operations, these are intended to increase data reuse, reduce parallel communication on distributed memory systems, and lower the number of kernel launches on systems with accelerators. If a particular NVECTOR implementation defines one of the fused or vector array operations as NULL, the NVECTOR interface will call one of the above standard vector operations as necessary. As above, for each operation, we give the name, usage of the function, and a description of its mathematical operations below.

int N_VLinearSumVectorArray(int nv, realtype a, N_Vector X, realtype b, N_Vector *Y, N_Vector *Z)

This routine computes the linear sum of two vector arrays of nv vectors with $$n$$ elements:

$z_{j,i} = a x_{j,i} + b y_{j,i}, \quad i=0,\ldots,n-1 \quad j=0,\ldots,nv-1,$

where a and b are scalars, $$x_j$$ and $$y_j$$ are vectors in the vector arrays X and Y respectively, and $$z_j$$ is a vector in the output vector array Z. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VLinearSumVectorArray(nv, a, X, b, Y, Z);

int N_VScaleVectorArray(int nv, realtype *c, N_Vector *X, N_Vector *Z)

This routine scales each element in a vector of $$n$$ elements in a vector array of nv vectors by a potentially different constant:

$z_{j,i} = c_j x_{j,i}, \quad i=0,\ldots,n-1 \quad j=0,\ldots,nv-1,$

where c is an array of scalars, $$x_j$$ is a vector in the vector array X, and $$z_j$$ is a vector in the output vector array Z. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VScaleVectorArray(nv, c, X, Z);

int N_VConstVectorArray(int nv, realtype c, N_Vector *Z)

This routine sets each element in a vector of $$n$$ elements in a vector array of nv vectors to the same value:

$z_{j,i} = c, \quad i=0,\ldots,n-1 \quad j=0,\ldots,nv-1,$

where c is a scalar and $$z_j$$ is a vector in the vector array Z. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VConstVectorArray(nv, c, Z);

int N_VWrmsNormVectorArray(int nv, N_Vector *X, N_Vector *W, realtype *m)

This routine computes the weighted root mean square norm of each vector in a vector array:

$m_j = \left( \frac1n \sum_{i=0}^{n-1} \left(x_{j,i} w_{j,i}\right)^2\right)^{1/2}, \quad j=0,\ldots,nv-1,$

where $$x_j$$ is a vector in the vector array X, $$w_j$$ is a weight vector in the vector array W, and m is the output array of scalars containing the computed norms. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VWrmsNormVectorArray(nv, X, W, m);

int N_VWrmsNormMaskVectorArray(int nv, N_Vector *X, N_Vector *W, N_Vector id, realtype *m)

This routine computes the masked weighted root mean square norm of each vector in a vector array:

$m_j = \left( \frac1n \sum_{i=0}^{n-1} \left(x_{j,i} w_{j,i} H(id_i)\right)^2 \right)^{1/2}, \quad j=0,\ldots,nv-1,$

where $$H(id_i)=1$$ if $$id_i > 0$$ and is zero otherwise, $$x_j$$ is a vector in the vector array X, $$w_j$$ is a weight vector in the vector array W, id is the mask vector, and m is the output array of scalars containing the computed norms. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VWrmsNormMaskVectorArray(nv, X, W, id, m);

int N_VScaleAddMultiVectorArray(int nv, int nsum, realtype *c, N_Vector *X, N_Vector **YY, N_Vector **ZZ)

This routine scales and adds a vector array of nv vectors to nsum other vector arrays:

$z_{k,j,i} = c_k x_{j,i} + y_{k,j,i}, \quad i=0,\ldots,n-1 \quad j=0,\ldots,nv-1, \quad k=0,\ldots,nsum-1$

where c is an array of scalars, $$x_j$$ is a vector in the vector array X, $$y_{k,j}$$ is a vector in the array of vector arrays YY, and $$z_{k,j}$$ is an output vector in the array of vector arrays ZZ. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VScaleAddMultiVectorArray(nv, nsum, c, x, YY, ZZ);

int N_VLinearCombinationVectorArray(int nv, int nsum, realtype *c, N_Vector **XX, N_Vector *Z)

This routine computes the linear combination of nsum vector arrays containing nv vectors:

$z_{j,i} = \sum_{k=0}^{nsum-1} c_k x_{k,j,i}, \quad i=0,\ldots,n-1 \quad j=0,\ldots,nv-1,$

where c is an array of scalars, $$x_{k,j}$$ is a vector in array of vector arrays XX, and $$z_{j,i}$$ is an output vector in the vector array Z. If the output vector array is one of the vector arrays in XX, it must be the first vector array in XX. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VLinearCombinationVectorArray(nv, nsum, c, XX, Z);


## 9.2.4. Local reduction operations

The following local reduction operations are also optional. As with the fused and vector array operations, these are intended to reduce parallel communication on distributed memory systems. If a particular NVECTOR implementation defines one of the local reduction operations as NULL, the NVECTOR interface will call one of the above standard vector operations as necessary. As above, for each operation, we give the name, usage of the function, and a description of its mathematical operations below.

realtype N_VDotProdLocal(N_Vector x, N_Vector y)

This routine computes the MPI task-local portion of the ordinary dot product of x and y:

$d=\sum_{i=0}^{n_{local}-1} x_i y_i,$

where $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications).

Usage:

d = N_VDotProdLocal(x, y);

realtype N_VMaxNormLocal(N_Vector x)

This routine computes the MPI task-local portion of the maximum norm of the NVECTOR x:

$m = \max_{0\le i< n_{local}} | x_i |,$

where $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications).

Usage:

m = N_VMaxNormLocal(x);

realtype N_VMinLocal(N_Vector x)

This routine computes the smallest element of the MPI task-local portion of the NVECTOR x:

$m = \min_{0\le i< n_{local}} x_i,$

where $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications).

Usage:

m = N_VMinLocal(x);

realtype N_VL1NormLocal(N_Vector x)

This routine computes the MPI task-local portion of the $$l_1$$ norm of the N_Vector x:

$n = \sum_{i=0}^{n_{local}-1} | x_i |,$

where $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications).

Usage:

n = N_VL1NormLocal(x);

realtype N_VWSqrSumLocal(N_Vector x, N_Vector w)

This routine computes the MPI task-local portion of the weighted squared sum of the NVECTOR x with weight vector w:

$s = \sum_{i=0}^{n_{local}-1} (x_i w_i)^2,$

where $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications).

Usage:

s = N_VWSqrSumLocal(x, w);

realtype N_VWSqrSumMaskLocal(N_Vector x, N_Vector w, N_Vector id)

This routine computes the MPI task-local portion of the weighted squared sum of the NVECTOR x with weight vector w built using only the elements of x corresponding to positive elements of the NVECTOR id:

$m = \sum_{i=0}^{n_{local}-1} (x_i w_i H(id_i))^2,$

where

$\begin{split}H(\alpha) = \begin{cases} 1 & \alpha > 0 \\ 0 & \alpha \leq 0 \end{cases}\end{split}$

and $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications).

Usage:

s = N_VWSqrSumMaskLocal(x, w, id);

booleantype N_VInvTestLocal(N_Vector x)

This routine sets the MPI task-local components of the NVECTOR z to be the inverses of the components of the NVECTOR x, with prior testing for zero values:

$z_i = \frac{1}{x_i}, \: i=0,\ldots,n_{local}-1$

where $$n_{local}$$ corresponds to the number of components in the vector on this MPI task (or $$n_{local}=n$$ for MPI-unaware applications). This routine returns a boolean assigned to SUNTRUE if all task-local components of x are nonzero (successful inversion) and returns SUNFALSE otherwise.

Usage:

t = N_VInvTestLocal(x);

booleantype N_VConstrMaskLocal(N_Vector c, N_Vector x, N_Vector m)

Performs the following constraint tests based on the values in $$c_i$$:

$\begin{split}\begin{array}{rllll} x_i &>& 0 \;&\text{if}\; &c_i = 2, \\ x_i &\ge& 0 \;&\text{if}\; &c_i = 1, \\ x_i &<& 0 \;&\text{if}\; &c_i = -2, \\ x_i &\le& 0 \;&\text{if}\; &c_i = -1. \end{array}\end{split}$

for all MPI task-local components of the vectors. This routine returns a boolean assigned to SUNFALSE if any task-local element failed the constraint test and assigned to SUNTRUE if all passed. It also sets a mask vector m, with elements equal to 1.0 where the constraint test failed, and 0.0 where the test passed. This routine is used only for constraint checking.

Usage:

t = N_VConstrMaskLocal(c, x, m);

realtype N_VMinQuotientLocal(N_Vector num, N_Vector denom)

This routine returns the minimum of the quotients obtained by term-wise dividing $$num_i$$ by $$denom_i$$, for all MPI task-local components of the vectors. A zero element in denom will be skipped. If no such quotients are found, then the large value BIG_REAL (defined in the header file sundials_types.h) is returned.

Usage:

minq = N_VMinQuotientLocal(num, denom);


## 9.2.5. Single Buffer Reduction Operations

The following optional operations are used to combine separate reductions into a single MPI call by splitting the local computation and communication into separate functions. These operations are used in low-synchronization orthogonalization methods to reduce the number of MPI Allreduce calls. If a particular NVECTOR implementation does not define these operations additional communication will be required.

int N_VDotProdMultiLocal(int nv, N_Vector x, N_Vector *Y, realtype *d)

This routine computes the MPI task-local portion of the dot product of a vector $$x$$ with nv vectors $$y_j$$:

$d_j = \sum_{i=0}^{n_{local}-1} x_i y_{j,i}, \quad j=0,\ldots,nv-1,$

where $$d$$ is an array of scalars containing the computed dot products, $$x$$ is a vector, $$y_j$$ is a vector in the vector array Y, and $$n_{local}$$ corresponds to the number of components in the vector on this MPI task. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VDotProdMultiLocal(nv, x, Y, d);

int N_VDotProdMultiAllReduce(int nv, N_Vector x, realtype *d)

This routine combines the MPI task-local portions of the dot product of a vector $$x$$ with nv vectors:

retval = MPI_Allreduce(MPI_IN_PLACE, d, nv, MPI_SUNREALTYPE, MPI_SUM, comm)


where d is an array of nv scalars containing the local contributions to the dot product and comm is the MPI communicator associated with the vector x. The operation returns 0 for success and a non-zero value otherwise.

Usage:

retval = N_VDotProdMultiAllReduce(nv, x, d);


## 9.2.6. Exchange operations

The following vector exchange operations are also optional and are intended only for use when interfacing with the XBraid library for parallel-in-time integration. In that setting these operations are required but are otherwise unused by SUNDIALS packages and may be set to NULL. For each operation, we give the function signature, a description of the expected behavior, and an example of the function usage.

int N_VBufSize(N_Vector x, sunindextype *size)

This routine returns the buffer size need to exchange in the data in the vector x between computational nodes.

Usage:

flag = N_VBufSize(x, &buf_size)

int N_VBufPack(N_Vector x, void *buf)

This routine fills the exchange buffer buf with the vector data in x.

Usage:

flag = N_VBufPack(x, &buf)

int N_VBufUnpack(N_Vector x, void *buf)

This routine unpacks the data in the exchange buffer buf into the vector x.

Usage:

flag = N_VBufUnpack(x, buf)