With the right hand side operator in place, we are ready to implement the solver. For this simple PDE problem we compute all necessary initial data in the body of the main function and use the forward Euler time integrator to compute the solution till $t=t_{end}$. For more advanced PDE solvers, a built-in \texttt{ode\_solver} class is defined, that helps taking care of initialization and storage of multiple state variables. Declaring type definitions for all components at the beginning of the main file gives a good overview of the solver composition. In this way, it will be easy to control or change solver components at later times. Listing~\ref{ch5:lst:heattypedefs} lists the type definitions\index{type definitions} that are used to assemble the heat conduction solver. %\stefan{elaborate on the choices}.
-\lstset{label=ch5:lst:heattypedefs,caption={Type definitions for all the heat conduction solver components used throughout the remaining code.}}
-\begin{lstlisting}
+\lstset{caption={Type definitions for all the heat conduction solver components used throughout the remaining code.}}
+\begin{lstlisting}[label=ch5:lst:heattypedefs]
typedef double value_type;
typedef laplacian<value_type> rhs_type;
typedef gpulab::grid<value_type> vector_type;
\end{lstlisting}
The grid is by default treated as a device object and memory is allocated on the GPU upon initialization of the grid. Setting up the grid can be done via the property type class. The property class holds information about the discrete and physical dimensions, along with fictitious ghost (halo) layers and periodicity conditions. For the heat conduction problem we use a non-periodic domain within the unit square and with no ghost layers. Listing \ref{ch5:lst:gridsetup} illustrates the grid assembly.
-\lstset{label=ch5:lst:gridsetup,caption={Create 2D grid of discrete dimension \texttt{N} times \texttt{N} and physical dimension $0$ to $1$.}}
-\begin{lstlisting}
+\lstset{caption={Create 2D grid of discrete dimension \texttt{N} times \texttt{N} and physical dimension $0$ to $1$.}}
+\begin{lstlisting}[label=ch5:lst:gridsetup]
// Setup discrete and physical dimensions
gpulab::grid_dim<int> dim(N,N,1);
gpulab::grid_dim<value_type> p0(0,0);
\end{lstlisting}
Hereafter the vector \texttt{u} can be initialized accordingly to \eqref{ch5:eq:heatinit}. Finally we need to instantiate the right hand side Laplacian operator from Listing \ref{ch5:lst:laplaceimpl} and the forward Euler time integrator in order to integrate from $t_0$ till $t_{end}$.
-\lstset{label=ch5:lst:timeintegrator,caption=Create time integrator and the right hand side Laplacian operator.}
-\begin{lstlisting}
+\lstset{caption=Create time integrator and the right hand side Laplacian operator.}
+\begin{lstlisting}[label=ch5:lst:timeintegrator]
rhs_type rhs(alpha); // Create right hand side operator
time_integrator_type solver; // Create time integrator
solver(&rhs,u,0.0f,tend,dt); // Integrate from to tend using dt
In the following section we demonstrate how to assemble a solver for the discrete Poisson problem, using one of the tree iterative methods to efficiently solve \eqref{ch5:eq:poissonsystem}.
-\lstset{label=ch5:lst:dc,caption={Main loop for the iterative defect correction solver. The solver is instantiated with template argument types for the matrix and vector classes, allowing underlying implementations to be based on GPU kernels.}}
-\begin{lstlisting}
+\lstset{caption={Main loop for the iterative defect correction solver. The solver is instantiated with template argument types for the matrix and vector classes, allowing underlying implementations to be based on GPU kernels.}}
+\begin{lstlisting}[label=ch5:lst:dc]
while(r.nrm2() > tol)
{
// Calculate residual
At the beginning of the solver implementation we list the type definitions for the Poisson solver that will be used throughout the implementation. Here we use a geometric multigrid\index{multigrid} method as a preconditioner for the defect correction method. Therefore the multigrid solver is assembled first, so that it can be used in the assembling of the defect correction method. Listing \ref{ch5:lst:poissontypedefs} defines the types for the vector, the matrix, the multigrid preconditioner and the defect correction solver. The geometric multigrid method needs two additional template arguments, that are specific for multigrid, namely a smoother and a grid restriction/interpolation operator. These arguments are free to be implemented and supplied by the developer if special care is required for their specific problems, e.g. for a custom grid structure. For the Poisson problem on a regular grid, the library contains built-in restriction and interpolation operators, and a red-black Gauss-Seidel smoother. We refer to \cite{ch5:Trottenberg2001} for extensive details on multigrid methods. The monitor and config types that appear in Listing \ref{ch5:lst:poissontypedefs} are used for convergence monitoring within the iterative solver and to control run time parameters, such as tolerances, iteration limits, etc.
-\lstset{label=ch5:lst:poissontypedefs,caption={Type definitions for the Laplacian matrix component and the multigrid preconditioned iterative defect correction\index{defect correction} solver.}}
-\begin{lstlisting}
+\lstset{caption={Type definitions for the Laplacian matrix component and the multigrid preconditioned iterative defect correction\index{defect correction} solver.}}
+\begin{lstlisting}[label=ch5:lst:poissontypedefs]
typedef double value_type;
typedef gpulab::grid<value_type> vector_type;
typedef laplacian<vector_type> matrix_type;
With the type definitions set up, the implementation for the Poisson solver follows in Listing \ref{ch5:lst:poissonsolver}. Some of the initializations are left out, as they follow the same procedure as for the Heat conduction example. The defect correction and geometric multigrid solvers are initiated and then the multigrid is set as a preconditioner to the defect correction method. Finally the system is solved via a call to \texttt{solve()}.
-\lstset{label=ch5:lst:poissonsolver,caption={Initializing the preconditioned defect correction solver to approximate the solution to $\mymat{A}\myvec{u}=\myvec{f}$.}}
-\begin{lstlisting}
+\lstset{caption={Initializing the preconditioned defect correction solver to approximate the solution to $\mymat{A}\myvec{u}=\myvec{f}$.}}
+\begin{lstlisting}[label=ch5:lst:poissonsolver]
matrix_type A(alpha); // High-order matrix
matrix_type M(1); // Low-order matrix
U_{n}^{k+1}=\mathcal{\mathcal{G}}_{\Delta T}\left(U_{n-1}^{k+1}\right)+\mathcal{\mathcal{F}}_{\Delta T}\left(U_{n-1}^{k}\right)-\mathcal{\mathcal{G}}_{\Delta T}\left(U_{n-1}^{k}\right),\quad U_{0}^{k}=u^{0},
\end{equation}
with the initial prediction $U^{0}_{n} = \mathcal{G}_{\Delta T}^{n} u^{0}$ for $n=1\ldots N$ and $k=1\ldots K$. $N$ being the number of time subdomains, while $K\geq1$ is the number of predictor-corrector iterations applied. The parareal algorithm is implemented in the library as a separate time integration component, using a fully distributed work scheduling model, as proposed by Aubanel (2010)~\cite{ch5:EA10}. The model is schematically presented in Figure \ref{ch5:fig:FullyDistributedCores}. The parareal component hides all communication and work distribution from the application developer. It is defined such that a user only has to decide what coarse and fine propagators to use. Setting up the type definitions for parareal time integration using forward Euler for coarse propagation and fourth order Runge-Kutta for fine propagation, could then be defined as follows:
-\lstset{label=ch5:lst:parareal,caption={Assembling a parareal time integrator using forward Euler for coarse propagation and a Runge-Kutta method for fine propagation.}}
-\begin{lstlisting}
+\lstset{caption={Assembling a parareal time integrator using forward Euler for coarse propagation and a Runge-Kutta method for fine propagation.}}
+\begin{lstlisting}[label=ch5:lst:parareal]
typedef gpulab::integration::forward_euler coarse;
typedef gpulab::integration::ERK4 fine;
typedef gpulab::integration::parareal<coarse,fine> integrator;