${\cal M}_{ij}f_j$ is used to transform the distributions into the
hydrodynamic quantities, where ${\cal M}_{ij}$ is a constant 19x19
matrix related to the choice of
-$\mathbf{c}_i$. The non-conserved hydrodynamic quantities are then
+$\mathbf{c}_i$. The nonconserved hydrodynamic quantities are then
relaxed toward their (known) equilibrium values and are transformed
back to new post-collision distributions via the inverse transformation
${\cal M}^{-1}_{ij}$. This gives rise to the need for a minimum of $2\times 19^2$
version, the necessary transfers are implemented in place using
a vector of MPI datatypes with appropriate stride for each direction.
-
+\clearpage
\section{Single GPU implementation}\label{ch14:sec:singlegpu}
possible. For each data structure, such as the distribution, a separate
analogue is maintained in both the CPU and GPU memory spaces. However,
the GPU copy does not include the complete CPU structure: in
-particular, non-intrinsic datatypes such as MPI datatypes are not
+particular, nonintrinsic datatypes such as MPI datatypes are not
required on the GPU. Functions to marshal data between CPU and GPU
are provided for each data structure, abstracting the underlying
CUDA implementation. (This reasonably lightweight abstraction layer
-
+\clearpage
\section{Summary}
\label{ch14:sec:summary}
% set second argument of \begin to the number of references
% (used to reserve space for the reference number labels box)
+\clearpage
\putbib[Chapters/chapter14/biblio14]
%\begin{thebibliography}{1}