${\cal M}_{ij}f_j$ is used to transform the distributions into the
hydrodynamic quantities, where ${\cal M}_{ij}$ is a constant $19\times 19$
matrix related to the choice of
-$\mathbf{c}_i$. The non-conserved hydrodynamic quantities are then
+$\mathbf{c}_i$. The nonconserved hydrodynamic quantities are then
relaxed toward their (known) equilibrium values and are transformed
back to new post-collision distributions via the inverse transformation
${\cal M}^{-1}_{ij}$. This gives rise to the need for a minimum of $2\times 19^2$
possible. For each data structure, such as the distribution, a separate
analogue is maintained in both the CPU and GPU memory spaces. However,
the GPU copy does not include the complete CPU structure: in
-particular, non-intrinsic datatypes such as MPI datatypes are not
+particular, nonintrinsic datatypes such as MPI datatypes are not
required on the GPU. Functions to marshal data between CPU and GPU
are provided for each data structure, abstracting the underlying
CUDA implementation. (This reasonably lightweight abstraction layer