-In Listing~\ref{ch2:lst:ex3}, in the CPU computation, this part of code is
-performed using 3 loops, one for $i$, one for $j$ and one for $k$. In order to
-perform the same computation on a GPU, a naive solution consists in considering
-that the matrix $C$ is split into 2 dimensional blocks. The size of each block
-must be chosen such as the number of threads per block is inferior to $1,024$.
+In Listing~\ref{ch2:lst:ex3}, the CPU computation is performed using three
+nested loops, one for $i$, one for $j$, and one for $k$. To perform the same
+computation on a GPU, a naive solution is to split the matrix $C$ into
+two-dimensional blocks. The size of each block must be chosen such that the
+number of threads per block does not exceed $1{,}024$.
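+
+As a rough illustration of this constraint, assume square blocks of
+$w \times w$ threads, an $n \times n$ matrix $C$, and one thread per element
+of $C$ (the symbols $w$ and $n$ and the one-thread-per-element mapping are
+assumptions made for this sketch, not taken from the listing). Then
+\[
+  w^2 \le 1{,}024 \quad\Longrightarrow\quad w \le 32,
+\]
+so a $32 \times 32$ block would be the largest admissible square block, and
+covering $C$ would require
+$\lceil n/w \rceil \times \lceil n/w \rceil$ blocks.
+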