\{\texttt{lilia.ziane\_khoja},~\texttt{raphael.couturier},~\texttt{arnaud.giersch},~\texttt{jacques.bahi}\}\texttt{@univ-fcomte.fr}
}
\newcommand{\BW}{\mathit{bw}}
\newcommand{\Iter}{\mathit{iter}}
\newcommand{\Max}{\mathit{max}}
\newcommand{\Offset}{\mathit{offset}}
which its local sub-matrix has nonzero values. Consequently, each computing node manages a global
vector composed of a local vector of size $\frac{n}{p}$ and a shared vector of size $S$:
\begin{equation}
 S = \BW - \frac{n}{p},
\label{eq:11}
\end{equation}
where $\frac{n}{p}$ is the size of the local vector and $\BW$ is the bandwidth of the local sparse
sub-matrix, that is, the number of columns between its minimum and maximum nonzero column indices
(see Figure~\ref{fig:01}). In order to improve memory accesses, we use the texture memory to
cache elements of the global vector.