% affiliations
\author{\IEEEauthorblockN{Rapha\"el Couturier\IEEEauthorrefmark{1}, Lilia Ziane Khodja\IEEEauthorrefmark{2}, and Christophe Guyeux\IEEEauthorrefmark{1}}
-\IEEEauthorblockA{\IEEEauthorrefmark{1} Femto-ST Institute, University of Franche Comte, France\\
+\IEEEauthorblockA{\IEEEauthorrefmark{1} Femto-ST Institute, University of Franche-Comt\'e, France\\
Email: \{raphael.couturier,christophe.guyeux\}@univ-fcomte.fr}
\IEEEauthorblockA{\IEEEauthorrefmark{2} INRIA Bordeaux Sud-Ouest, France\\
Email: lilia.ziane@inria.fr}
However, iterative methods suffer from scalability problems on parallel
computing platforms with many processors, due to their need of reduction
-operations, and to collective communications to achive matrix-vector
+operations, and to collective communications to achieve matrix-vector
multiplications. The communications on large clusters with thousands of cores
and large sizes of messages can significantly affect the performances of these
iterative methods. As a consequence, Krylov subspace iteration methods are often used
At each outer iteration, the sparse linear system $Ax=b$ is partially
solved using only $m$
iterations of an iterative method, this latter being initialized with the
-best known approximation previously obtained.
-GMRES method~\cite{Saad86}, or any of its variants, can be used for instance as an
-inner solver. The current approximation of the Krylov method is then stored inside a matrix
-$S$ composed by the successive solutions that are computed during inner iterations.
+last obtained approximation.
+GMRES method~\cite{Saad86}, or any of its variants, can potentially be used as
+inner solver. The current approximation of the Krylov method is then stored inside a $n \times s$ matrix
+$S$, which is composed by the $s$ last solutions that have been computed during
+the inner iterations phase.
At each $s$ iterations, the minimization step is applied in order to
compute a new solution $x$. For that, the previous residuals of $Ax=b$ are computed by
Suppose now that the claim holds for all $m=1, 2, \hdots, k-1$, that is, $\forall m \in \{1,2,\hdots, k-1\}$, $||r_m|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$.
We will show that the statement holds too for $r_k$. Two situations can occur:
\begin{itemize}
-\item If $k \mod m \neq 0$, then the TSIRM algorithm consists in executing GMRES once. In that case, we obtain $||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_0||$.
-
-\item Else, let $\operatorname{span}(S) = \left \{ {\sum_{i=1}^k \lambda_i v_i \Big| k \in \mathbb{N}, v_i \in S, \lambda _i \in \mathbb{R}} \right \}$ be the linear span of a set of real vectors $S$. So,\\
+\item If $k \mod m \neq 0$, then the TSIRM algorithm consists in executing GMRES once. In that case, we obtain $||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$ by the inductive hypothesis.
+\item Else, the TSIRM algorithm consists in two stages: a first GMRES($m$) execution leads to a temporary $x_k$ whose residue satisfies $||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$, and a least squares resolution.
+Let $\operatorname{span}(S) = \left \{ {\sum_{i=1}^k \lambda_i v_i \Big| k \in \mathbb{N}, v_i \in S, \lambda _i \in \mathbb{R}} \right \}$ be the linear span of a set of real vectors $S$. So,\\
$\min_{\alpha \in \mathbb{R}^s} ||b-R\alpha ||_2 = \min_{\alpha \in \mathbb{R}^s} ||b-AS\alpha ||_2$
$\begin{array}{ll}
-& = \min_{x \in span\left(S_{k-s}, S_{k-s+1}, \hdots, S_{k-1} \right)} ||b-AS\alpha ||_2\\
-& = \min_{x \in span\left(x_{k-s}, x_{k-s}+1, \hdots, x_{k-1} \right)} ||b-AS\alpha ||_2\\
-& \leqslant \min_{x \in span\left( x_{k-1} \right)} ||b-Ax ||_2\\
-& \leqslant \min_{\lambda \in \mathbb{R}} ||b-\lambda Ax_{k-1} ||_2\\
-& \leqslant ||b-Ax_{k-1}||_2 .
+& = \min_{x \in span\left(S_{k-s+1}, S_{k-s+2}, \hdots, S_{k} \right)} ||b-AS\alpha ||_2\\
+& = \min_{x \in span\left(x_{k-s+1}, x_{k-s}+2, \hdots, x_{k} \right)} ||b-AS\alpha ||_2\\
+& \leqslant \min_{x \in span\left( x_{k} \right)} ||b-Ax ||_2\\
+& \leqslant \min_{\lambda \in \mathbb{R}} ||b-\lambda Ax_{k} ||_2\\
+& \leqslant ||b-Ax_{k}||_2\\
+& = ||r_k||_2\\
+& \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||,
\end{array}$
\end{itemize}
+which concludes the induction and the proof.
\end{proof}
We can remark that, at each iterate, the residue of the TSIRM algorithm is lower
%%%*********************************************************
%%%*********************************************************
-
-future plan : \\
-- study other kinds of matrices, problems, inner solvers\\
-- test the influence of all parameters\\
-- adaptative number of outer iterations to minimize\\
-- other methods to minimize the residuals?\\
-- implement our solver inside PETSc
+A novel two-stage iterative algorithm has been proposed in this article,
+in order to accelerate the convergence Krylov iterative methods.
+Our TSIRM proposal acts as a merger between Krylov based solvers and
+a least-squares minimization step.
+The convergence of the method has been proven in some situations, while
+experiments up to 16,394 cores have been led to verify that TSIRM runs
+5 or 7 times faster than GMRES.
+
+
+For future work, the authors' intention is to investigate
+other kinds of matrices, problems, and inner solvers. The
+influence of all parameters must be tested too, while
+other methods to minimize the residuals must be regarded.
+The number of outer iterations to minimize should become
+adaptative to improve the overall performances of the proposal.
+Finally, this solver will be implemented inside PETSc.
% conference papers do not normally have an appendix