Proofreading

[GMRES2stage.git] / paper.tex
diff --git a/paper.tex b/paper.tex

index 012e7f1fbfa748a2469867c005f7f1af70ff03b8..a4545fd4733e8da1759f78dd354195c238bf34c7 100644
--- a/paper.tex
+++ b/paper.tex
@@ -381,7 +381,7 @@
  % affiliations
  
  \author{\IEEEauthorblockN{Rapha\"el Couturier\IEEEauthorrefmark{1}, Lilia Ziane Khodja\IEEEauthorrefmark{2}, and Christophe Guyeux\IEEEauthorrefmark{1}}
-\IEEEauthorblockA{\IEEEauthorrefmark{1} Femto-ST Institute, University of Franche Comte, France\\
+\IEEEauthorblockA{\IEEEauthorrefmark{1} Femto-ST Institute, University of Franche-Comt\'e, France\\
  Email: \{raphael.couturier,christophe.guyeux\}@univ-fcomte.fr}
  \IEEEauthorblockA{\IEEEauthorrefmark{2} INRIA Bordeaux Sud-Ouest, France\\
  Email: lilia.ziane@inria.fr}
@@ -564,7 +564,7 @@ gradient and GMRES ones (Generalized Minimal RESidual).
  
  However,  iterative  methods suffer  from scalability  problems  on parallel
  computing  platforms  with many  processors, due  to  their need  of  reduction
-operations, and to  collective    communications   to  achive   matrix-vector
+operations, and to  collective    communications   to  achieve   matrix-vector
  multiplications. The  communications on large  clusters with thousands  of cores
  and  large  sizes of  messages  can  significantly  affect the  performances  of these
  iterative methods. As a consequence, Krylov subspace iteration methods are often used
@@ -621,19 +621,20 @@ outer solver periodically applies a least-squares minimization  on the residuals
  At each outer iteration, the sparse linear system $Ax=b$ is partially 
  solved using only $m$
  iterations of an iterative method, this latter being initialized with the 
-best known approximation previously obtained. 
-GMRES method~\cite{Saad86}, or any of its variants, can be used for instance as an
-inner solver. The current approximation of the Krylov method is then stored inside a matrix
-$S$ composed by the successive solutions that are computed during inner iterations.
+last obtained approximation. 
+The GMRES method~\cite{Saad86}, or any of its variants, can potentially be used as
+an inner solver. The current approximation of the Krylov method is then stored inside an $n \times s$ matrix
+$S$, which is composed of the last $s$ solutions that have been computed during 
+the inner iterations phase.
  
-At each $s$ iterations, the minimization step is applied in order to
+Every $s$ iterations, another kind of minimization step is applied in order to
  compute a new  solution $x$. For that, the previous  residuals of $Ax=b$ are computed by
  the inner iterations with $(b-AS)$. The minimization of the residuals is obtained by  
  \begin{equation}
     \underset{\alpha\in\mathbb{R}^{s}}{min}\|b-R\alpha\|_2
  \label{eq:01}
  \end{equation}
-with $R=AS$. Then the new solution $x$ is computed with $x=S\alpha$.
+with $R=AS$. The new solution $x$ is then computed with $x=S\alpha$.
  
  
  In  practice, $R$  is a  dense rectangular  matrix belonging in  $\mathbb{R}^{n\times s}$,
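
The minimization step described in the hunk above (form $R=AS$, find $\alpha$ minimizing $\|b-R\alpha\|_2$, then set $x=S\alpha$) can be illustrated with a small sketch. This is not the authors' implementation: it uses dense Python lists and solves the normal equations $R^{T}R\alpha=R^{T}b$, a technique swapped in here purely for readability. The helper names (`matvec`, `solve_dense`, `minimize_residual`) and the toy data are assumptions made for illustration.

```python
# Illustrative sketch only: dense lists and normal equations stand in for the
# sparse, parallel least-squares solver a real implementation would use.

def matvec(A, x):
    """Return A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm2(v):
    """Euclidean norm of a vector."""
    return sum(vi * vi for vi in v) ** 0.5

def residual_norm(A, x, b):
    """Return ||b - A x||_2."""
    return norm2([bi - yi for bi, yi in zip(b, matvec(A, x))])

def solve_dense(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def minimize_residual(A, S_cols, b):
    """Given the s stored iterates (columns of S), return (x, alpha) where
    alpha minimizes ||b - R alpha||_2 with R = A S, and x = S alpha."""
    R_cols = [matvec(A, col) for col in S_cols]            # columns of R = A S
    s = len(S_cols)
    gram = [[sum(ri * rj for ri, rj in zip(R_cols[i], R_cols[j]))
             for j in range(s)] for i in range(s)]         # R^T R
    rhs = [sum(ri * bi for ri, bi in zip(R_cols[i], b))
           for i in range(s)]                              # R^T b
    alpha = solve_dense(gram, rhs)
    x = [sum(alpha[j] * S_cols[j][i] for j in range(s))
         for i in range(len(b))]                           # x = S alpha
    return x, alpha

# Toy data (made up for illustration): a 3x3 system and s = 2 stored iterates.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
S_cols = [[0.1, 0.2, 1.0], [0.0, 0.3, 1.2]]

x_new, alpha = minimize_residual(A, S_cols, b)
```

Because every stored iterate lies in the span of the columns of $S$, the residual of `x_new` cannot exceed the residual of any single stored iterate, which is the inequality the convergence proof relies on. Normal equations can be ill-conditioned when the stored iterates are nearly collinear, so a production solver would likely prefer an iterative least-squares method.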
@@ -663,8 +664,8 @@ appropriate than a single direct method in a parallel context.
  \label{algo:01}
  \end{algorithm}
  
-Algorithm~\ref{algo:01}  summarizes  the principle  of  our  method.  The  outer
-iteration is  inside the for  loop. Line~\ref{algo:solve}, the Krylov  method is
+Algorithm~\ref{algo:01}  summarizes  the principle  of  the proposed  method.  The  outer
+iteration is  inside the \emph{for}  loop. In line~\ref{algo:solve}, the Krylov  method is
  called for a  maximum of $max\_iter_{kryl}$ iterations.  In practice, we  suggest setting this parameter
  equal to  the restart  number of the  GMRES-like method. Moreover,  a tolerance
  threshold must be specified for the  solver. In practice, this threshold must be
@@ -774,13 +775,16 @@ Let $\operatorname{span}(S) = \left \{ {\sum_{i=1}^k \lambda_i v_i \Big| k \in \
  $\min_{\alpha \in \mathbb{R}^s} ||b-R\alpha ||_2 = \min_{\alpha \in \mathbb{R}^s} ||b-AS\alpha ||_2$
  
  $\begin{array}{ll}
-& = \min_{x \in span\left(S_{k-s}, S_{k-s+1}, \hdots, S_{k-1} \right)} ||b-AS\alpha ||_2\\
-& = \min_{x \in span\left(x_{k-s}, x_{k-s}+1, \hdots, x_{k-1} \right)} ||b-AS\alpha ||_2\\
-& \leqslant \min_{x \in span\left( x_{k-1} \right)} ||b-Ax ||_2\\
-& \leqslant \min_{\lambda \in \mathbb{R}} ||b-\lambda Ax_{k-1} ||_2\\
-& \leqslant ||b-Ax_{k-1}||_2 .
+& = \min_{x \in span\left(S_{k-s+1}, S_{k-s+2}, \hdots, S_{k} \right)} ||b-Ax ||_2\\
+& = \min_{x \in span\left(x_{k-s+1}, x_{k-s+2}, \hdots, x_{k} \right)} ||b-Ax ||_2\\
+& \leqslant \min_{x \in span\left( x_{k} \right)} ||b-Ax ||_2\\
+& \leqslant \min_{\lambda \in \mathbb{R}} ||b-\lambda Ax_{k} ||_2\\
+& \leqslant ||b-Ax_{k}||_2\\
+& = ||r_k||_2\\
+& \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||,
  \end{array}$
  \end{itemize}
+which concludes the induction and the proof.
  \end{proof}
  
  We can remark that, at each iterate, the residue of the TSIRM algorithm is lower 
@@ -1026,13 +1030,22 @@ In Table~\ref{tab:04}, some experiments with example ex54 on the Curie architect
  %%%*********************************************************
  %%%*********************************************************
  
-
-future plan : \\
-- study other kinds of matrices, problems, inner solvers\\
-- test the influence of all parameters\\
-- adaptative number of outer iterations to minimize\\
-- other methods to minimize the residuals?\\
-- implement our solver inside PETSc
+A novel two-stage iterative  algorithm has been proposed in this article,
+in order to accelerate the convergence of Krylov iterative  methods.
+Our TSIRM proposal acts as a merger between Krylov-based solvers and
+a least-squares minimization step.
+The convergence of the method has been proven in some situations, while 
+experiments on up to 16,394 cores have been conducted to verify that TSIRM runs
+5 or  7 times  faster than GMRES.
+
+
+For future work, the authors intend to investigate 
+other kinds of matrices, problems, and inner solvers. The 
+influence of all parameters must be tested too, and 
+other methods to minimize the residuals should be considered.
+The number of outer iterations to minimize should become 
+adaptive to improve the overall performance of the proposal.
+Finally, this solver will be implemented inside PETSc.
  
  
  % conference papers do not normally have an appendix