\end{abstract}
\begin{IEEEkeywords}
-Iterative Krylov methods; sparse linear systems; residual minimization; PETSc; %à voir...
+Iterative Krylov methods; sparse linear systems; two stage iteration; least-squares residual minimization; PETSc
\end{IEEEkeywords}
% You must have at least 2 lines in the paragraph with the drop letter
% (should never be an issue)
-Iterative methods have recently become more attractive than direct ones to solve very large
-sparse linear systems. They are more efficient in a parallel
-context, supporting thousands of cores, and they require less memory and arithmetic
-operations than direct methods. This is why new iterative methods are frequently
-proposed or adapted by researchers, and the increasing need to solve very large sparse
-linear systems has triggered the development of such efficient iterative techniques
-suitable for parallel processing.
-
-Most of the successful iterative methods currently available are based on so-called ``Krylov
-subspaces''. They consist in forming a basis of successive matrix
-powers multiplied by an initial vector, which can be for instance the residual. These methods use vectors orthogonality of the Krylov subspace basis in order to solve linear
-systems. The most known iterative Krylov subspace methods are conjugate
-gradient and GMRES ones (Generalized Minimal RESidual).
-
-
-However, iterative methods suffer from scalability problems on parallel
-computing platforms with many processors, due to their need of reduction
-operations, and to collective communications to achieve matrix-vector
+Iterative methods have recently become more attractive than direct ones to solve
+very large sparse linear systems\cite{Saad2003}. They are more efficient in a
+parallel context, supporting thousands of cores, and they require less memory
+and arithmetic operations than direct methods~\cite{bahicontascoutu}. This is
+why new iterative methods are frequently proposed or adapted by researchers, and
+the increasing need to solve very large sparse linear systems has triggered the
+development of such efficient iterative techniques suitable for parallel
+processing.
+
+Most of the successful iterative methods currently available are based on
+so-called ``Krylov subspaces''. They consist in forming a basis of successive
+matrix powers multiplied by an initial vector, which can be for instance the
+residual. These methods use vectors orthogonality of the Krylov subspace basis
+in order to solve linear systems. The most known iterative Krylov subspace
+methods are conjugate gradient and GMRES ones (Generalized Minimal RESidual).
+
+
+However, iterative methods suffer from scalability problems on parallel
+computing platforms with many processors, due to their need of reduction
+operations, and to collective communications to achieve matrix-vector
multiplications. The communications on large clusters with thousands of cores
-and large sizes of messages can significantly affect the performances of these
-iterative methods. As a consequence, Krylov subspace iteration methods are often used
-with preconditioners in practice, to increase their convergence and accelerate their
-performances. However, most of the good preconditioners are not scalable on
-large clusters.
-
-In this research work, a two-stage algorithm based on two nested iterations
-called inner-outer iterations is proposed. This algorithm consists in solving the sparse
-linear system iteratively with a small number of inner iterations, and restarting
-the outer step with a new solution minimizing some error functions over some
-previous residuals. This algorithm is iterative and easy to parallelize on large
-clusters. Furthermore, the minimization technique improves its convergence and
-performances.
+and large sizes of messages can significantly affect the performances of these
+iterative methods. As a consequence, Krylov subspace iteration methods are often
+used with preconditioners in practice, to increase their convergence and
+accelerate their performances. However, most of the good preconditioners are
+not scalable on large clusters.
+
+In this research work, a two-stage algorithm based on two nested iterations
+called inner-outer iterations is proposed. This algorithm consists in solving
+the sparse linear system iteratively with a small number of inner iterations,
+and restarting the outer step with a new solution minimizing some error
+functions over some previous residuals. For further information on two-stage
+iteration methods, interested readers are invited to
+consult~\cite{Nichols:1973:CTS}. Two-stage algorithms are easy to parallelize on
+large clusters. Furthermore, the least-squares minimization technique improves
+its convergence and performances.
The present article is organized as follows. Related works are presented in
Section~\ref{sec:02}. Section~\ref{sec:03} details the two-stage algorithm using
iterations phase. In the remainder, the $i$-th column vector of $S$ will be
denoted by $S_i$.
+$\|r_n\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{n/2} \|r_0\|,$
+In the general case, where A is not positive definite, we have
+
+$\|r_n\| \le \inf_{p \in P_n} \|p(A)\| \le \kappa_2(V) \inf_{p \in P_n} \max_{\lambda \in \sigma(A)} |p(\lambda)| \|r_0\|, \,$
+
+
At each $s$ iterations, another kind of minimization step is applied in order to
compute a new solution $x$. For that, the previous residuals of $Ax=b$ are computed by
the inner iterations with $(b-AS)$. The minimization of the residuals is obtained by
\State Set the initial guess $x_0$
\For {$k=1,2,3,\ldots$ until convergence (error$<\epsilon_{tsirm}$)} \label{algo:conv}
\State $[x_k,error]=Solve(A,b,x_{k-1},max\_iter_{kryl})$ \label{algo:solve}
- \State $S_{k \mod s}=x_k$ \label{algo:store}
+ \State $S_{k \mod s}=x_k$ \label{algo:store} \Comment{update column (k mod s) of S}
\If {$k \mod s=0$ {\bf and} error$>\epsilon_{kryl}$}
\State $R=AS$ \Comment{compute dense matrix} \label{algo:matrix_mul}
\State $\alpha=Least\_Squares(R,b,max\_iter_{ls})$ \label{algo:}
In order to see the influence of our algorithm with only one processor, we first
-show a comparison with the standard version of GMRES and our algorithm. In
-Table~\ref{tab:01}, we show the matrices we have used and some of them
-characteristics. For all the matrices, the name, the field, the number of rows
-and the number of nonzero elements are given.
+show a comparison with GMRES or FGMRES and our algorithm. In Table~\ref{tab:01},
+we show the matrices we have used and some of them characteristics. Those
+matrices are chosen from the Davis collection of the University of
+Florida~\cite{Dav97}. They are matrices arising in real-world applications. For
+all the matrices, the name, the field, the number of rows and the number of
+nonzero elements are given.
\begin{table}[htbp]
\begin{center}
In Table~\ref{tab:02}, some experiments comparing the solving of the linear
systems obtained with the previous matrices with a GMRES variant and with out 2
stage algorithm are given. In the second column, it can be noticed that either
-gmres or fgmres is used to solve the linear system. According to the matrices,
-different preconditioner is used. With TSIRM, the same solver and the same
-preconditionner are used. This Table shows that TSIRM can drastically reduce the
-number of iterations to reach the convergence when the number of iterations for
-the normal GMRES is more or less greater than 500. In fact this also depends on
-tow parameters: the number of iterations to stop GMRES and the number of
-iterations to perform the minimization.
+GRMES or FGMRES (Flexible GMRES)~\cite{Saad:1993} is used to solve the linear
+system. According to the matrices, different preconditioner is used. With
+TSIRM, the same solver and the same preconditionner are used. This Table shows
+that TSIRM can drastically reduce the number of iterations to reach the
+convergence when the number of iterations for the normal GMRES is more or less
+greater than 500. In fact this also depends on tow parameters: the number of
+iterations to stop GMRES and the number of iterations to perform the
+minimization.
\begin{table}[htbp]