X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/GMRES2stage.git/blobdiff_plain/534d3695d610fd2e8face91f6516d39e2473580d..299e2e52d19b38d91f622d1eb8b7af2bb44c7685:/paper.tex

diff --git a/paper.tex b/paper.tex
index 463fe2c..4757973 100644
--- a/paper.tex
+++ b/paper.tex
@@ -370,10 +370,7 @@
 % paper title
 % can use linebreaks \\ within to get better formatting as desired
 \title{TSIRM: A Two-Stage Iteration with least-square Residual Minimization algorithm to solve large sparse linear systems}
-%où
-%\title{A two-stage algorithm with error minimization to solve large sparse linear systems}
-%où
-%\title{???}
+
 
 
 
@@ -607,7 +604,7 @@ is summarized while intended perspectives are provided.
 
 %%%*********************************************************
 %%%*********************************************************
-\section{Two-stage algorithm with least-square residuals minimization}
+\section{Two-stage iteration with least-square residuals minimization algorithm}
 \label{sec:03}
 A two-stage algorithm is proposed to solve large sparse linear systems of the
 form $Ax=b$, where $A\in\mathbb{R}^{n\times n}$ is a sparse and square
@@ -639,7 +636,7 @@ with $R=AS$. Then the new solution $x$ is computed with $x=S\alpha$.
 
 
 In practice, $R$ is a dense rectangular matrix belonging in $\mathbb{R}^{n\times s}$,
-with $s\ll n$. In order to minimize~(\eqref{eq:01}), a least-square method such as
+with $s\ll n$. In order to minimize~\eqref{eq:01}, a least-square method such as
 CGLS ~\cite{Hestenes52} or LSQR~\cite{Paige82} is used. Remark that these methods
 are more appropriate than a single direct method in a parallel context.
 
@@ -675,7 +672,7 @@ $\epsilon_{tsirm}$). Line~\ref{algo:store}, $S_{k~ mod~ s}=x^k$ consists in cop
 solution $x_k$ into the column $k~ mod~ s$ of the matrix $S$. After the
 minimization, the matrix $S$ is reused with the new values of the residuals. To
 solve the minimization problem, an iterative method is used. Two parameters are
-required for that: the maximum number of iteration and the threshold to stop the
+required for that: the maximum number of iterations and the threshold to stop the
 method.
 
 Let us summarize the most important parameters of TSIRM:
@@ -698,7 +695,7 @@ colums in practice.
 As explained previously, at least two methods seem to be interesting to solve
 the least-square minimization, CGLS and LSQR. In the following we remind the
 CGLS algorithm. The LSQR method follows more or
-less the same principle but it take more place, so we briefly explain the parallelization of CGLS which is similar to LSQR.
+less the same principle but it takes more place, so we briefly explain the parallelization of CGLS which is similar to LSQR.
 
 \begin{algorithm}[t]
 \caption{CGLS}
@@ -725,7 +722,7 @@ less the same principle but it take more place, so we briefly explain the parall
 
 
 In each iteration of CGLS, there is two matrix-vector multiplications and some
-classical operations: dots, norm, multiplication and addition on vectors. All
+classical operations: dot product, norm, multiplication and addition on vectors. All
 these operations are easy to implement in PETSc or similar environment.
 
 
@@ -757,7 +754,7 @@ In order to see the influence of our algorithm with only one processor, we first
 show a comparison with the standard version of GMRES and our algorithm. In
 Table~\ref{tab:01}, we show the matrices we have used and some of them
 characteristics. For all the matrices, the name, the field, the number of rows
-and the number of nonzero elements is given.
+and the number of nonzero elements are given.
 
 \begin{table}[htbp]
 \begin{center}
@@ -780,7 +777,7 @@ torso3 & 2D/3D problem & 259,156 & 4,429,042 \\
 
 The following parameters have been chosen for our experiments. As by default
 the restart of GMRES is performed every 30 iterations, we have chosen to stop
-the GMRES every 30 iterations, $max\_iter_{kryl}=30$). $s$ is set to 8. CGLS is
+the GMRES every 30 iterations (\emph{i.e.} $max\_iter_{kryl}=30$). $s$ is set to 8. CGLS is
 chosen to minimize the least-squares problem with the following parameters:
 $\epsilon_{ls}=1e-40$ and $max\_iter_{ls}=20$. The external precision is set to
 $\epsilon_{tsirm}=1e-10$. Those experiments have been performed on a Intel(R)