X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/GMRES2stage.git/blobdiff_plain/70ad0cb4dcbf00dcc4bade65051bf0273854d231..fc8e999ae15f3227ad6bfa96711b70236c809e77:/paper.tex diff --git a/paper.tex b/paper.tex index 51eab5c..c559dda 100644 --- a/paper.tex +++ b/paper.tex @@ -374,6 +374,9 @@ %\title{???} + + + % author names and affiliations % use a multiple column layout for up to two different % affiliations @@ -543,12 +546,14 @@ Iterative Krylov methods; sparse linear systems; error minimization; PETSc; %à % no \IEEEPARstart % You must have at least 2 lines in the paragraph with the drop letter % (should never be an issue) -Iterative methods are become more attractive than direct ones to solve very -large sparse linear systems. They are more effective in a parallel context and -require less memory and arithmetic operations than direct methods. A number of -iterative methods are proposed and adapted by many researchers and the increased -need for solving very large sparse linear systems triggered the development of -efficient iterative techniques suitable for the parallel processing. + +Iterative methods became more attractive than direct ones to solve very large +sparse linear systems. Iterative methods are more effecient in a parallel +context, with thousands of cores, and require less memory and arithmetic +operations than direct methods. A number of iterative methods are proposed and +adapted by many researchers and the increased need for solving very large sparse +linear systems triggered the development of efficient iterative techniques +suitable for the parallel processing. Most of the successful iterative methods currently available are based on Krylov subspaces which consist in forming a basis of a sequence of successive matrix @@ -570,13 +575,16 @@ large clusters. In this paper we propose a two-stage algorithm based on two nested iterations called inner-outer iterations. This algorithm consists in solving the sparse linear system iteratively with a small number of inner iterations and restarts -the outer step with a new solution minimizing some error functions over a Krylov -subspace. This algorithm is iterative and easy to parallelize on large clusters -and the minimization technique improves its convergence and performances. +the outer step with a new solution minimizing some error functions over some +previous residuals. This algorithm is iterative and easy to parallelize on large +clusters and the minimization technique improves its convergence and +performances. The present paper is organized as follows. In Section~\ref{sec:02} some related -works are presented. Section~\ref{sec:03} presents our two-stage algorithm based -on Krylov subspace iteration methods. Section~\ref{sec:04} shows some +works are presented. Section~\ref{sec:03} presents our two-stage algorithm using +a least-square residual minimization. Section~\ref{sec:04} describes some +convergence results on this method. + Section~\ref{sec:05} shows some experimental results obtained on large clusters of our algorithm using routines of PETSc toolkit. %%%********************************************************* @@ -661,12 +669,13 @@ reused with the new values of the residuals. %%%********************************************************* %%%********************************************************* - +\section{Convergence results} +\label{sec:04} %%%********************************************************* %%%********************************************************* \section{Experiments using petsc} -\label{sec:04} +\label{sec:05} In order to see the influence of our algorithm with only one processor, we first @@ -771,7 +780,28 @@ Larger experiments .... \end{table*} +\begin{table*} +\begin{center} +\begin{tabular}{|r|r|r|r|r|r|r|r|r|} +\hline + nb. cores & threshold & \multicolumn{2}{c|}{gmres variant} & \multicolumn{2}{c|}{2 stage CGLS} & \multicolumn{2}{c|}{2 stage LSQR} & best gain \\ +\cline{3-8} + & & Time & \# Iter. & Time & \# Iter. & Time & \# Iter. & \\\hline \hline + 2,048 & 8e-5 & 108.88 & 16,560 & 23.06 & 3,630 & 22.79 & 3,630 & 4.77 \\ + 2,048 & 6e-5 & 194.01 & 30,270 & 35.50 & 5,430 & 27.74 & 4,350 & 6.99 \\ + 4,096 & 7e-5 & 160.59 & 22,530 & 35.15 & 5,130 & 29.21 & 4,350 & 5.49 \\ + 4,096 & 6e-5 & 249.27 & 35,520 & 52.13 & 7,950 & 39.24 & 5,790 & 6.35 \\ + 8,192 & 6e-5 & 149.54 & 17,280 & 28.68 & 3,810 & 29.05 & 3,990 & 5.21 \\ + 8,192 & 5e-5 & 792.11 & 109,590 & 76.83 & 10,470 & 65.20 & 9,030 & 12.14 \\ + 16,384 & 4e-5 & 718.61 & 86,400 & 98.98 & 10,830 & 131.86 & 14,790 & 7.26 \\ +\hline + +\end{tabular} +\caption{Comparison of FGMRES and 2 stage FGMRES algorithms for ex54 of Petsc (both with the MG preconditioner) with 25000 components per core on Curie (restart=30, s=12), time is expressed in seconds.} +\label{tab:04} +\end{center} +\end{table*} %%%********************************************************* %%%********************************************************* @@ -789,6 +819,7 @@ Larger experiments .... future plan : \\ - study other kinds of matrices, problems, inner solvers\\ - adaptative number of outer iterations to minimize\\ +- other methods to minimize the residuals?\\ - implement our solver inside PETSc