X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/GMRES2stage.git/blobdiff_plain/c58e487c34213347636db95b9de6e3db660b6c4b..26d38e217c09735a23eb667846b3869559154681:/IJHPCN/paper.tex

diff --git a/IJHPCN/paper.tex b/IJHPCN/paper.tex
index 7b78160..2e4cfb6 100644
--- a/IJHPCN/paper.tex
+++ b/IJHPCN/paper.tex
@@ -49,9 +49,7 @@
 \makeatletter
 \def\theequation{\arabic{equation}}
 
-%\JOURNALNAME{\TEN{\it Int. J. System Control and Information
-%Processing,
-%Vol. \theVOL, No. \theISSUE, \thePUBYEAR\hfill\thepage}}%
+\JOURNALNAME{\TEN{\it International Journal of High Performance Computing and Networking}}
 %
 %\def\BottomCatch{%
 %\vskip -10pt
@@ -73,10 +71,9 @@
 
 \setcounter{page}{1}
 
-\LRH{F. Wang et~al.}
+\LRH{R. Couturier, L. Ziane Khodja and C. Guyeux}
 
-\RRH{Metadata Based Management and Sharing of Distributed Biomedical
-Data}
+\RRH{TSIRM: A Two-Stage Iteration with least-squares Residual Minimization algorithm}
 
 \VOL{x}
 
@@ -86,7 +83,7 @@ Data}
 
 \BottomCatch
 
-\PUBYEAR{2012}
+\PUBYEAR{2015}
 
 \subtitle{}
 
@@ -109,19 +106,25 @@ Data}
 
 
 \begin{abstract}
-In this article, a two-stage iterative algorithm is proposed to improve the
+In this paper, a two-stage iterative algorithm is proposed to improve the
 convergence of Krylov based iterative methods, typically those of GMRES
-variants. The principle of the proposed approach is to build an external
-iteration over the Krylov method, and to frequently store its current residual
+variants. The principle of the proposed approach is to build an external
+iteration over the Krylov method, and to frequently store its current residual
 (at each GMRES restart for instance). After a given number of outer iterations,
 a least-squares minimization step is applied on the matrix composed by the saved
-residuals, in order to compute a better solution and to make new iterations if
-required. It is proven that the proposal has the same convergence properties
-than the inner embedded method itself. Experiments using up to 16,394 cores
-also show that the proposed algorithm runs around 5 or 7 times faster than
-GMRES.
+residuals, in order to compute a better solution and to make new iterations if
+required. It is proven that the proposal has the same convergence properties
+than the inner embedded method itself.
+%%NEW
+Several experiments have been performed
+with the PETSc solver with linear and nonlinear problems. They show good
+speedups compared to GMRES with up to 16,394 cores with different
+preconditioners.
+%%ENDNEW
 \end{abstract}
+
+
 
 \KEYWORD{Iterative Krylov methods; sparse linear and non linear systems; two stage iteration; least-squares residual minimization; PETSc.}
 
 %\REF{to this paper should be made as follows: Rodr\'{\i}guez
@@ -131,28 +134,11 @@ GMRES.
 %Semantics and Ontologies}, Vol. x, No. x, pp.xxx\textendash xxx.}
 
 \begin{bio}
-Manuel Pedro Rodr\'iguez Bol\'ivar received his PhD in Accounting at
-the University of Granada. He is a Lecturer at the Department of
-Accounting and Finance, University of Granada. His research
-interests include issues related to conceptual frameworks of
-accounting, diffusion of financial information on Internet, Balanced
-Scorecard applications and environmental accounting. He is author of
-a great deal of research studies published at national and
-international journals, conference proceedings as well as book
-chapters, one of which has been edited by Kluwer Academic
-Publishers.\vs{9}
-
-\noindent Bel\'en Sen\'es Garc\'ia received her PhD in Accounting at
-the University of Granada. She is a Lecturer at the Department of
-Accounting and Finance, University of Granada. Her research
-interests are related to cultural, institutional and historic
-accounting and in environmental accounting. She has published
-research papers at national and international journals, conference
-proceedings as well as chapters of books.\vs{8}
-
-\noindent Both authors have published a book about environmental
-accounting edited by the Institute of Accounting and Auditing,
-Ministry of Economic Affairs, in Spain in October 2003.
+Raphaël Couturier ....
+
+\noindent Lilia Ziane Khodja ...
+
+\noindent Christophe Guyeux ...
 \end{bio}
 
 
@@ -808,16 +794,16 @@ Concerning the experiments some other remarks are interesting.
 %%NEW
 \begin{table*}[htbp]
 \begin{center}
-\begin{tabular}{|r|r|r|r|r|r|r|}
+\begin{tabular}{|r|r|r|r|r|r|r|r|}
 \hline
 
- nb. cores & \multicolumn{2}{c|}{FGMRES/ASM} & \multicolumn{2}{c|}{TSIRM CGLS/ASM} & \multicolumn{2}{c|}{FGMRES/HYPRE} \\
-\cline{2-7}
- & Time & \# Iter. & Time & \# Iter. & Time & \# Iter. \\\hline \hline
- 512 & 5.54 & 685 & 2.5 & 570 & 128.9 & 9 \\
- 2048 & 14.95 & 1,560 & 4.32 & 746 & 335.7 & 9 \\
- 4096 & 25.13 & 2,369 & 5.61 & 859 & >1000 & -- \\
- 8192 & 44.35 & 3,197 & 7.6 & 1083 & >1000 & -- \\
+ nb. cores & \multicolumn{2}{c|}{FGMRES/ASM} & \multicolumn{2}{c|}{TSIRM CGLS/ASM} & gain& \multicolumn{2}{c|}{FGMRES/HYPRE} \\
+\cline{2-5} \cline{7-8}
+ & Time & \# Iter. & Time & \# Iter. & & Time & \# Iter. \\\hline \hline
+ 512 & 5.54 & 685 & 2.5 & 570 & 2.21 & 128.9 & 9 \\
+ 2048 & 14.95 & 1,560 & 4.32 & 746 & 3.48 & 335.7 & 9 \\
+ 4096 & 25.13 & 2,369 & 5.61 & 859 & 4.48 & >1000 & -- \\
+ 8192 & 44.35 & 3,197 & 7.6 & 1083 & 5.84 & >1000 & -- \\
 
 \hline
 
@@ -835,6 +821,51 @@ Concerning the experiments some other remarks are interesting.
 \label{fig:03}
 \end{figure}
 
+
+
+\begin{table*}[htbp]
+\begin{center}
+\begin{tabular}{|r|r|r|r|r|r|}
+\hline
+
+ nb. cores & \multicolumn{2}{c|}{FGMRES/BJAC} & \multicolumn{2}{c|}{TSIRM CGLS/BJAC} & gain \\
+\cline{2-5}
+ & Time & \# Iter. & Time & \# Iter. & \\\hline \hline
+ 1024 & 667.92 & 48,732 & 81.65 & 5,087 & 8.18 \\
+ 2048 & 966.87 & 77,177 & 90.34 & 5,716 & 10.70\\
+ 4096 & 1,742.31 & 124,411 & 119.21 & 6,905 & 14.61\\
+ 8192 & 2,739.21 & 187,626 & 168.9 & 9,000 & 16.22\\
+
+\hline
+
+\end{tabular}
+\caption{Comparison of FGMRES and TSIRM for ex20 of PETSc/SNES with a Block Jacobi preconditioner having 100,000 components per core on Curie ($\epsilon_{tsirm}=1e-10$, $max\_iter_{kryl}=30$, $s=12$, $max\_iter_{ls}=15$, $\epsilon_{ls}=1e-40$), time is expressed in seconds.}
+\label{tab:07}
+\end{center}
+\end{table*}
+
+\begin{table*}[htbp]
+\begin{center}
+\begin{tabular}{|r|r|r|r|r|r|}
+\hline
+
+ nb. cores & \multicolumn{2}{c|}{FGMRES/BJAC} & \multicolumn{2}{c|}{TSIRM CGLS/BJAC} & gain \\
+\cline{2-5}
+ & Time & \# Iter. & Time & \# Iter. & \\\hline \hline
+ 1024 & 159.52 & 11,584 & 26.34 & 1,563 & 6.06 \\
+ 2048 & 226.24 & 16,459 & 37.23 & 2,248 & 6.08\\
+ 4096 & 391.21 & 27,794 & 50.93 & 2,911 & 7.69\\
+ 8192 & 543.23 & 37,770 & 79.21 & 4,324 & 6.86 \\
+
+\hline
+
+\end{tabular}
+\caption{Comparison of FGMRES and TSIRM for ex14 of PETSc/SNES with a Block Jacobi preconditioner having 100,000 components per core on Curie ($\epsilon_{tsirm}=1e-10$, $max\_iter_{kryl}=30$, $s=12$, $max\_iter_{ls}=15$, $\epsilon_{ls}=1e-40$), time is expressed in seconds.}
+\label{tab:08}
+\end{center}
+\end{table*}
+
+
 %%ENDNEW
 
 %%%*********************************************************