new

[GMRES2stage.git] / paper.tex
diff --git a/paper.tex b/paper.tex

index 51eab5cb9fd26068dcd36d1868bf6a73a32f92f3..e23ccc2b938efd17c08059f6e03e03a367493ba0 100644 (file)
--- a/paper.tex
+++ b/paper.tex
@@ -374,6 +374,9 @@
  %\title{???}
  
  
+
+
+
  % author names and affiliations
  % use a multiple column layout for up to two different
  % affiliations
@@ -543,12 +546,14 @@ Iterative Krylov methods; sparse linear systems; error minimization; PETSc; %à
  % no \IEEEPARstart
  % You must have at least 2 lines in the paragraph with the drop letter
  % (should never be an issue)
-Iterative  methods are become  more attractive  than direct  ones to  solve very
-large sparse linear  systems. They are more effective in  a parallel context and
-require less memory  and arithmetic operations than direct  methods. A number of
-iterative methods are proposed and adapted by many researchers and the increased
-need for solving  very large sparse linear systems  triggered the development of
-efficient iterative techniques suitable for the parallel processing.
+
+Iterative methods  became more attractive than  direct ones to  solve very large
+sparse  linear systems.  Iterative  methods  are more  effecient  in a  parallel
+context,  with  thousands  of  cores,  and  require  less  memory  and  arithmetic
+operations than direct  methods. A number of iterative  methods are proposed and
+adapted by many researchers and the increased need for solving very large sparse
+linear  systems  triggered the  development  of  efficient iterative  techniques
+suitable for the parallel processing.
  
  Most of the successful iterative methods currently available are based on Krylov
  subspaces which  consist in forming a  basis of a sequence  of successive matrix
@@ -570,15 +575,18 @@ large clusters.
  In this  paper we propose a  two-stage algorithm based on  two nested iterations
  called inner-outer  iterations.  This algorithm  consists in solving  the sparse
  linear system iteratively  with a small number of  inner iterations and restarts
-the outer step with a new solution minimizing some error functions over a Krylov
-subspace. This algorithm is iterative  and easy to parallelize on large clusters
-and the minimization technique improves its convergence and performances.
+the outer  step with a  new solution minimizing  some error functions  over some
+previous residuals. This algorithm is iterative and easy to parallelize on large
+clusters   and  the   minimization  technique   improves  its   convergence  and
+performances.
  
  The present paper is organized  as follows. In Section~\ref{sec:02} some related
-works are presented. Section~\ref{sec:03} presents our two-stage algorithm based
-on   Krylov  subspace   iteration  methods.   Section~\ref{sec:04}   shows  some
+works are presented. Section~\ref{sec:03} presents our two-stage algorithm using
+a  least-square  residual  minimization.   Section~\ref{sec:04}  describes  some
+convergence   results   on  this   method.    Section~\ref{sec:05}  shows   some
  experimental results obtained on large  clusters of our algorithm using routines
-of PETSc toolkit.
+of  PETSc  toolkit.  Finally   Section~\ref{sec:06}  concludes  and  gives  some
+perspectives.
  %%%*********************************************************
  %%%*********************************************************
  
@@ -661,12 +669,13 @@ reused with the new values of the residuals.
  %%%*********************************************************
  %%%*********************************************************
  
-
+\section{Convergence results}
+\label{sec:04}
  
  %%%*********************************************************
  %%%*********************************************************
  \section{Experiments using petsc}
-\label{sec:04}
+\label{sec:05}
  
  
  In order to see the influence of our algorithm with only one processor, we first
@@ -675,7 +684,7 @@ table~\ref{tab:01},  we  show  the  matrices  we  have used  and  some  of  them
  characteristics. For all  the matrices, the name, the field,  the number of rows
  and the number of nonzero elements is given.
  
-\begin{table}
+\begin{table*}
  \begin{center}
  \begin{tabular}{|c|c|r|r|r|} 
  \hline
@@ -692,7 +701,7 @@ torso3             & 2D/3D problem & 259,156 & 4,429,042 \\
  \caption{Main characteristics of the sparse matrices chosen from the Davis collection}
  \label{tab:01}
  \end{center}
-\end{table}
+\end{table*}
  
  The following  parameters have been chosen  for our experiments.   As by default
  the restart  of GMRES is performed every  30 iterations, we have  chosen to stop
@@ -771,7 +780,28 @@ Larger experiments ....
  \end{table*}
  
  
+\begin{table*}
+\begin{center}
+\begin{tabular}{|r|r|r|r|r|r|r|r|r|} 
+\hline
+
+  nb. cores & threshold   & \multicolumn{2}{c|}{gmres variant} & \multicolumn{2}{c|}{2 stage CGLS} &  \multicolumn{2}{c|}{2 stage LSQR} & best gain \\ 
+\cline{3-8}
+             &                       & Time  & \# Iter.  & Time  & \# Iter. & Time  & \# Iter. & \\\hline \hline
+  2,048      & 8e-5                  & 108.88 & 16,560  & 23.06  &  3,630  & 22.79  & 3,630   & 4.77 \\
+  2,048      & 6e-5                  & 194.01 & 30,270  & 35.50  &  5,430  & 27.74  & 4,350   & 6.99 \\
+  4,096      & 7e-5                  & 160.59 & 22,530  & 35.15  &  5,130  & 29.21  & 4,350   & 5.49 \\
+  4,096      & 6e-5                  & 249.27 & 35,520  & 52.13  &  7,950  & 39.24  & 5,790   & 6.35 \\
+  8,192      & 6e-5                  & 149.54 & 17,280  & 28.68  &  3,810  & 29.05  & 3,990  & 5.21 \\
+  8,192      & 5e-5                  & 792.11 & 109,590 & 76.83  &  10,470  & 65.20  & 9,030  & 12.14 \\
+  16,384     & 4e-5                  & 718.61 & 86,400 & 98.98  &  10,830  & 131.86  & 14,790  & 7.26 \\
+\hline
  
+\end{tabular}
+\caption{Comparison of FGMRES  and 2 stage FGMRES algorithms for ex54 of Petsc (both with the MG preconditioner) with 25000 components per core on Curie (restart=30, s=12),  time is expressed in seconds.}
+\label{tab:04}
+\end{center}
+\end{table*}
  %%%*********************************************************
  %%%*********************************************************
  
@@ -780,7 +810,7 @@ Larger experiments ....
  %%%*********************************************************
  %%%*********************************************************
  \section{Conclusion}
-\label{sec:05}
+\label{sec:06}
  %The conclusion goes here. this is more of the conclusion
  %%%*********************************************************
  %%%*********************************************************
@@ -789,6 +819,7 @@ Larger experiments ....
  future plan : \\
  - study other kinds of matrices, problems, inner solvers\\
  - adaptative number of outer iterations to minimize\\
+- other methods to minimize the residuals?\\
  - implement our solver inside PETSc