chosen because they are scalable with many cores. We have tested other problems,
but they are not scalable with many cores.
+In the following, larger experiments are described on two large-scale architectures: Curie and Juqueen... {\bf description...}\\
+{\bf Description of preconditioners}
\begin{table*}
\begin{center}
\begin{tabular}{|r|r|r|r|r|r|r|r|r|}
\hline
- nb. cores & precond & \multicolumn{2}{c|}{GMRES} & \multicolumn{2}{c|}{TSARM CGLS} & \multicolumn{2}{c|}{TSARM LSQR} & best gain \\
+ nb. cores & precond & \multicolumn{2}{c|}{FGMRES} & \multicolumn{2}{c|}{TSARM CGLS} & \multicolumn{2}{c|}{TSARM LSQR} & best gain \\
\cline{3-8}
& & Time & \# Iter. & Time & \# Iter. & Time & \# Iter. & \\\hline \hline
2,048 & mg & 403.49 & 18,210 & 73.89 & 3,060 & 77.84 & 3,270 & 5.46 \\
\hline
\end{tabular}
-\caption{Comparison of FGMRES and 2 stage FGMRES algorithms for ex15 of Petsc with 25000 components per core on Juqueen (threshold 1e-3, restart=30, s=12), time is expressed in seconds.}
+\caption{Comparison of FGMRES and TSARM with FGMRES for example ex15 of PETSc with two preconditioners (mg and sor) and 25,000 components per core on Juqueen (threshold 1e-3, restart=30, s=12). Time is expressed in seconds.}
\label{tab:03}
\end{center}
\end{table*}
+Table~\ref{tab:03} shows the execution times and the number of iterations of
+example ex15 of PETSc on the Juqueen architecture. Different numbers of cores
+are studied, ranging from 2,048 up to 16,383. Two preconditioners have been
+tested. For those experiments, the number of components (or unknowns of the
+problem) per processor is fixed to 25,000. This number can seem relatively
+small. In fact, for some applications that need a lot of memory, the number of
+components per processor sometimes needs to be small.
+
+In Table~\ref{tab:03}, we can notice that TSARM is always faster than FGMRES. The
+last column shows the ratio between the execution time of FGMRES and that of the
+best version of TSARM, depending on the minimization procedure used: CGLS or LSQR.
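+
+As an illustration, and assuming that the gain in the last column is computed
+from the execution times (not from the iteration counts), the row with 2,048
+cores and the mg preconditioner gives
+\[
+\frac{403.49}{\min(73.89,\ 77.84)} = \frac{403.49}{73.89} \approx 5.46,
+\]
+which matches the value reported in the table. For this row, the best version
+of TSARM is the one using CGLS.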
+
\begin{figure}
\centering