executions on large scale supercomputers~\cite{couturier15}.
-\subsection{Comparing GMRES in native synchronous mode and the multisplitting algorithm in asynchronous mode}
+\subsection{Comparison between synchronous GMRES and asynchronous two-stage multisplitting algorithms}
The previous paragraphs highlighted the benefits of simulating the behavior
of the application before any deployment in a real environment. In this
theoretically reduce the overall execution time and can improve the algorithm
performance.
-In this section, the Simgrid simulator is used to compare the behavior of the
-multisplitting in asynchronous mode with GMRES in synchronous mode. Several
-benchmarks have been performed with various combination of the grid resources
-(CPU, Network, input matrix size, \ldots ). The test conditions are summarized
-in Table~\ref{tab:07}. In order to compare the execution times, this table
+In this section, the SimGrid simulator is used to compare the behavior of the
+two-stage algorithm in asynchronous mode with GMRES in synchronous mode. Several
+benchmarks have been performed with various combinations of the grid resources
+(CPU, Network, matrix size, \ldots). The test conditions are summarized
+in Table~\ref{tab:07}. In order to compare the execution times, this table
reports the relative gain between both algorithms. It is defined by the ratio
between the execution time of GMRES and the execution time of the
-multisplitting. The ratio is greater than one because the asynchronous
+multisplitting.
+\LZK{Which table reports the relative gains?? Surely not Table II!!}
+The ratio is greater than one because the asynchronous
multisplitting version is faster than GMRES.
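+As a sketch of this definition, writing $T_{\mathrm{GMRES}}$ and
+$T_{\mathrm{MS}}$ for the execution times of GMRES and of the multisplitting
+algorithm (notations introduced here for illustration only, not taken from the
+tables), the relative gain reads
+\[
+  \mathrm{gain} = \frac{T_{\mathrm{GMRES}}}{T_{\mathrm{MS}}},
+\]
+so that $\mathrm{gain} > 1$ holds exactly when $T_{\mathrm{MS}} < T_{\mathrm{GMRES}}$.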
-
-
-\begin{table} [htbp]
+\begin{table}[htbp]
\centering
-\begin{tabular}{r c }
+\begin{tabular}{ll}
\hline
- Grid Architecture & 2 $\times$ 50 totaling 100 processors\\ %\hline
- Processors Power & 1 GFlops to 1.5 GFlops\\
- Intra-Network & bw=1.25 Gbits - lat=5.10$^{-5}$ \\ %\hline
- Inter-Network & bw=5 Mbits - lat=2.10$^{-2}$\\
- Input matrix size & $N_{x}$ = From 62 to 150\\ %\hline
- Residual error precision & 10$^{-5}$ to 10$^{-9}$\\ \hline \\
+ Grid architecture & 2$\times$50 totaling 100 processors\\
+ Processors Power & 1 GFlops to 1.5 GFlops \\
+ \multirow{2}{*}{Network inter-clusters} & $bw$=1.25 Gbits, $lat=50\mu$s \\
+ & $bw$=5 Mbits, $lat=20$ms\\
+ Matrix size & from $62^3$ to $150^3$\\
+ Residual error precision & $10^{-5}$ to $10^{-9}$\\ \hline \\
\end{tabular}
-\caption{Test conditions: GMRES in synchronous mode vs Krylov Multisplitting in asynchronous mode}
+\caption{Test conditions: GMRES in synchronous mode vs. Krylov two-stage in asynchronous mode}
\label{tab:07}
\end{table}
-Again, comprehensive and extensive tests have been conducted with different
-parameters as the CPU power, the network parameters (bandwidth and latency)
-and with different problem size. The relative gains greater than $1$ between the
-two algorithms have been captured after each step of the test. In
-Table~\ref{tab:08} are reported the best grid configurations allowing
-the multisplitting method to be more than $2.5$ times faster than the
-classical GMRES. These experiments also show the relative tolerance of the
-multisplitting algorithm when using a low speed network as usually observed with
-geographically distant clusters through the internet.
% use the same column width for the following three tables
\newlength{\mytablew}\settowidth{\mytablew}{\footnotesize\np{E-11}}
\hline
\end{mytable}
%\end{table}
- \caption{Relative gain of the multisplitting algorithm compared with the classical GMRES}
+ \caption{Relative gains of the two-stage multisplitting algorithm compared with the classical GMRES}
\label{tab:08}
\end{table}
+Again, extensive tests have been conducted with different parameters, such as
+the CPU power and the network parameters (bandwidth and latency), and with
+different problem sizes. The relative gains greater than $1$ between the
+two algorithms have been recorded after each step of the test.
+Table~\ref{tab:08} reports the best grid configurations, which allow
+the two-stage multisplitting algorithm to be more than $2.5$ times faster than the
+classical GMRES. These experiments also show the relative tolerance of the
+multisplitting algorithm to a low-speed network, as is usually observed with
+geographically distant clusters connected through the Internet.
+
\section{Conclusion}