\includegraphics[width=100mm]{cluster_x_nodes_n1_x_n2.pdf}
\caption{Various grid configurations with networks $N1$ vs. $N2$}
\LZK{CE, replace the decimal ``,'' separators with ``.''}
+\RCE{ok}
\label{fig:02}
\end{figure}
executions on large scale supercomputers~\cite{couturier15}.
-\subsection{Comparing GMRES in native synchronous mode and the multisplitting algorithm in asynchronous mode}
+\subsection{Comparison between synchronous GMRES and asynchronous two-stage multisplitting algorithms}
The previous paragraphs highlight the benefit of simulating the behavior
of the application before any deployment in a real environment. In this
theoretically reduce the overall execution time and can improve the algorithm
performance.
-In this section, the Simgrid simulator is used to compare the behavior of the
-multisplitting in asynchronous mode with GMRES in synchronous mode. Several
-benchmarks have been performed with various combination of the grid resources
-(CPU, Network, input matrix size, \ldots ). The test conditions are summarized
-in Table~\ref{tab:07}. In order to compare the execution times, this table
+In this section, the SimGrid simulator is used to compare the behavior of the
+two-stage algorithm in asynchronous mode with GMRES in synchronous mode. Several
+benchmarks have been performed with various combinations of the grid resources
+(CPU, Network, matrix size, \ldots). The test conditions are summarized
+in Table~\ref{tab:02}. In order to compare the execution times, Table~\ref{tab:03}
reports the relative gain between both algorithms. It is defined by the ratio
between the execution time of GMRES and the execution time of the
-multisplitting. The ratio is greater than one because the asynchronous
+multisplitting.
+\LZK{Which table reports the relative gains?? Surely not Table II!!}
+\RCE{Table III with the new numbering}
+The ratio is greater than one because the asynchronous
multisplitting version is faster than GMRES.
-
-
-\begin{table} [htbp]
+\begin{table}[htbp]
\centering
-\begin{tabular}{r c }
+\begin{tabular}{ll}
\hline
- Grid Architecture & 2 $\times$ 50 totaling 100 processors\\ %\hline
- Processors Power & 1 GFlops to 1.5 GFlops\\
- Intra-Network & bw=1.25 Gbits - lat=5.10$^{-5}$ \\ %\hline
- Inter-Network & bw=5 Mbits - lat=2.10$^{-2}$\\
- Input matrix size & $N_{x}$ = From 62 to 150\\ %\hline
- Residual error precision & 10$^{-5}$ to 10$^{-9}$\\ \hline \\
+ Grid architecture & 2$\times$50 totaling 100 processors\\
+ Processors Power & 1 GFlops to 1.5 GFlops \\
+ \multirow{2}{*}{Network inter-clusters} & $bw$=1.25 Gbits, $lat=50\mu$s \\
+ & $bw$=5 Mbits, $lat=20$ms\\
+ Matrix size & from $62^3$ to $150^3$\\
+ Residual error precision & $10^{-5}$ to $10^{-9}$\\ \hline \\
\end{tabular}
-\caption{Test conditions: GMRES in synchronous mode vs Krylov Multisplitting in asynchronous mode}
-\label{tab:07}
+\caption{Test conditions: GMRES in synchronous mode vs. Krylov two-stage in asynchronous mode}
+\label{tab:02}
\end{table}
-Again, comprehensive and extensive tests have been conducted with different
-parameters as the CPU power, the network parameters (bandwidth and latency)
-and with different problem size. The relative gains greater than $1$ between the
-two algorithms have been captured after each step of the test. In
-Table~\ref{tab:08} are reported the best grid configurations allowing
-the multisplitting method to be more than $2.5$ times faster than the
-classical GMRES. These experiments also show the relative tolerance of the
-multisplitting algorithm when using a low speed network as usually observed with
-geographically distant clusters through the internet.
% use the same column width for the following three tables
\newlength{\mytablew}\settowidth{\mytablew}{\footnotesize\np{E-11}}
\hline
\end{mytable}
%\end{table}
- \caption{Relative gain of the multisplitting algorithm compared with the classical GMRES}
- \label{tab:08}
+ \caption{Relative gains of the two-stage multisplitting algorithm compared with the classical GMRES}
+ \label{tab:03}
\end{table}
+Again, comprehensive and extensive tests have been conducted with different
+parameters such as the CPU power, the network parameters (bandwidth and latency)
+and different problem sizes. The relative gains greater than $1$ between the
+two algorithms have been captured after each step of the test.
+Table~\ref{tab:03} reports the best grid configurations allowing
+the two-stage multisplitting algorithm to be more than $2.5$ times faster than the
+classical GMRES. These experiments also show the relative tolerance of the
+multisplitting algorithm to a low-speed network, as usually observed with
+geographically distant clusters connected through the Internet.
-\section{Conclusion}
+\section{Conclusion}
In this paper we have presented the simulation of the execution of three
-different parallel solvers on some multi-core architectures. We have show that
+different parallel solvers on some multi-core architectures. We have shown that
the SimGrid toolkit is an interesting simulation tool that has allowed us to
determine which method to choose given a specified multi-core architecture.
Moreover the simulated results are in accordance (i.e. with the same order of
In future work, we plan to investigate how to simulate the behavior of very
large-scale applications. For example, if we are interested in simulating the
execution of the solvers of this paper with thousands or even tens of thousands
-or core, it is not possible to do that with SimGrid. In fact, this tool will
+of cores, it is not possible to do that with SimGrid. In fact, this tool will
perform the real computation. So we plan to focus our research on that problem.