X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/rce2015.git/blobdiff_plain/ac4df02370b008c7b6b76f2127b30eb5e745f842..f191ea095298626cf15138076b2e26ee4dec9b15:/paper.tex?ds=inline
diff --git a/paper.tex b/paper.tex
index 1fdc62b..24ddab9 100644
--- a/paper.tex
+++ b/paper.tex
@@ -549,9 +549,8 @@ Grid architecture & 2$\times$16, 4$\times$8, 4$\times$16 a
 \end{center}
 \end{table}
-\subsubsection{Simulations for various grid architectures and scaling-up matrix sizes}
-\ \\
-% environment
+\subsubsection{Simulations for various grid architectures and scaling-up matrix sizes\\}
+
 In this section, we analyze the simulations conducted on various grid configurations and for different sizes of the 3D Poisson problem. The parameters of the network between clusters is fixed to $N2$ (see
@@ -570,22 +569,33 @@ The execution times between both algorithms is significant with different grid a
 \includegraphics[width=100mm]{cluster_x_nodes_nx_150_and_nx_170.pdf}
 \end{center}
 \caption{Various grid configurations with the matrix sizes 150$^3$ and 170$^3$}
+\LZK{CE, the legend of Figure 3 is too wide. Replace the N$_x\times$N$_y\times$N$_z$ with $Mat1$=150$^3$ and $Mat2$=170$^3$ as in Table 1}
 \label{fig:01}
 \end{figure}
-\subsubsection{Simulations for two different inter-clusters network speeds \\}
-
-In this section, the experiments compare the behavior of the algorithms running on a
-speeder inter-cluster network (N2) and also on a less performant network (N1) respectively defined in the test conditions Table~\ref{tab:02}.
-%\RC{This needs to be defined beforehand...}
-Figure~\ref{fig:02} shows that end users will reduce the execution time
-for both algorithms when using a grid architecture like 4 $\times$ 16 or 8 $\times$ 8: the reduction factor is around $2$. The results depict also that when
-the network speed drops down (variation of 12.5\%), the difference between the two Multisplitting algorithms execution times can reach more than 25\%.
+\subsubsection{Simulations for two different inter-clusters network speeds\\}
+In Figure~\ref{fig:02} we present the execution times of both algorithms to
+solve a 3D Poisson problem of size $150^3$ on two different simulated network
+$N1$ and $N2$ (see Table~\ref{tab:01}). As previously mentioned, we can see from
+this figure that the Krylov two-stage algorithm is sensitive to the number of
+clusters (i.e. it is better to have a small number of clusters). However, we can
+notice an interesting behavior of the Krylov two-stage algorithm. It is less
+sensitive to bad network bandwidth and latency for the inter-clusters links than
+the GMRES algorithms. This means that the multisplitting methods are more
+efficient for distributed systems with high latency networks.
+
+%% In this section, the experiments compare the behavior of the algorithms running on a
+%% speeder inter-cluster network (N2) and also on a less performant network (N1) respectively defined in the test conditions Table~\ref{tab:02}.
+%% %\RC{This needs to be defined beforehand...}
+%% Figure~\ref{fig:02} shows that end users will reduce the execution time
+%% for both algorithms when using a grid architecture like 4 $\times$ 16 or 8 $\times$ 8: the reduction factor is around $2$. The results depict also that when
+%% the network speed drops down (variation of 12.5\%), the difference between the two Multisplitting algorithms execution times can reach more than 25\%.
 \begin{figure}[t]
 \centering
 \includegraphics[width=100mm]{cluster_x_nodes_n1_x_n2.pdf}
 \caption{Various grid configurations with networks $N1$ vs. $N2$}
+\LZK{CE, replace the decimal ``,'' with a ``.''}
 \label{fig:02}
 \end{figure}
@@ -610,8 +620,21 @@ the network speed drops down (variation of 12.5\%), the difference between t
-\subsubsection{Network latency impacts on performance}
-\ \\
+
+
+
+
+
+
+
+
+
+
+
+
+
+\subsubsection{Network latency impacts on performance\\}
+
 \begin{table} [ht!]
 \centering
 \begin{tabular}{r c }
@@ -639,11 +662,11 @@ network latency from $8.10^{-6}$ to $6.10^{-5}$ implies an absolute time
 increase of more than $75\%$ (resp. $82\%$) of the execution for the classical GMRES (resp. Krylov multisplitting) algorithm. The execution time factor between the two algorithms varies from 2.2 to 1.5 times with a network latency
-decreasing from $8.10^{-6}$ to $6.10^{-5}$.
+decreasing from $8.10^{-6}$ to $6.10^{-5}$ second.
-\subsubsection{Network bandwidth impacts on performance}
-\ \\
+\subsubsection{Network bandwidth impacts on performance\\}
+
 \begin{table} [ht!]
 \centering
 \begin{tabular}{r c }
@@ -675,8 +698,8 @@ Figure~\ref{fig:04}). However, in this case, the Krylov multisplitting method
 presents a better performance in the considered bandwidth interval with a gain of $40\%$ which is only around $24\%$ for the classical GMRES.
-\subsubsection{Input matrix size impacts on performance}
-\ \\
+\subsubsection{Input matrix size impacts on performance\\}
+
 \begin{table} [ht!]
 \centering
 \begin{tabular}{r c }
@@ -712,7 +735,8 @@ targeted environment for the application deployment when focusing on the problem
 size scale up. It should be noticed that the same test has been done with the grid 4 $\times$ 8 leading to the same conclusion.
-\subsubsection{CPU Power impacts on performance}
+\subsubsection{CPU Power impacts on performance\\}
+
 \begin{table} [htbp]
 \centering