From: RCE
Date: Tue, 28 Apr 2015 11:36:24 +0000 (+0200)
Subject: RCE : Review and corrections
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/rce2015.git/commitdiff_plain/d67fae5d5ed04beaa52fa85856990fe1640cbf75?ds=sidebyside;hp=--cc

RCE : Review and corrections
---

d67fae5d5ed04beaa52fa85856990fe1640cbf75
diff --git a/paper.tex b/paper.tex
index 94bdb91..ce1305d 100644
--- a/paper.tex
+++ b/paper.tex
@@ -342,7 +342,7 @@ architecture scaling up the input matrix size}
 \begin{tabular}{r c }
 \hline
 Grid & 2x16, 4x8, 4x16 and 8x8\\ %\hline
- Network & N2 : bw=1Gbits/s - lat=\np{5E-5} \\ %\hline
+ Network & N2 : bw=1Gbits/s - lat=5.10$^{-5}$ \\ %\hline
 Input matrix size & N$_{x}$ x N$_{y}$ x N$_{z}$ =150 x 150 x 150\\ %\hline
 - & N$_{x}$ x N$_{y}$ x N$_{z}$ =170 x 170 x 170 \\ \hline
 \end{tabular}
@@ -363,7 +363,7 @@ the case for the multisplitting method.
 \begin{figure} [ht!]
 \centering
 \includegraphics[width=100mm]{cluster_x_nodes_nx_150_and_nx_170.pdf}
-\caption{Cluster x Nodes NX=150 and NX=170}
+\caption{Cluster x Nodes N$_{x}$=150 and N$_{x}$=170}
 %\label{overflow}}
 \end{figure}
 %\end{wrapfigure}
@@ -383,9 +383,9 @@ matrix size.
 \begin{tabular}{r c }
 \hline
 Grid & 2x16, 4x8\\ %\hline
- Network & N1 : bw=10Gbs-lat=8E-06 \\ %\hline
- - & N2 : bw=1Gbs-lat=5E-05 \\
- Input matrix size & N$_{x}$ =150 x 150 x 150\\ \hline \\
+ Network & N1 : bw=10Gbits/s - lat=8.10$^{-6}$ \\ %\hline
+ - & N2 : bw=1Gbits/s - lat=5.10$^{-5}$ \\
+ Input matrix size & N$_{x}$ x N$_{y}$ x N$_{z}$ =150 x 150 x 150\\ \hline \\
 \end{tabular}

 Table 2 : Clusters x Nodes - Networks N1 x N2 \\
@@ -403,8 +403,8 @@ Table 2 : Clusters x Nodes - Networks N1 x N2 \\

 %\end{wrapfigure}
 The experiments compare the behavior of the algorithms running first on
-speed inter- cluster network (N1) and a less performant network (N2).
-The figure 2 shows that end users will gain to reduce the execution time
+a high-speed inter-cluster network (N1) and then on a less performant network (N2).
+Figure 4 shows that end users can reduce the execution time
 for both algorithms by using a grid architecture like 4x16 or 8x8: the
 performance is increased by a factor of 2. The results also show that
 when the network speed drops, the difference between the execution
@@ -418,9 +418,8 @@ times can reach more than 25\%.
 \hline
 Grid & 2x16\\ %\hline
 Network & N1 : bw=1Gbits/s \\ %\hline
- Input matrix size & N$_{x}$ =150 x 150 x 150\\ \hline\\
+ Input matrix size & N$_{x}$ x N$_{y}$ x N$_{z}$ =150 x 150 x 150\\ \hline\\
 \end{tabular}
-
 Table 3 : Network latency impact \\

 \end{footnotesize}
@@ -435,12 +434,12 @@ Table 3 : Network latency impact \\

 \end{figure}
-According the results in table and figure 3, degradation of the network
+According to the results in Table 3 and Figure 5, degradation of the network
 latency from 8.10$^{-6}$ to 6.10$^{-5}$ implies an increase of the execution
 time of more than 75\% (resp. 82\%) for the classical GMRES
 (resp. multisplitting) algorithm. In addition, it appears that the
 multisplitting method better tolerates the network latency variation, with
-a less rate increase. Consequently, in the worst case (lat=6.10$^{-5
+a smaller increase of the execution time. Consequently, in the worst case (lat=6.10$^{-5
 }$), the execution time for GMRES is almost double that of
 the multisplitting method, even though the performance was of the same order
 of magnitude with a latency of 8.10$^{-6}$.
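
As a rough sanity check of the factor-of-2 gain discussed above: the grid architectures of Table 1 use either 32 processes (2x16, 4x8) or 64 processes (4x16, 8x8). The minimal Python sketch below, assuming a uniform partition of the N_x x N_y x N_z domain among all processes (an assumption made here for illustration only, not taken from the paper), shows that the 64-process grids halve the per-process subproblem, which is consistent with the reported performance doubling; the measured gain also depends on the N1/N2 network parameters discussed in the text.

    # Illustration only (not from the paper): unknowns handled per process for
    # the grid architectures of Table 1, assuming a uniform partition of the
    # N_x x N_y x N_z domain among all processes of the "cluster x nodes" grid.
    grids = {"2x16": 2 * 16, "4x8": 4 * 8, "4x16": 4 * 16, "8x8": 8 * 8}
    problem_sizes = [150 ** 3, 170 ** 3]  # 150 x 150 x 150 and 170 x 170 x 170

    for n in problem_sizes:
        for name, procs in grids.items():
            print(f"N = {n}  grid {name} ({procs} processes): "
                  f"~{round(n / procs)} unknowns per process")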