From: Arnaud Giersch
Date: Thu, 29 Nov 2012 17:31:45 +0000 (+0100)
Subject: wip
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/commitdiff_plain/faef9adf00e740ce6fa28da521a29006b1cf6ac5

wip
---

diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex
index 3a1ec31..3010b1b 100644
--- a/supercomp11/supercomp11.tex
+++ b/supercomp11/supercomp11.tex
@@ -468,28 +468,87 @@ available at
 In order to assess the performances of our algorithms, we ran our
 simulator with various parameters, and extracted several metrics, that
-we will describe in this section. Overall, the experiments represent
-more than 240 hours of computing time.
+we will describe in this section.
 \paragraph{Load balancing strategies}
-We ran the experiments with the \emph{Best effort}, and with the \emph{Makhoul}
-strategies. \emph{Best effort} was tested with parameter $k = 1$, $k = 2$, and
-$k = 4$. Secondly, each strategy was run in its two variants: with, and without
-the management of \emph{virtual load}. Finally, we tested each configuration
-with \emph{real}, and with \emph{integer} load.
-This gives us as many as 32 different strategies.
+Several load balancing strategies were compared. We ran the experiments with
+the \emph{Best effort}, and with the \emph{Makhoul} strategies. \emph{Best
+ effort} was tested with parameter $k = 1$, $k = 2$, and $k = 4$. Secondly,
+each strategy was run in its two variants: with, and without the management of
+\emph{virtual load}. Finally, we tested each configuration with \emph{real},
+and with \emph{integer} load.
+
+To summarize the different load balancing strategies, we have:
+\begin{description}
+\item[\textbf{strategies:}] \emph{Makhoul}, or \emph{Best effort} with $k\in
+ \{1,2,4\}$
+\item[\textbf{variants:}] with, or without virtual load
+\item[\textbf{domain:}] real load, or integer load
+\end{description}
+%
+This gives us as many as $4\times 2\times 2 = 16$ different strategies.
+
 \paragraph{Configurations}
+
+In order to show the behaviour of the different strategies in different
+settings, we simulated the executions on two sorts of platforms. These two
+sorts of platforms differ by their underlying network topology. On the one hand,
+we have homogeneous platforms, modeled as a cluster. On the other hand, we have
+heterogeneous platforms, modeled as the interconnection of a number of clusters.
+The heterogeneous platform descriptions were created by taking a subset of the
+Grid'5000 infrastructure\footnote{Grid'5000 is a French large scale experimental
+ Grid (see \url{https://www.grid5000.fr/}).}, as described in the platform file
+\texttt{g5k.xml} distributed with SimGrid. Note that the heterogeneity of the
+platform only comes from the network topology. The processor speeds, and
+network bandwidths were normalized since our algorithms currently are not aware
+of such heterogeneity. We arbitrarily chose to fix the processor speed to
+1~GFlop/s, and the network bandwidth to 125~MB/s, with a latency of 50~$\mu$s,
+except for the links between geographically distant sites, where the network
+bandwidth was fixed to 2.25~GB/s, with a latency of 500~$\mu$s.
+
+Then we derived each sort of platform with four different numbers of computing
+nodes: 16, 64, 256, and 1024 nodes.
+
+The distributed processes of the application were then logically organized along
+three possible topologies: a line, a torus, or a hypercube. We ran tests where
+the total load was initially on a single node (at one end for the line topology),
+and other tests where the load was initially randomly distributed across all
+the participating nodes.
+
+For each of the preceding configurations, we finally had to choose the
+computation and communication costs of a load unit.
+We chose them so as to
+have three different computation over communication cost ratios, and hence model
+three different kinds of applications:
+\begin{itemize}
+\item mainly communicating, with a computation/communication cost ratio of $1/10$;
+\item mainly computing, with a computation/communication cost ratio of $10/1$;
+\item balanced, with a computation/communication cost ratio of $1/1$.
+\end{itemize}
+
+To summarize the various configurations, we have:
 \begin{description}
-\item[\textbf{platforms}] homogeneous (cluster); heterogeneous (subset
- of Grid5000)
-\item[\textbf{platform size}] platforms with 16, 64, 256, and 1024 nodes
-\item[\textbf{topologies}] line; torus; hypercube
-\item[\textbf{initial load distribution}] initially on a only node;
- initially on all nodes
-\item[\textbf{comp/comm ratio}] $10/1$, $1/1$, $1/10$
+\item[\textbf{platforms:}] homogeneous (cluster), or heterogeneous (subset of
+ Grid'5000)
+\item[\textbf{platform sizes:}] platforms with 16, 64, 256, or 1024 nodes
+\item[\textbf{process topologies:}] line, torus, or hypercube
+\item[\textbf{initial load distribution:}] initially on a single node, or
+ initially randomly distributed over all nodes
+\item[\textbf{computation/communication ratio:}] $10/1$, $1/1$, or $1/10$
 \end{description}
+%
+This gives us as many as $2\times 4\times 3\times 2\times 3 = 144$ different
+configurations.
+%
+Combined with the various load balancing strategies, we had $16\times 144 =
+2304$ distinct settings to evaluate. In fact, as will be shown later, we
+did not run all the strategies, nor all the configurations for the bigger
+platforms with 1024 nodes, since the simulations would have run for too
+long.
+
+Overall, these experiments represent more than 240 hours of computing
+time.
 \paragraph{Metrics}
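The counts stated in the added text (16 strategies, 144 configurations, 2304 settings) can be cross-checked by enumerating the Cartesian products directly. The following is a minimal sketch, not part of the patch; the string labels are illustrative placeholders chosen here and are not assumed to match the simulator's actual option names.

```python
from itertools import product

# Load balancing strategies: 4 algorithms x 2 variants x 2 load domains.
# All labels below are hypothetical names used only for counting.
algorithms = ["makhoul", "best-effort k=1", "best-effort k=2", "best-effort k=4"]
variants = ["with virtual load", "without virtual load"]
domains = ["real", "integer"]
strategies = list(product(algorithms, variants, domains))

# Configurations: platform kind x size x process topology
# x initial load distribution x computation/communication ratio.
platforms = ["homogeneous", "heterogeneous"]
sizes = [16, 64, 256, 1024]
topologies = ["line", "torus", "hypercube"]
distributions = ["single node", "random over all nodes"]
ratios = ["10/1", "1/1", "1/10"]
configurations = list(product(platforms, sizes, topologies, distributions, ratios))

# Every strategy paired with every configuration.
settings = list(product(strategies, configurations))

print(len(strategies), len(configurations), len(settings))  # 16 144 2304
```

Such an enumeration is also a convenient starting point for filtering out the settings that were skipped, for example the 1024-node platforms.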