In order to test and validate our approaches, we wrote a simulator
using the SimGrid
-framework~\cite{casanova+legrand+quinson.2008.simgrid}. This
+framework~\cite{simgrid.web,casanova+legrand+quinson.2008.simgrid}. This
simulator, which consists of about 2,700 lines of C++, allows to run
the different load-balancing strategies under various parameters, such
as the initial distribution of load, the interconnection topology, the
\label{sec.results}
In this section, the results for the different simulations will be presented,
-and we'll try to explain our observations.
+and we will try to explain our observations.
\subsubsection{Cluster vs grid platforms}
\FIXME{annoncer le plan de la suite}
-\subsubsection{The \besteffort{} strategy with the load initially on only one
- node}
+\subsubsection{The \besteffort{} and \makhoul{} strategies without virtual load}
-Before looking at the different variations, we'll first show that the plain
-\besteffort{} strategy is valuable, and may be as good as the \makhoul{}
-strategy. On the graphs from the figure~\ref{fig.results1}, these strategies
-are respectively labeled ``b'' and ``a''.
+Before looking at the different variations, we will first show that the plain
+\besteffort{} strategy is valuable, and may be as good as the \makhoul{}
+strategy. On Figures~\ref{fig.results1} and~\ref{fig.resultsN},
+these strategies are respectively labeled ``b'' and ``a''.
-twice faster on lines
-almost equivalent on torus
-worse on hcubes
+We can see that the relative performance of these strategies is mainly
+influenced by the application topology. It is for the line topology that the
+difference is the more important. In this case, the \besteffort{} strategy is
+nearly faster than the \makhoul{} strategy. This can be explained by the
+fact that the \besteffort{} strategy tries to distribute the load fairly between
+all the nodes and with the line topology, it is easy to load balance the load
+fairly.
--> interconnection
+On the contrary, for the hypercube topology, the \besteffort{} strategy performs
+worse than the \makhoul{} strategy. In this case, the \makhoul{} strategy which
+tries to give more load to few neighbors reaches the equilibrium faster.
-plus c'est connecté, moins bon est BE car à vouloir trop bien équilibrer
-localement, le processeurs se perturbent mutuellement. Du coup, makhoul qui
-équilibre moins bien localement est moins perturbé par ces interférences.
+For the torus topology, for which the number of links is between the line and
+the hypercube, the \makhoul{} strategy is slightly better but the difference is
+more nuanced when the initial load is only on one node. The only case where the
+\makhoul{} strategy is really faster than the \besteffort{} strategy is with the
+random initial distribution when the communication are slow.
-\subsubsection{With the virtual load extension with the load initially on only
- one node}
+Globally the number of interconnection is very important. The more
+the interconnection links are, the faster the \makhoul{} strategy is because
+it distributes quickly significant amount of load, even if this is unfair, between
+all the neighbors. In opposition, the \besteffort{} strategy distributes the
+load fairly so this strategy is better for low connected strategy.
-Dans ce cas légère amélioration de la cvg. max. Temps moyen de cvg. amélioré,
-mais plus de temps passé en idle, surtout quand les comms coutent cher.
-\subsubsection{The \besteffort{} strategy with an initial random load
- distribution, and larger platforms}
+\subsubsection{Virtual load}
-Mêmes conclusions pour line et hcube.
-Sur tore, BE se fait exploser quand les comms coutent cher.
+The influence of virtual load is most of the time really significant compared to
+the same configuration without it. Sometimes it has no effect but {\bf A
+ VERIFIER} it has never a negative effect on the load balancing we tested.
-\FIXME{virer les 1024 ?}
+On Figure~\ref{fig.results1}, when the load is initially on one node, it can be
+noticed that the average idle times are generally longer with the virtual load
+than without it. This can be explained by the fact that, with virtual load,
+processors will exchange all the load they need to exchange as soon as the
+virtual load has been balanced between all the processors. So consequently they
+cannot compute at the beginning. This is especially noticeable when the
+communication are slow (on the left part of Figure ~\ref{fig.results1}.
-\subsubsection{With the virtual load extension with an initial random load
- distribution}
+%Dans ce cas légère amélioration de la cvg. max. Temps moyen de cvg. amélioré,
+%mais plus de temps passé en idle, surtout quand les comms coutent cher.
-Soit c'est équivalent, soit on gagne -> surtout quand les comms coutent cher et
-qu'il y a beaucoup de voisins.
+%\subsubsection{The \besteffort{} strategy with an initial random load
+% distribution, and larger platforms}
+
+%In
+%Mêmes conclusions pour line et hcube.
+%Sur tore, BE se fait exploser quand les comms coutent cher.
+
+%\FIXME{virer les 1024 ?}
+
+%\subsubsection{With the virtual load extension with an initial random load
+% distribution}
+
+%Soit c'est équivalent, soit on gagne -> surtout quand les comms coutent cher et
+%qu'il y a beaucoup de voisins.
\subsubsection{The $k$ parameter}
+\label{results-k}
+
+As explained previously when the communication are slow the \besteffort{}
+strategy is efficient. This is due to the fact that it tries to balance the load
+fairly and consequently a significant amount of the load is transfered between
+processors. In this situation, it is possible to reduce the convergence time by
+using the leveler parameter (parameter $k$). The advantage of using this
+solution is particularly efficient when the initial load is randomly distributed
+on the nodes with torus and hypercube topology and slow communication. When
+virtual load mechanism is used, the effect of this parameter is also visible
+with the same condition.
+
+
+
+\subsubsection{With integer load}
-Dans le cas où les comms coutent cher et ou BE se fait avoir, on peut ameliorer
-les perfs avec le param k.
+We also performed some experiments with integer load instead of load with real
+value. In this case, the results have globally the same behavior. The most
+intereting result, from our point of view, is that the virtual mode allows
+processors in a line topology to converge to the uniform load balancing. Without
+the virtual load, most of the time, processors converge to what we call the
+``stairway effect'', that is to say that there is only a difference of one in
+the load of each processor and its neighbors (for example with 10 processors, we
+obtain 10 9 8 7 6 6 7 8 9 10 instead of 8 8 8 8 8 8 8 8 8 8).
-\subsubsection{With integer load, 1 ou N}
+%Cas normal, ligne -> converge pas (effet d'escalier).
+%Avec vload, ça converge.
-Cas normal, ligne -> converge pas (effet d'escalier).
-Avec vload, ça converge.
+%Dans les autres cas, résultats similaires au cas réel: redire que vload est
+%intéressant.
-Dans les autres cas, résultats similaires au cas réel: redire que vload est
-intéressant.
+\FIXME{ajouter une courbe avec l'équilibrage en entier}
\FIXME{virer la metrique volume de comms}
% LocalWords: SimGrid DASUD Comté asynchronism ji ik isend irecv Cortés et al
% LocalWords: chan ctrl fifo Makhoul GFlop xml pre FEMTO Makhoul's fca bdee
% LocalWords: cdde Contassot Vivier underlaid du de Maréchal Juin cedex calcul
-% LocalWords: biblio
+% LocalWords: biblio Institut UMR Université UFC Centre Scientifique CNRS des
+% LocalWords: École Nationale Supérieure Mécanique Microtechniques ENSMM UTBM
+% LocalWords: Technologie Bahi