X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/blobdiff_plain/1d15a8ec249a5c61c46650801704389dcb00dbf6..59636878becb2087601db617b593fd3f35181655:/loba-besteffort/loba-besteffort.tex?ds=inline diff --git a/loba-besteffort/loba-besteffort.tex b/loba-besteffort/loba-besteffort.tex index 36b1fbe..3ccb2b0 100644 --- a/loba-besteffort/loba-besteffort.tex +++ b/loba-besteffort/loba-besteffort.tex @@ -45,8 +45,13 @@ \author{Arnaud Giersch\corref{cor}} \ead{arnaud.giersch@femto-st.fr} -\address{FEMTO-ST, University of Franche-Comté\\ - 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France} +\address{% + Institut FEMTO-ST (UMR 6174), + Université de Franche-Comté (UFC), + Centre National de la Recherche Scientifique (CNRS), + École Nationale Supérieure de Mécanique et des Microtechniques (ENSMM), + Université de Technologie de Belfort Montbéliard (UTBM)\\ + 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France} \cortext[cor]{Corresponding author.} @@ -342,7 +347,7 @@ information of the load they will receive, so they can take in into account. In order to test and validate our approaches, we wrote a simulator using the SimGrid -framework~\cite{casanova+legrand+quinson.2008.simgrid}. This +framework~\cite{simgrid.web,casanova+legrand+quinson.2008.simgrid}. This simulator, which consists of about 2,700 lines of C++, allows to run the different load-balancing strategies under various parameters, such as the initial distribution of load, the interconnection topology, the @@ -648,7 +653,7 @@ With these constraints in mind, we defined the following metrics: \label{sec.results} In this section, the results for the different simulations will be presented, -and we'll try to explain our observations. +and we will try to explain our observations. \subsubsection{Cluster vs grid platforms} @@ -719,45 +724,121 @@ allocated time, or because we simply decided not to run it. \FIXME{annoncer le plan de la suite} -\subsubsection{The \besteffort{} strategy} +\subsubsection{The \besteffort{} and \makhoul{} strategies without virtual load} -Looking at the graph on figure~\ref{fig.results1}, we can see that the -\besteffort{} strategy is not too bad, compared to the \makhoul{} strategy. +Before looking at the different variations, we will first show that the plain +\besteffort{} strategy is valuable, and may be as good as the \makhoul{} +strategy. On Figures~\ref{fig.results1} and~\ref{fig.resultsN}, +these strategies are respectively labeled ``b'' and ``a''. -\FIXME{donner les premières conclusions} -\FIXME{comparer be/makhoul -> be tient la route (parler du cas réel uniquement)} +We can see that the relative performance of these strategies is mainly +influenced by the application topology. It is for the line topology that the +difference is the more important. In this case, the \besteffort{} strategy is +nearly faster than the \makhoul{} strategy. This can be explained by the +fact that the \besteffort{} strategy tries to distribute the load fairly between +all the nodes and with the line topology, it is easy to load balance the load +fairly. -\subsubsection{With the virtual load extension} +On the contrary, for the hypercube topology, the \besteffort{} strategy performs +worse than the \makhoul{} strategy. In this case, the \makhoul{} strategy which +tries to give more load to few neighbors reaches the equilibrium faster. -\FIXME{valider l'extension virtual load -> c'est 'achement bien} +For the torus topology, for which the number of links is between the line and +the hypercube, the \makhoul{} strategy is slightly better but the difference is +more nuanced when the initial load is only on one node. The only case where the +\makhoul{} strategy is really faster than the \besteffort{} strategy is with the +random initial distribution when the communication are slow. -\subsubsection{The $k$ parameter} +Globally the number of interconnection is very important. The more +the interconnection links are, the faster the \makhoul{} strategy is because +it distributes quickly significant amount of load, even if this is unfair, between +all the neighbors. In opposition, the \besteffort{} strategy distributes the +load fairly so this strategy is better for low connected strategy. -\FIXME{proposer le -k -> ça peut aider dans certains cas} -\subsubsection{With an initial random distribution, and larger platforms} +\subsubsection{Virtual load} -\FIXME{dire quoi ici ?} +The influence of virtual load is most of the time really significant compared to +the same configuration without it. Sometimes it has no effect but {\bf A + VERIFIER} it has never a negative effect on the load balancing we tested. -\subsubsection{With integer load} +On Figure~\ref{fig.results1}, when the load is initially on one node, it can be +noticed that the average idle times are generally longer with the virtual load +than without it. This can be explained by the fact that, with virtual load, +processors will exchange all the load they need to exchange as soon as the +virtual load has been balanced between all the processors. So consequently they +cannot compute at the beginning. This is especially noticeable when the +communication are slow (on the left part of Figure ~\ref{fig.results1}. -\FIXME{conclure avec la version entière -> on n'a pas l'effet d'escalier !} +%Dans ce cas légère amélioration de la cvg. max. Temps moyen de cvg. amélioré, +%mais plus de temps passé en idle, surtout quand les comms coutent cher. -\FIXME{what about the amount of data?} +%\subsubsection{The \besteffort{} strategy with an initial random load +% distribution, and larger platforms} -\FIXME{On constate quoi (vérifier avec les chiffres)? -\begin{itemize} -\item cluster ou grid, entier ou réel, ne font pas de grosses différences -\item bookkeeping? améliore souvent les choses, parfois au prix d'un retard au démarrage -\item makhoul? se fait battre sur les grosses plateformes -\item taille de plateforme? -\item ratio comp/comm? -\item option $k$? peut-être intéressant sur des plateformes fortement interconnectées (hypercube) -\item volume de comm? souvent, besteffort/plain en fait plus. pourquoi? -\item répartition initiale de la charge ? -\item integer mode sur topo. line n'a jamais fini en plain? vérifier si ce n'est - pas à cause de l'effet d'escalier que bk est capable de gommer. -\end{itemize}} +%In +%Mêmes conclusions pour line et hcube. +%Sur tore, BE se fait exploser quand les comms coutent cher. + +%\FIXME{virer les 1024 ?} + +%\subsubsection{With the virtual load extension with an initial random load +% distribution} + +%Soit c'est équivalent, soit on gagne -> surtout quand les comms coutent cher et +%qu'il y a beaucoup de voisins. + +\subsubsection{The $k$ parameter} +\label{results-k} + +As explained previously when the communication are slow the \besteffort{} +strategy is efficient. This is due to the fact that it tries to balance the load +fairly and consequently a significant amount of the load is transfered between +processors. In this situation, it is possible to reduce the convergence time by +using the leveler parameter (parameter $k$). The advantage of using this +solution is particularly efficient when the initial load is randomly distributed +on the nodes with torus and hypercube topology and slow communication. When +virtual load mechanism is used, the effect of this parameter is also visible +with the same condition. + + + +\subsubsection{With integer load} + +We also performed some experiments with integer load instead of load with real +value. In this case, the results have globally the same behavior. The most +intereting result, from our point of view, is that the virtual mode allows +processors in a line topology to converge to the uniform load balancing. Without +the virtual load, most of the time, processors converge to what we call the +``stairway effect'', that is to say that there is only a difference of one in +the load of each processor and its neighbors (for example with 10 processors, we +obtain 10 9 8 7 6 6 7 8 9 10 instead of 8 8 8 8 8 8 8 8 8 8). + +%Cas normal, ligne -> converge pas (effet d'escalier). +%Avec vload, ça converge. + +%Dans les autres cas, résultats similaires au cas réel: redire que vload est +%intéressant. + +\FIXME{ajouter une courbe avec l'équilibrage en entier} + +\FIXME{virer la metrique volume de comms} + +\FIXME{ajouter une courbe ou on voit l'évolution de la charge en fonction du + temps : avec et sans vload} + +% \begin{itemize} +% \item cluster ou grid, entier ou réel, ne font pas de grosses différences +% \item bookkeeping? améliore souvent les choses, parfois au prix d'un retard au démarrage +% \item makhoul? se fait battre sur les grosses plateformes +% \item taille de plateforme? +% \item ratio comp/comm? +% \item option $k$? peut-être intéressant sur des plateformes fortement interconnectées (hypercube) +% \item volume de comm? souvent, besteffort/plain en fait plus. pourquoi? +% \item répartition initiale de la charge ? +% \item integer mode sur topo. line n'a jamais fini en plain? vérifier si ce n'est +% pas à cause de l'effet d'escalier que bk est capable de gommer. +% \end{itemize}} % On veut montrer quoi ? : @@ -811,4 +892,6 @@ Mésocentre de calcul de Franche-Comté. % LocalWords: SimGrid DASUD Comté asynchronism ji ik isend irecv Cortés et al % LocalWords: chan ctrl fifo Makhoul GFlop xml pre FEMTO Makhoul's fca bdee % LocalWords: cdde Contassot Vivier underlaid du de Maréchal Juin cedex calcul -% LocalWords: biblio +% LocalWords: biblio Institut UMR Université UFC Centre Scientifique CNRS des +% LocalWords: École Nationale Supérieure Mécanique Microtechniques ENSMM UTBM +% LocalWords: Technologie Bahi