X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/blobdiff_plain/f622eef95fe3973773527fa3260b7563a498ec60..ec185d3c7dd2e42e79b12952d72879db0138f6fd:/supercomp11/supercomp11.tex diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex index db3e809..93a9098 100644 --- a/supercomp11/supercomp11.tex +++ b/supercomp11/supercomp11.tex @@ -14,8 +14,11 @@ \begin{tabular}[t]{@{}l@{:~}l@{}}}{% \end{tabular}} -\newcommand{\FIXME}[1]{% - \textbf{$\triangleright$\marginpar{\textbf{[FIXME]}}~#1}} +\newcommand{\FIXMEmargin}[1]{% + \marginpar{\textbf{[FIXME]} {\footnotesize #1}}} +\newcommand{\FIXME}[2][]{% + \ifx #2\relax\relax \FIXMEmargin{#1}% + \else \textbf{$\triangleright$\FIXMEmargin{#1}~#2}\fi} \newcommand{\VAR}[1]{\textit{#1}} @@ -85,7 +88,7 @@ been extended by many authors. For example, Cortés et al., with DASUD~\cite{cortes+ripoll+cedo+al.2002.asynchronous}, propose a version working with integer load. This work was later generalized by the same authors in \cite{cedo+cortes+ripoll+al.2007.convergence}. -\FIXME{Rajouter des choses ici.} +\FIXME{Rajouter des choses ici. Lesquelles ?} Although the Bertsekas and Tsitsiklis' algorithm describes the condition to ensure the convergence, there is no indication or strategy to really implement @@ -186,20 +189,28 @@ $3$. If it sends load to processor $1$ it will not satisfy condition $x_3^2(t)$. So we consider that the \emph{ping-pong} condition is probably to strong. Currently, we did not try to make another convergence proof without this condition or with a weaker condition. -% -\FIXME{Develop: We have the feeling that such a weaker condition - exists, because (it's not a proof, but) we have never seen any - scenario that is not leading to convergence, even with LB-strategies - that are not fulfilling these two conditions.} + +Nevertheless, we conjecture that such a weaker condition exists. In fact, we +have never seen any scenario that is not leading to convergence, even with +load-balancing strategies that are not exactly fulfilling these two conditions. + +It may be the subject of future work to express weaker conditions, and to prove +that they are sufficient to ensure the convergence of the load-balancing +algorithm. \section{Best effort strategy} \label{Best-effort} -In this section we describe a new load-balancing strategy that we call -\emph{best effort}. The general idea behind this strategy is that each -processor, that detects it has more load than some of its neighbors, -sends some load to the most of its less loaded neighbors, doing its -best to reach the equilibrium between those neighbors and himself. +In this section we describe a new load-balancing strategy that we call +\emph{best effort}. First, we explain the general idea behind this strategy, +and then we describe some variants of this basic strategy. + +\subsection{Basic strategy} + +The general idea behind the \emph{best effort} strategy is that each processor, +that detects it has more load than some of its neighbors, sends some load to the +most of its less loaded neighbors, doing its best to reach the equilibrium +between those neighbors and himself. More precisely, when a processor $i$ is in its load-balancing phase, he proceeds as following. @@ -246,38 +257,44 @@ he proceeds as following. \end{equation*} \end{enumerate} -\FIXME{describe parameter $k$} - -\section{Other strategies} -\label{Other} +\subsection{Leveling the amount to send} -\FIXME{Réécrire en anglais.} +With the aforementioned basic strategy, each node does its best to reach the +equilibrium with its neighbors. Since each node may be taking the same kind of +decision at the same moment, there is the risk that a node receives load from +several of its neighbors, and then is temporary going off the equilibrium state. +This is particularly true with strongly connected applications. -% \FIXME{faut-il décrire les stratégies makhoul et simple ?} +In order to reduce this effect, we add the ability to level the amount to send. +The idea, here, is to make smaller steps toward the equilibrium, such that a +potentially wrong decision has a lower impact. -% \paragraph{simple} Tentative de respecter simplement les conditions de Bertsekas. -% Parmi les voisins moins chargés que soi, on sélectionne : -% \begin{itemize} -% \item un des moins chargés (vmin) ; -% \item un des plus chargés (vmax), -% \end{itemize} -% puis on équilibre avec vmin en s'assurant que notre charge reste -% toujours supérieure à celle de vmin et à celle de vmax. +Concretely, once $s_{ij}$ has been evaluated as before, it is simply divided by +some configurable factor. That's what we named the ``parameter $k$'' in +Section~\ref{Results}. The amount of data to send is then $s_{ij}(t) = (\bar{x} +- x^i_j(t))/k$. +\FIXME[check that it's still named $k$ in Sec.~\ref{Results}]{} -% On envoie donc (avec "self" pour soi-même) : -% \[ -% \min\left(\frac{load(self) - load(vmin)}{2}, load(self) - load(vmax)\right) -% \] +\section{Other strategies} +\label{Other} -\paragraph{makhoul} Ordonne les voisins du moins chargé au plus chargé -puis calcule les différences de charge entre soi-même et chacun des -voisins. +Another load balancing strategy, working under the same conditions, was +previously developed by Bahi, Giersch, and Makhoul in +\cite{bahi+giersch+makhoul.2008.scalable}. In order to assess the performances +of the new \emph{best effort}, we naturally chose to compare it to this anterior +work. More precisely, we will use the algorithm~2 from +\cite{bahi+giersch+makhoul.2008.scalable} and, in the following, we will +reference it under the name of Makhoul's. -Ensuite, pour chaque voisin, dans l'ordre, et tant qu'on reste plus -chargé que le voisin en question, on lui envoie 1/(N+1) de la -différence calculée au départ, avec N le nombre de voisins. +Here is an outline of the Makhoul's algorithm. When a given node needs to take +a load balancing decision, it starts by sorting its neighbors by increasing +order of their load. Then, it computes the difference between its own load, and +the load of each of its neighbors. Finally, taking the neighbors following the +order defined before, the amount of load to send $s_{ij}$ is computed as +$1/(N+1)$ of the load difference, with $N$ being the number of neighbors. This +process continues as long as the node is more loaded than the considered +neighbor. -C'est l'algorithme~2 dans~\cite{bahi+giersch+makhoul.2008.scalable}. \section{Virtual load} \label{Virtual load} @@ -461,7 +478,8 @@ actual source code that was used for the experiments% available at \url{http://info.iut-bm.univ-fcomte.fr/staff/giersch/software/loba.tar.gz}. -\FIXME{ajouter des détails sur la gestion de la charge virtuelle ?} +\FIXME{ajouter des détails sur la gestion de la charge virtuelle ? +par ex, donner l'idée générale de l'implémentation. l'idée générale est déja décrite en section~\ref{Virtual load}} \subsection{Experimental contexts} \label{Contexts} @@ -499,7 +517,8 @@ number of computing iterations (2000 in our case). Note that this convergence detection was implemented in a centralized manner. This is easy to do within the simulator, but it's obviously not realistic. In a real application we would have chosen a decentralized convergence detection -algorithm, like the one described in \cite{10.1109/TPDS.2005.2}. +algorithm, like the one described by Bahi, Contassot-Vivier, Couturier, and +Vernier in \cite{10.1109/TPDS.2005.2}. \paragraph{Platforms} @@ -614,8 +633,54 @@ With these constraints in mind, we defined the following metrics: \subsection{Validation of our approaches} \label{Results} +\begin{figure} + \centering + \includegraphics[width=.5\linewidth]{data/graphs/R1-10:1-cluster-line} + \includegraphics[width=.5\linewidth]{data/graphs/R1-10:1-cluster-torus}% + \hfill%% + \includegraphics[width=.5\linewidth]{data/graphs/R1-10:1-cluster-hcube} + \caption{Results \textbf{[FIXME]}} + \label{fig.results} +\end{figure} + +Dans cet ordre: +... +- comparer be/makhoul -> be tient la route + -> en réel uniquement +- valider l'extension virtual load -> c'est 'achement bien +- proposer le -k -> ça peut aider dans certains cas +- conclure avec la version entière -> on n'a pas l'effet d'escalier ! +Q: comment inclure les types/tailles de platesformes ? +Q: comment faire des moyennes ? +Q: comment introduire les distrib 1/N ? +... + +On constate quoi (vérifier avec les chiffres)? +\begin{itemize} +\item cluster ou grid, entier ou réel, ne font pas de grosses différences + +\item bookkeeping? améliore souvent les choses, parfois au prix d'un retard au démarrage +\item makhoul? se fait battre sur les grosses plateformes + +\item taille de plateforme? + +\item ratio comp/comm? + +\item option $k$? peut-être intéressant sur des plateformes fortement interconnectées (hypercube) + +\item volume de comm? souvent, besteffort/plain en fait plus. pourquoi? + +\item répartition initiale de la charge ? + +\item integer mode sur topo. line n'a jamais fini en plain? vérifier si ce n'est + pas à cause de l'effet d'escalier que bk est capable de gommer. + +\end{itemize} + +\begin{itshape} On veut montrer quoi ? : +\FIXME{remove that part} 1) best plus rapide que les autres (simple, makhoul) 2) avantage virtual load @@ -627,7 +692,6 @@ Topologies variées Simulation avec temps définies assez long et on mesure la qualité avec : volume de calcul effectué, volume de données échangées Mais aussi simulation avec temps court qui montre que seul best converge - Expés avec ratio calcul/comm rapide et lent Quelques expés avec charge initiale aléatoire plutot que sur le premier proc @@ -641,14 +705,18 @@ On ne tient pas compte de la vitesse des liens donc on la considère homogène Prendre un réseau hétérogène et rendre processeur homogène Taille : 10 100 très gros +\end{itshape} \section{Conclusion and perspectives} +\FIXME{conclude!} + \begin{acknowledgements} Computations have been performed on the supercomputer facilities of the Mésocentre de calcul de Franche-Comté. \end{acknowledgements} +\FIXME{find and add more references} \bibliographystyle{spmpsci} \bibliography{biblio} @@ -663,4 +731,5 @@ Taille : 10 100 très gros % LocalWords: Raphaël Couturier Arnaud Giersch Abderrahmane Sider Franche ij % LocalWords: Bertsekas Tsitsiklis SimGrid DASUD Comté Béjaïa asynchronism ji -% LocalWords: ik isend irecv Cortés et al chan ctrl fifo Makhoul GFlop xml +% LocalWords: ik isend irecv Cortés et al chan ctrl fifo Makhoul GFlop xml pre +% LocalWords: FEMTO Makhoul's fca bdee cdde Contassot Vivier underlaid