X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/blobdiff_plain/e3080a36717b6b9710daaec8c4d1e529b19c3176..50f790a50dfec79e4aeabbe564c4c7d375c7d57f:/supercomp11/supercomp11.tex diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex index 7daaa52..4a7e57a 100644 --- a/supercomp11/supercomp11.tex +++ b/supercomp11/supercomp11.tex @@ -1,5 +1,8 @@ - \documentclass[smallextended]{svjour3} +\usepackage[utf8]{inputenc} +\usepackage[T1]{fontenc} +\usepackage{mathptmx} +\usepackage{courier} \usepackage{graphicx} \begin{document} @@ -28,32 +31,177 @@ \begin{abstract} Most of the time, asynchronous load balancing algorithms have extensively been -studied in a theoretical point of view. The Bertsekas' algorithm is certainly -the most well known algorithm for which the convergence proof is given. From a -practical point of view, when a node wants to balance a part of its load to some -of its neighbors, the strategy is not described. In this paper, we propose a -strategy called \texttt{best effort} which tries to balance the load of a node -to all its less loaded neighbors while ensuring that all the nodes concerned by -the load balancing phase have the same amount of load. Moreover, asynchronous -iterative algorithms in which an asynchronous load balancing algorithm is -implemented most of the time can dissociate messages concerning load transfers -and message concerning load information. In order to increase the converge of a -load balancing algorithm, we propose a simple heuristic called \texttt{virtual - load} which allows a node that receives an load information message to -integrate the load that it will receive latter in its load (virtually) and -consequently sends a (real) part of its load to some of its neighbors. In order -to validate our approaches, we have defined a simulator based on SimGrid which -allowed us to conduct many experiments. +studied in a theoretical point of view. The Bertsekas and Tsitsiklis' +algorithm~\cite[section~7.4]{bertsekas+tsitsiklis.1997.parallel} +is certainly the most well known algorithm for which the convergence proof is +given. From a practical point of view, when a node wants to balance a part of +its load to some of its neighbors, the strategy is not described. In this +paper, we propose a strategy called \texttt{best effort} which tries to balance +the load of a node to all its less loaded neighbors while ensuring that all the +nodes concerned by the load balancing phase have the same amount of load. +Moreover, asynchronous iterative algorithms in which an asynchronous load +balancing algorithm is implemented most of the time can dissociate messages +concerning load transfers and message concerning load information. In order to +increase the converge of a load balancing algorithm, we propose a simple +heuristic called \texttt{virtual load} which allows a node that receives an load +information message to integrate the load that it will receive later in its +load (virtually) and consequently sends a (real) part of its load to some of its +neighbors. In order to validate our approaches, we have defined a simulator +based on SimGrid which allowed us to conduct many experiments. \end{abstract} +\section{Introduction} + +Load balancing algorithms are extensively used in parallel and distributed +applications in order to reduce the execution times. They can be applied in +different scientific fields from high performance computation to micro sensor +networks. They are iterative by nature. In literature many kinds of load +balancing algorithms have been studied. They can be classified according +different criteria: centralized or decentralized, in static or dynamic +environment, with homogeneous or heterogeneous load, using synchronous or +asynchronous iterations, with a static topology or a dynamic one which evolves +during time. In this work, we focus on asynchronous load balancing algorithms +where computer nodes are considered homogeneous and with homogeneous load with +no external load. In this context, Bertsekas and Tsitsiklis have proposed an +algorithm which is definitively a reference for many works. In their work, they +proved that under classical hypotheses of asynchronous iterative algorithms and +a special constraint avoiding \texttt{ping-pong} effect, an asynchronous +iterative algorithm converge to the uniform load distribution. This work has +been extended by many authors. For example, +DASUD~\cite{cortes+ripoll+cedo+al.2002.asynchronous} propose a version working +with integer load. {\bf Rajouter des choses ici}. + +Although the Bertsekas and Tsitsiklis' algorithm describes the condition to +ensure the convergence, there is no indication or strategy to really implement +the load distribution. In other word, a node can send a part of its load to one +or many of its neighbors while all the convergence conditions are +followed. Consequently, we propose a new strategy called \texttt{best effort} +that tries to balance the load of a node to all its less loaded neighbors while +ensuring that all the nodes concerned by the load balancing phase have the same +amount of load. Moreover, when real asynchronous applications are considered, +using asynchronous load balancing algorithms can reduce the execution +times. Most of the times, it is simpler to distinguish load information messages +from data migration messages. Formers ones allows a node to inform its +neighbors of its current load. These messages are very small, they can be sent +quite often. For example, if an computing iteration takes a significant times +(ranging from seconds to minutes), it is possible to send a new load information +message at each neighbor at each iteration. Latter messages contains data that +migrates from one node to another one. Depending on the application, it may have +sense or not that nodes try to balance a part of their load at each computing +iteration. But the time to transfer a load message from a node to another one is +often much nore longer that to time to transfer a load information message. So, +when a node receives the information that later it will receive a data message, +it can take this information into account and it can consider that its new load +is larger. Consequently, it can send a part of it real load to some of its +neighbors if required. We call this trick the \texttt{virtual load} mecanism. + + + +So, in this work, we propose a new strategy for improving the distribution of +the load and a simple but efficient trick that also improves the load +balacing. Moreover, we have conducted many simulations with simgrid in order to +validate our improvements are really efficient. Our simulations consider that in +order to send a message, a latency delays the sending and according to the +network performance and the message size, the time of the reception of the +message also varies. + +In the following of this paper, Section~\ref{BT algo} describes the Bertsekas +and Tsitsiklis' asynchronous load balancing algorithm. Moreover, we present a +possible problem in the convergence conditions. Section~\ref{Best-effort} +presents the best effort strategy which provides an efficient way to reduce the +execution times. In Section~\ref{Virtual load}, the virtual load mecanism is +proposed. Simulations allowed to show that both our approaches are valid using a +quite realistic model detailed in Section~\ref{Simulations}. Finally we give a +conclusion and some perspectives to this work. + + + + +\section{Bertsekas and Tsitsiklis' asynchronous load balancing algorithm} +\label{BT algo} + +In order prove the convergence of asynchronous iterative load balancing +Bertesekas and Tsitsiklis proposed a model +in~\cite{bertsekas+tsitsiklis.1997.parallel}. Here we recall some notations. +Consider that $N={1,...,n}$ processors are connected through a network. +Communication links are represented by a connected undirected graph $G=(N,V)$ +where $V$ is the set of links connecting differents processors. In this work, we +consider that processors are homogeneous for sake of simplicity. It is quite +easy to tackle the heterogeneous case~\cite{ElsMonPre02}. Load of processor $i$ +at time $t$ is represented by $x_i(t)\geq 0$. Let $V(i)$ be the set of +neighbors of processor $i$. Each processor $i$ has an estimate of the load of +each of its neighbors $j \in V(i)$ represented by $x_j^i(t)$. According to +asynchronism and communication delays, this estimate may be outdated. We also +consider that the load is described by a continuous variable. + +When a processor send a part of its load to one or some of its neighbors, the +transfer takes time to be completed. Let $s_{ij}(t)$ be the amount of load that +processor $i$ has transfered to processor $j$ at time $t$ and let $r_{ij}(t)$ be the +amount of load received by processor $j$ from processor $i$ at time $t$. Then +the amount of load of processor $i$ at time $t+1$ is given by: +\begin{equation} +x_i(t+1)=x_i(t)-\sum_{j\in V(i)} s_{ij}(t) + \sum_{j\in V(i)} r_{ji}(t) +\end{equation} + + +\section{Best effort strategy} +\label{Best-effort} + + + +\section{Virtual load} +\label{Virtual load} +\section{Simulations} +\label{Simulations} +\subsection{Simulation model} -qsdqsd +\subsection{Validation of our approaches} +On veut montrer quoi ? : +1) best plus rapide que les autres (simple, makhoul) +2) avantage virtual load + +Est ce qu'on peut trouver des contre exemple? +Topologies variées + + +Simulation avec temps définies assez long et on mesure la qualité avec : volume de calcul effectué, volume de données échangées +Mais aussi simulation avec temps court qui montre que seul best converge + + +Expés avec ratio calcul/comm rapide et lent + +Quelques expés avec charge initiale aléatoire plutot que sur le premier proc + +Cadre processeurs homogènes + +Topologies statiques + +On ne tient pas compte de la vitesse des liens donc on la considère homogène + +Prendre un réseau hétérogène et rendre processeur homogène + +Taille : 10 100 très gros + +\section{Conclusion and perspectives} + + +\bibliographystyle{spmpsci} +\bibliography{biblio} \end{document} + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: t +%%% ispell-local-dictionary: "american" +%%% End: + +% LocalWords: Raphaël Couturier Arnaud Giersch Abderrahmane Sider +% LocalWords: Bertsekas Tsitsiklis SimGrid DASUD