X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/blobdiff_plain/3e0139b6d08c5073cf7665a95af6cb6ae750c923..bcf6cef9fa1df52f632849a68fc08e41f6163bb1:/supercomp11/supercomp11.tex diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex index 2a6c04b..81ee704 100644 --- a/supercomp11/supercomp11.tex +++ b/supercomp11/supercomp11.tex @@ -75,9 +75,11 @@ algorithm which is definitively a reference for many works. In their work, they proved that under classical hypotheses of asynchronous iterative algorithms and a special constraint avoiding \emph{ping-pong} effect, an asynchronous iterative algorithm converge to the uniform load distribution. This work has -been extended by many authors. For example, -DASUD~\cite{cortes+ripoll+cedo+al.2002.asynchronous} propose a version working -with integer load. {\bf Rajouter des choses ici}. +been extended by many authors. For example, Cortés et al., with +DASUD~\cite{cortes+ripoll+cedo+al.2002.asynchronous}, propose a +version working with integer load. This work was later generalized by +the same authors in \cite{cedo+cortes+ripoll+al.2007.convergence}. +{\bf Rajouter des choses ici}. Although the Bertsekas and Tsitsiklis' algorithm describes the condition to ensure the convergence, there is no indication or strategy to really implement @@ -183,12 +185,13 @@ condition or with a weaker condition. \section{Best effort strategy} \label{Best-effort} -We will describe here a new load-balancing strategy that we called -\emph{best effort}. The general idea behind this strategy is, for a -processor, to send some load to the most of its neighbors, doing its +In this section we describe a new load-balancing strategy that we call +\emph{best effort}. The general idea behind this strategy is that each +processor, that detects it has more load than some of its neighbors, +sends some load to the most of its less loaded neighbors, doing its best to reach the equilibrium between those neighbors and himself. -More precisely, when a processors $i$ is in its load-balancing phase, +More precisely, when a processor $i$ is in its load-balancing phase, he proceeds as following. \begin{enumerate} \item First, the neighbors are sorted in non-decreasing order of their @@ -270,69 +273,69 @@ C'est l'algorithme~2 dans~\cite{bahi+giersch+makhoul.2008.scalable}. In order to test and validate our approaches, we wrote a simulator using the SimGrid -framework~\cite{casanova+legrand+quinson.2008.simgrid}. The process -model is detailed in the next section (\ref{Sim model}), then the -results of the simulations are presented in section~\ref{Results}. +framework~\cite{casanova+legrand+quinson.2008.simgrid}. This +simulator, which consists of about 2,700 lines of C++, allows to run +the different load-balancing strategies under various parameters, such +as the initial distribution of load, the interconnection topology, the +characteristics of the running platform, etc. Then several metrics +are issued that permit to compare the strategies. + +The simulation model is detailed in the next section (\ref{Sim + model}), then the results of the simulations are presented in +section~\ref{Results}. \subsection{Simulation model} \label{Sim model} -\begin{verbatim} -Communications -============== - -There are two receiving channels per host: control for information -messages, and data for load transfers. - -Process model -============= - -Each process is made of 3 threads: a receiver thread, a computing -thread, and a load-balancer thread. - -* Receiver thread - --------------- - - Loop - | wait for a message to come, either on data channel, or on ctrl channel - | push received message in a buffer of received messages - | -> ctrl messages on the one side - | -> data messages on the other side - +- - - The loop terminates when a "finalize" message is received on each - channel. - -* Computing thread - ---------------- - - Loop - | if we received some real load, get it (data messages) - | if there is some real load to send, send it - | if we own some load, simulate some computing on it - | sleep a bit if we are looping too fast - +- - send CLOSE on data for all neighbors - wait for CLOSE on data from all neighbors - - The loop terminates when process::still_running() returns false. - (read the source for full details...) - -* Load-balancing thread - --------------------- - - Loop - | call load-balancing algorithm - | send ctrl messages - | sleep (min_lb_iter_duration) - | receive ctrl messages - +- - send CLOSE on ctrl for all neighbors - wait for CLOSE on ctrl from all neighbors +In the simulation model the processors exchange messages which are of +two kinds. First, there are \emph{control messages} which only carry +information that is exchanged between the processors, such as the +current load, or the virtual load transfers if this option is +selected. These messages are rather small, and their size is +constant. Then, there are \emph{data messages} that carry the real +load transferred between the processors. The size of a data message +is a function of the amount of load that it carries, and it can be +pretty large. In order to receive the messages, each processor has +two receiving channels, one for each kind of messages. Finally, when +a message is sent or received, this is done by using the non-blocking +primitives of SimGrid\footnote{That are \texttt{MSG\_task\_isend()}, + and \texttt{MSG\_task\_irecv()}.}. + +During the simulation, each processor concurrently runs three threads: +a \emph{receiving thread}, a \emph{computing thread}, and a +\emph{load-balancing thread}, which we will briefly describe now. + +\paragraph{Receiving thread} The receiving thread is in charge of +waiting for messages to come, either on the control channel, or on the +data channel. When a message is received, it is pushed in a buffer of +received message, to be later consumed by one of the other threads. +There are two such buffers, one for the control messages, and one for +the data messages. The buffers are implemented with a lock-free FIFO +\cite{sutter.2008.writing} to avoid contention between the threads. + +\paragraph{Computing thread} The computing thread is in charge of the +real load management. It iteratively runs the following operations: +\begin{itemize} +\item if some load was received from the neighbors, get it; +\item if there is some load to send to the neighbors, send it; +\item run some computation, whose duration is function of the current + load of the processor. +\end{itemize} +Practically, after the computation, the computing thread waits for a +small amount of time if the iterations are looping too fast (for +example, when the current load is zero). - The loop terminates when process::still_running() returns false. - (read the source for full details...) -\end{verbatim} +\paragraph{Load-balancing thread} The load-balancing thread is in +charge of running the load-balancing algorithm, and exchange the +control messages. It iteratively runs the following operations: +\begin{itemize} +\item get the control messages that were received from the neighbors; +\item run the load-balancing algorithm; +\item send control messages to the neighbors, to inform them of the + processor's current load, and possibly of virtual load transfers; +\item wait a minimum (configurable) amount of time, to avoid to + iterate too fast. +\end{itemize} \subsection{Validation of our approaches} \label{Results} @@ -381,4 +384,4 @@ Taille : 10 100 très gros % LocalWords: Raphaël Couturier Arnaud Giersch Abderrahmane Sider Franche ij % LocalWords: Bertsekas Tsitsiklis SimGrid DASUD Comté Béjaïa asynchronism ji -% LocalWords: ik +% LocalWords: ik isend irecv