X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/blobdiff_plain/50f790a50dfec79e4aeabbe564c4c7d375c7d57f..ed88631cce8e6737ba146dd0a98914baa97ee29c:/supercomp11/supercomp11.tex?ds=inline diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex index 4a7e57a..a566b8a 100644 --- a/supercomp11/supercomp11.tex +++ b/supercomp11/supercomp11.tex @@ -1,3 +1,4 @@ + \documentclass[smallextended]{svjour3} \usepackage[utf8]{inputenc} \usepackage[T1]{fontenc} @@ -143,13 +144,74 @@ amount of load received by processor $j$ from processor $i$ at time $t$. Then the amount of load of processor $i$ at time $t+1$ is given by: \begin{equation} x_i(t+1)=x_i(t)-\sum_{j\in V(i)} s_{ij}(t) + \sum_{j\in V(i)} r_{ji}(t) +\label{eq:ping-pong} \end{equation} +Some conditions are required to ensure the convergence. One of them can be +called the \texttt{ping-pong} condition which specifies that: +\begin{equation} +x_i(t)-\sum _{k\in V(i)} s_{ik}(t) \geq x_j^i(t)+s_{ij}(t) +\end{equation} +for any processor $i$ and any $j \in V(i)$ such that $x_i(t)>x_j^i(t)$. This +condition aims at avoiding a processor to send a part of its load and beeing +less loaded after that. + +Nevertheless, we think that this condition may lead to deadlocks in some +cases. For example, if we consider only three processors and that processor $1$ +is linked to processor $2$ which is also linked to processor $3$ (i.e. a simple +chain wich 3 processors). Now consider we have the following values at time $t$: +\begin{eqnarray*} +x_1(t)=10 \\ +x_2(t)=100 \\ +x_3(t)=99.99\\ + x_3^2(t)=99.99\\ +\end{eqnarray*} +In this case, processor $2$ can either sends load to processor $1$ or processor +$3$. If it sends load to processor $1$ it will not satisfy condition +(\ref{eq:ping-pong}) because after the sending it will be less loaded that +$x_3^2(t)$. So we consider that the \texttt{ping-pong} condition is probably to +strong. Currently, we did not try to make another convergence proof without this +condition or with a weaker condition. + + \section{Best effort strategy} \label{Best-effort} +\textbf{À traduire} Ordonne les voisins du moins chargé au plus chargé. +Trouve ensuite, en les prenant dans ce ordre, le nombre maximal de +voisins tels que tous ont une charge inférieure à la moyenne des +charges des voisins sélectionnés, et de soi-même. + +Les transferts de charge sont ensuite fait en visant cette moyenne pour +tous les voisins sélectionnés. On envoie une quantité de charge égale +à (moyenne - charge\_du\_voisin). + +~\\\textbf{Question} faut-il décrire les stratégies makhoul et simple ? +\paragraph{simple} Tentative de respecter simplement les conditions de Bertsekas. +Parmi les voisins moins chargés que soi, on sélectionne : +\begin{itemize} +\item un des moins chargés (vmin) ; +\item un des plus chargés (vmax), +\end{itemize} +puis on équilibre avec vmin en s'assurant que notre charge reste +toujours supérieure à celle de vmin et à celle de vmax. + +On envoie donc (avec "self" pour soi-même) : +\[ + \min\left(\frac{load(self) - load(vmin)}{2}, load(self) - load(vmax)\right) +\] + +\paragraph{makhoul} Ordonne les voisins du moins chargé au plus chargé +puis calcule les différences de charge entre soi-même et chacun des +voisins. + +Ensuite, pour chaque voisin, dans l'ordre, et tant qu'on reste plus +chargé que le voisin en question, on lui envoie 1/(N+1) de la +différence calculée au départ, avec N le nombre de voisins. + +C'est l'algorithme~2 dans~\cite{bahi+giersch+makhoul.2008.scalable}. \section{Virtual load} \label{Virtual load} @@ -157,9 +219,74 @@ x_i(t+1)=x_i(t)-\sum_{j\in V(i)} s_{ij}(t) + \sum_{j\in V(i)} r_{ji}(t) \section{Simulations} \label{Simulations} +In order to test and validate our approaches, we wrote a simulator +using the SimGrid +framework~\cite{casanova+legrand+quinson.2008.simgrid}. The process +model is detailed in the next section (\ref{Sim model}), then the +results of the simulations are presented in section~\ref{Results}. + \subsection{Simulation model} +\label{Sim model} + +\begin{verbatim} +Communications +============== + +There are two receiving channels per host: control for information +messages, and data for load transfers. + +Process model +============= + +Each process is made of 3 threads: a receiver thread, a computing +thread, and a load-balancer thread. + +* Receiver thread + --------------- + + Loop + | wait for a message to come, either on data channel, or on ctrl channel + | push received message in a buffer of received messages + | -> ctrl messages on the one side + | -> data messages on the other side + +- + + The loop terminates when a "finalize" message is received on each + channel. + +* Computing thread + ---------------- + + Loop + | if we received some real load, get it (data messages) + | if there is some real load to send, send it + | if we own some load, simulate some computing on it + | sleep a bit if we are looping too fast + +- + send CLOSE on data for all neighbors + wait for CLOSE on data from all neighbors + + The loop terminates when process::still_running() returns false. + (read the source for full details...) + +* Load-balancing thread + --------------------- + + Loop + | call load-balancing algorithm + | send ctrl messages + | sleep (min_lb_iter_duration) + | receive ctrl messages + +- + send CLOSE on ctrl for all neighbors + wait for CLOSE on ctrl from all neighbors + + The loop terminates when process::still_running() returns false. + (read the source for full details...) +\end{verbatim} \subsection{Validation of our approaches} +\label{Results} On veut montrer quoi ? :