Describe parameter k.

[loba-papers.git] / supercomp11 / supercomp11.tex
diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex

index a26ad3ac45f4e5cce13675b92b6e21ba2dee3ad7..2f2dd31bc46905d59c46caae3c0a96782fc50c34 100644 (file)
--- a/supercomp11/supercomp11.tex
+++ b/supercomp11/supercomp11.tex
@@ -195,11 +195,16 @@ condition or with a weaker condition.
  \section{Best effort strategy}
  \label{Best-effort}
  
  \section{Best effort strategy}
  \label{Best-effort}
  
-In this section we  describe  a new load-balancing strategy that we call
-\emph{best effort}.  The general idea behind this strategy is that each
-processor, that detects it has more load than some of its neighbors, 
-sends some load to the most of its less loaded neighbors, doing its
-best to reach the equilibrium between those neighbors and himself.
+In this section we describe a new load-balancing strategy that we call
+\emph{best effort}.  First, we explain the general idea behind this strategy,
+and then we describe some variants of this basic strategy.
+
+\subsection{Basic strategy}
+
+The general idea behind the \emph{best effort} strategy is that each processor,
+that detects it has more load than some of its neighbors, sends some load to the
+most of its less loaded neighbors, doing its best to reach the equilibrium
+between those neighbors and himself.
  
  More precisely, when a processor $i$ is in its load-balancing phase,
  he proceeds as following.
  
  More precisely, when a processor $i$ is in its load-balancing phase,
  he proceeds as following.
@@ -246,12 +251,28 @@ he proceeds as following.
    \end{equation*}
  \end{enumerate}
  
    \end{equation*}
  \end{enumerate}
  
-\FIXME{describe parameter $k$}
+\subsection{Leveling the amount to send}
+
+With the aforementioned basic strategy, each node does its best to reach the
+equilibrium with its neighbors.  Since each node may be taking the same kind of
+decision at the same moment, there is the risk that a node receives load from
+several of its neighbors, and then is temporary going off the equilibrium state.
+This is particularly true with strongly connected applications.
+
+In order to reduce this effect, we add the ability to level the amount to send.
+The idea, here, is to make smaller steps toward the equilibrium, such as a
+potentially wrong decision has a lower impact.
+
+Concretely, once $s_{ij}$ has been evaluated as before, it is simply divided by
+some configurable factor.  That's what we named the ``parameter $k$'' in
+Section~\ref{Results}.  The amount of data to send is then $s_{ij}(t) = (\bar{x}
+- x^i_j(t))/k$.
+\FIXME{check the name ($k$)}
  
  \section{Other strategies}
  \label{Other}
  
  
  \section{Other strategies}
  \label{Other}
  
-\FIXME{Réécrire en angliche.}
+\FIXME{Réécrire en anglais.}
  
  % \FIXME{faut-il décrire les stratégies makhoul et simple ?}
  
  
  % \FIXME{faut-il décrire les stratégies makhoul et simple ?}
  
@@ -497,8 +518,9 @@ an amount of load at less than 1\% of the load average, during an arbitrary
  number of computing iterations (2000 in our case).
  
  Note that this convergence detection was implemented in a centralized manner.
  number of computing iterations (2000 in our case).
  
  Note that this convergence detection was implemented in a centralized manner.
-This is easy to do within the simulator, but it's obviously not realistic.  In
-a real application we would have chosen a decentralized convergence detection algorithm, like the one described in \cite{10.1109/TPDS.2005.2}.
+This is easy to do within the simulator, but it's obviously not realistic.  In a
+real application we would have chosen a decentralized convergence detection
+algorithm, like the one described in \cite{10.1109/TPDS.2005.2}.
  
  \paragraph{Platforms}
  
  
  \paragraph{Platforms}
  
@@ -507,16 +529,21 @@ settings, we simulated the executions on two sorts of platforms.  These two
  sorts of platforms differ by their underlaid network topology.  On the one hand,
  we have homogeneous platforms, modeled as a cluster.  On the other hand, we have
  heterogeneous platforms, modeled as the interconnection of a number of clusters.
  sorts of platforms differ by their underlaid network topology.  On the one hand,
  we have homogeneous platforms, modeled as a cluster.  On the other hand, we have
  heterogeneous platforms, modeled as the interconnection of a number of clusters.
+
+The clusters were modeled by a fixed number of computing nodes interconnected
+through a backbone link.  Each computing node has a computing power of
+1~GFlop/s, and is connected to the backbone by a network link whose bandwidth is
+of 125~MB/s, with a latency of 50~$\mu$s.  The backbone has a network bandwidth
+of 2.25~GB/s, with a latency of 500~$\mu$s.
+
  The heterogeneous platform descriptions were created by taking a subset of the
  Grid'5000 infrastructure\footnote{Grid'5000 is a French large scale experimental
    Grid (see \url{https://www.grid5000.fr/}).}, as described in the platform file
  \texttt{g5k.xml} distributed with SimGrid.  Note that the heterogeneity of the
  The heterogeneous platform descriptions were created by taking a subset of the
  Grid'5000 infrastructure\footnote{Grid'5000 is a French large scale experimental
    Grid (see \url{https://www.grid5000.fr/}).}, as described in the platform file
  \texttt{g5k.xml} distributed with SimGrid.  Note that the heterogeneity of the
-platform only comes from the network topology.  The processor speeds, and
-network bandwidths were normalized since our algorithms currently are not aware
-of such heterogeneity.  We arbitrarily chose to fix the processor speed to
-1~GFlop/s, and the network bandwidth to 125~MB/s, with a latency of 50~$\mu$s,
-except for the links between geographically distant sites, where the network
-bandwidth was fixed to 2.25~GB/s, with a latency of 500~$\mu$s.
+platform here only comes from the network topology.  Indeed, since our
+algorithms currently do not handle heterogeneous computing resources, the
+processor speeds were normalized, and we arbitrarily chose to fix them to
+1~GFlop/s.
  
  Then we derived each sort of platform with four different number of computing
  nodes: 16, 64, 256, and 1024 nodes.
  
  Then we derived each sort of platform with four different number of computing
  nodes: 16, 64, 256, and 1024 nodes.
@@ -657,4 +684,4 @@ Taille : 10 100 très gros
  
  % LocalWords:  Raphaël Couturier Arnaud Giersch Abderrahmane Sider Franche ij
  % LocalWords:  Bertsekas Tsitsiklis SimGrid DASUD Comté Béjaïa asynchronism ji
  
  % LocalWords:  Raphaël Couturier Arnaud Giersch Abderrahmane Sider Franche ij
  % LocalWords:  Bertsekas Tsitsiklis SimGrid DASUD Comté Béjaïa asynchronism ji
-% LocalWords:  ik isend irecv Cortés et al chan ctrl fifo Makhoul
+% LocalWords:  ik isend irecv Cortés et al chan ctrl fifo Makhoul GFlop xml