X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/loba-papers.git/blobdiff_plain/66aff7d1fa57fdba5f70f086fd0ca55052fa57e8..7f3a86f1595151f314547e4d7166f7d51df418f0:/supercomp11/supercomp11.tex?ds=inline

diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex
index 93a9098..36964d1 100644
--- a/supercomp11/supercomp11.tex
+++ b/supercomp11/supercomp11.tex
@@ -364,13 +364,25 @@ During the simulation, each processor concurrently runs three threads:
 a \emph{receiving thread}, a \emph{computing thread}, and a
 \emph{load-balancing thread}, which we will briefly describe now.
 
-\paragraph{Receiving thread} The receiving thread is in charge of
-waiting for messages to come, either on the control channel, or on the
-data channel.  Its behavior is sketched by Algorithm~\ref{algo.recv}.
-When a message is received, it is pushed in a buffer of
-received message, to be later consumed by one of the other threads.
-There are two such buffers, one for the control messages, and one for
-the data messages.  The buffers are implemented with a lock-free FIFO
+For the sake of simplicity, a few details were voluntary omitted from
+these descriptions.  For an exhaustive presentation, we refer to the
+actual source code that was used for the experiments%
+\footnote{As mentioned before, our simulator relies on the SimGrid
+  framework~\cite{casanova+legrand+quinson.2008.simgrid}.  For the
+  experiments, we used a pre-release of SimGrid 3.7 (Git commit
+  67d62fca5bdee96f590c942b50021cdde5ce0c07, available from
+  \url{https://gforge.inria.fr/scm/?group_id=12})}, and which is
+available at
+\url{http://info.iut-bm.univ-fcomte.fr/staff/giersch/software/loba.tar.gz}.
+
+\subsubsection{Receiving thread}
+
+The receiving thread is in charge of waiting for messages to come, either on the
+control channel, or on the data channel.  Its behavior is sketched by
+Algorithm~\ref{algo.recv}.  When a message is received, it is pushed in a buffer
+of received message, to be later consumed by one of the other threads.  There
+are two such buffers, one for the control messages, and one for the data
+messages.  The buffers are implemented with a lock-free FIFO
 \cite{sutter.2008.writing} to avoid contention between the threads.
 
 \begin{algorithm}
@@ -395,9 +407,10 @@ the data messages.  The buffers are implemented with a lock-free FIFO
   }
 \end{algorithm}
 
-\paragraph{Computing thread} The computing thread is in charge of the
-real load management.  As exposed in Algorithm~\ref{algo.comp}, it
-iteratively runs the following operations:
+\subsubsection{Computing thread}
+
+The computing thread is in charge of the real load management.  As exposed in
+Algorithm~\ref{algo.comp}, it iteratively runs the following operations:
 \begin{itemize}
 \item if some load was received from the neighbors, get it;
 \item if there is some load to send to the neighbors, send it;
@@ -436,10 +449,11 @@ example, when the current load is near zero).
   }
 \end{algorithm}
 
-\paragraph{Load-balancing thread} The load-balancing thread is in
-charge of running the load-balancing algorithm, and exchange the
-control messages.  As shown in Algorithm~\ref{algo.lb}, it iteratively
-runs the following operations:
+\subsubsection{Load-balancing thread}
+
+The load-balancing thread is in charge of running the load-balancing algorithm,
+and exchange the control messages.  As shown in Algorithm~\ref{algo.lb}, it
+iteratively runs the following operations:
 \begin{itemize}
 \item get the control messages that were received from the neighbors;
 \item run the load-balancing algorithm;
@@ -466,19 +480,7 @@ runs the following operations:
   }
 \end{algorithm}
 
-\paragraph{}
-For the sake of simplicity, a few details were voluntary omitted from
-these descriptions.  For an exhaustive presentation, we refer to the
-actual source code that was used for the experiments%
-\footnote{As mentioned before, our simulator relies on the SimGrid
-  framework~\cite{casanova+legrand+quinson.2008.simgrid}.  For the
-  experiments, we used a pre-release of SimGrid 3.7 (Git commit
-  67d62fca5bdee96f590c942b50021cdde5ce0c07, available from
-  \url{https://gforge.inria.fr/scm/?group_id=12})}, and which is
-available at
-\url{http://info.iut-bm.univ-fcomte.fr/staff/giersch/software/loba.tar.gz}.
-
-\FIXME{ajouter des dÃ©tails sur la gestion de la charge virtuelle ?
+\paragraph{}\FIXME{ajouter des dÃ©tails sur la gestion de la charge virtuelle ?
 par ex, donner l'idÃ©e gÃ©nÃ©rale de l'implÃ©mentation.  l'idÃ©e gÃ©nÃ©rale est dÃ©ja dÃ©crite en section~\ref{Virtual load}}
 
 \subsection{Experimental contexts}
@@ -488,7 +490,7 @@ In order to assess the performances of our algorithms, we ran our
 simulator with various parameters, and extracted several metrics, that
 we will describe in this section.
 
-\paragraph{Load balancing strategies}
+\subsubsection{Load balancing strategies}
 
 Several load balancing strategies were compared.  We ran the experiments with
 the \emph{Best effort}, and with the \emph{Makhoul} strategies.  \emph{Best
@@ -507,7 +509,7 @@ To summarize the different load balancing strategies, we have:
 %
 This gives us as many as $4\times 2\times 2 = 16$ different strategies.
 
-\paragraph{End of the simulation}
+\subsubsection{End of the simulation}
 
 The simulations were run until the load was nearly balanced among the
 participating nodes.  More precisely the simulation stops when each node holds
@@ -520,7 +522,7 @@ real application we would have chosen a decentralized convergence detection
 algorithm, like the one described by Bahi, Contassot-Vivier, Couturier, and
 Vernier in \cite{10.1109/TPDS.2005.2}.
 
-\paragraph{Platforms}
+\subsubsection{Platforms}
 
 In order to show the behavior of the different strategies in different
 settings, we simulated the executions on two sorts of platforms.  These two
@@ -546,7 +548,7 @@ processor speeds were normalized, and we arbitrarily chose to fix them to
 Then we derived each sort of platform with four different number of computing
 nodes: 16, 64, 256, and 1024 nodes.
 
-\paragraph{Configurations}
+\subsubsection{Configurations}
 
 The distributed processes of the application were then logically organized along
 three possible topologies: a line, a torus or an hypercube.  We ran tests where
@@ -589,7 +591,7 @@ time.
 Anyway, all these the experiments represent more than 240 hours of computing
 time.
 
-\paragraph{Metrics}
+\subsubsection{Metrics}
 
 In order to evaluate and compare the different load balancing strategies we had
 to define several metrics.  Our goal, when choosing these metrics, was to have