From 77dfff6971a1d64b3274379ad340410840d24980 Mon Sep 17 00:00:00 2001
From: Arnaud Giersch <arnaud.giersch@iut-bm.univ-fcomte.fr>
Date: Wed, 1 Jun 2011 17:07:32 +0200
Subject: [PATCH 1/1] Simulation model.

---
 supercomp11/biblio.bib      |   9 +++
 supercomp11/supercomp11.tex | 118 ++++++++++++++++++------------------
 2 files changed, 68 insertions(+), 59 deletions(-)

diff --git a/supercomp11/biblio.bib b/supercomp11/biblio.bib
index 164500d..ddec786 100644
--- a/supercomp11/biblio.bib
+++ b/supercomp11/biblio.bib
@@ -71,3 +71,12 @@
   pages = "305--320",
   year = 	"2002",
 }
+
+@Article{sutter.2008.writing,
+  author =       {Sutter, Herb},
+  title =        {Writing Lock-Free Code: A Corrected Queue},
+  journal =      {Dr. Dobb's Journal},
+  year =         2008,
+  month =        sep,
+  url =          {http://www.drdobbs.com/high-performance-computing/210604448},
+}
diff --git a/supercomp11/supercomp11.tex b/supercomp11/supercomp11.tex
index 2a6c04b..b0d23b0 100644
--- a/supercomp11/supercomp11.tex
+++ b/supercomp11/supercomp11.tex
@@ -270,69 +270,69 @@ C'est l'algorithme~2 dans~\cite{bahi+giersch+makhoul.2008.scalable}.
 
 In order to test and validate our approaches, we wrote a simulator
 using the SimGrid
-framework~\cite{casanova+legrand+quinson.2008.simgrid}.  The process
-model is detailed in the next section (\ref{Sim model}), then the
-results of the simulations are presented in section~\ref{Results}.
+framework~\cite{casanova+legrand+quinson.2008.simgrid}.  This
+simulator, which consists of about 2,700 lines of C++, allows to run
+the different load-balancing strategies under various parameters, such
+as the initial distribution of load, the interconnection topology, the
+characteristics of the running platform, etc.  Then several metrics
+are issued that permit to compare the strategies.
+
+The simulation model is detailed in the next section (\ref{Sim
+  model}), then the results of the simulations are presented in
+section~\ref{Results}.
 
 \subsection{Simulation model}
 \label{Sim model}
 
-\begin{verbatim}
-Communications
-==============
-
-There are two receiving channels per host: control for information
-messages, and data for load transfers.
-
-Process model
-=============
-
-Each process is made of 3 threads: a receiver thread, a computing
-thread, and a load-balancer thread.
-
-* Receiver thread
-  ---------------
-
-    Loop
-    | wait for a message to come, either on data channel, or on ctrl channel
-    | push received message in a buffer of received messages
-    | -> ctrl messages on the one side
-    | -> data messages on the other side
-    +-
-
-   The loop terminates when a "finalize" message is received on each
-   channel.
-
-* Computing thread
-  ----------------
-
-    Loop
-    | if we received some real load, get it (data messages)
-    | if there is some real load to send, send it
-    | if we own some load, simulate some computing on it
-    | sleep a bit if we are looping too fast
-    +-
-    send CLOSE on data for all neighbors
-    wait for CLOSE on data from all neighbors
-
-  The loop terminates when process::still_running() returns false.
-  (read the source for full details...)
-
-* Load-balancing thread
-  ---------------------
-
-    Loop
-    | call load-balancing algorithm
-    | send ctrl messages
-    | sleep (min_lb_iter_duration)
-    | receive ctrl messages
-    +-
-    send CLOSE on ctrl for all neighbors
-    wait for CLOSE on ctrl from all neighbors
+In the simulation model the processors exchange messages which are of
+two kinds.  First, there are \emph{control messages} which only carry
+information that is exchanged between the processors, such as the
+current load, or the virtual load transfers if this option is
+selected.  These messages are rather small, and their size is
+constant.  Then, there are \emph{data messages} that carry the real
+load transferred between the processors.  The size of a data message
+is a function of the amount of load that it carries, and it can be
+pretty large.  In order to receive the messages, each processor has
+two receiving channels, one for each kind of messages.  Finally, when
+a message is sent or received, this is done by using the non-blocking
+primitives of SimGrid\footnote{That are \texttt{MSG\_task\_isend()},
+  and \texttt{MSG\_task\_irecv()}.}.
+
+During the simulation, each processor concurrently runs three threads:
+a \emph{receiving thread}, a \emph{computing thread}, and a
+\emph{load-balancing thread}, which we will briefly describe now.
+
+\paragraph{Receiving thread} The receiving thread is in charge of
+waiting for messages to come, either on the control channel, or on the
+data channel.  When a message is received, it is pushed in a buffer of
+received message, to be later consumed by one of the other threads.
+There are two such buffers, one for the control messages, and one for
+the data messages.  The buffers are implemented with a lock-free FIFO
+\cite{sutter.2008.writing} to avoid contention between the threads.
+
+\paragraph{Computing thread} The computing thread is in charge of the
+real load management.  It iteratively runs the following operations:
+\begin{itemize}
+\item if some load was received from the neighbors, get it;
+\item if there is some load to send to the neighbors, send it;
+\item run some computation, whose duration is function of the current
+  load of the processor.
+\end{itemize}
+Practically, after the computation, the computing thread waits for a
+small amount of time if the iterations are looping too fast (for
+example, when the current load is zero).
 
-  The loop terminates when process::still_running() returns false.
-  (read the source for full details...)
-\end{verbatim}
+\paragraph{Load-balancing thread} The load-balancing thread is in
+charge of running the load-balancing algorithm, and exchange the
+control messages.  It iteratively runs the following operations:
+\begin{itemize}
+\item get the control messages that were received from the neighbors;
+\item run the load-balancing algorithm;
+\item send control messages to the neighbors, to inform them of the
+  processor's current load, and possibly of virtual load transfers;
+\item wait a minimum (configurable) amount of time, to avoid to
+  iterate too fast.
+\end{itemize}
 
 \subsection{Validation of our approaches}
 \label{Results}
@@ -381,4 +381,4 @@ Taille : 10 100 trÃ¨s gros
 
 % LocalWords:  RaphaÃ«l Couturier Arnaud Giersch Abderrahmane Sider Franche ij
 % LocalWords:  Bertsekas Tsitsiklis SimGrid DASUD ComtÃ© BÃ©jaÃ¯a asynchronism ji
-% LocalWords:  ik
+% LocalWords:  ik isend irecv
-- 
2.39.5