diff --git a/loba-besteffort/loba-besteffort.tex b/loba-besteffort/loba-besteffort.tex

index 5570b359a51ad754360f6c06cf66c6ec042684f2..00cb3c5fef58bedbad7e648dd8284db3cc1d89bc 100644
--- a/loba-besteffort/loba-besteffort.tex
+++ b/loba-besteffort/loba-besteffort.tex
@@ -653,7 +653,7 @@ With these constraints in mind, we defined the following metrics:
  \label{sec.results}
  
  In this section, the results for the different simulations will be presented,
-and we'll try to explain our observations.
+and we will try to explain our observations.
  
  \subsubsection{Cluster vs grid platforms}
  
@@ -724,57 +724,84 @@ allocated time, or because we simply decided not to run it.
  
  \FIXME{annoncer le plan de la suite}
  
-\subsubsection{The \besteffort{} strategy with the load initially on only one
-  node}
+\subsubsection{The \besteffort{} and \makhoul{} strategies without virtual load}
  
-Before looking at the different variations, we'll first show that the plain
-\besteffort{} strategy is valuable, and may be as good as the \makhoul{}
-strategy.  On the graphs from the figure~\ref{fig.results1}, these strategies
-are respectively labeled ``b'' and ``a''.
+Before looking at the different variations, we will first show that the plain
+\besteffort{} strategy is valuable, and may be as good as the \makhoul{}
+strategy.  In Figures~\ref{fig.results1} and~\ref{fig.resultsN},
+these strategies are labeled ``b'' and ``a'', respectively.
  
-We can see that the relative performance of these strategies is mainly
-influenced by the application topology.  It's for the line topology that the
-difference is the more important.  In this case, the \besteffort{} strategy is
-nearly twice as fast as the \makhoul{} strategy.
+We can see that the relative performance of these strategies is mainly
+influenced by the application topology.  The difference is largest for the
+line topology.  In this case, the \besteffort{} strategy is
+faster than the \makhoul{} strategy.  This can be explained by the
+fact that the \besteffort{} strategy tries to distribute the load fairly between
+all the nodes, and with the line topology it is easy to balance the load
+fairly.
  
  On the contrary, for the hypercube topology, the \besteffort{} strategy performs
-worse than the \makhoul{} strategy.
+worse than the \makhoul{} strategy.  In this case, the \makhoul{} strategy, which
+tries to give more load to a few neighbors, reaches equilibrium faster.
  
-Finally, the results are more nuanced for the torus topology.
+For the torus topology, whose number of links lies between those of the line and
+the hypercube, the \makhoul{} strategy is slightly better, but the difference is
+more nuanced when the initial load is on only one node.  The only case where the
+\makhoul{} strategy is really faster than the \besteffort{} strategy is with the
+random initial distribution when communications are slow.
  
-This can be explained by ...
+Overall, the number of interconnection links is very important.  The more
+interconnection links there are, the faster the \makhoul{} strategy is, because
+it quickly distributes a significant amount of load, even if unfairly, between
+all the neighbors.  In contrast, the \besteffort{} strategy distributes the
+load fairly, so it is better suited to weakly connected topologies.
  
--> interconnection
  
-plus c'est connecté, moins bon est BE car à vouloir trop bien équilibrer
-localement, le processeurs se perturbent mutuellement.  Du coup, makhoul qui
-équilibre moins bien localement est moins perturbé par ces interférences.
+\subsubsection{Virtual load}
  
-\subsubsection{With the virtual load extension with the load initially on only
-  one node}
+Most of the time, the virtual load has a significant influence compared to
+the same configuration without it.  Sometimes it has no effect, but {\bf TO BE
+  CHECKED} it never has a negative effect on the load balancing we tested.
  
-Dans ce cas légère amélioration de la cvg. max.  Temps moyen de cvg. amélioré,
-mais plus de temps passé en idle, surtout quand les comms coutent cher.
+In Figure~\ref{fig.results1}, when the load is initially on one node, it can be
+noticed that the average idle times are generally longer with the virtual load
+than without it.  This can be explained by the fact that, with virtual load,
+processors exchange all the load they need to exchange as soon as the
+virtual load has been balanced between all the processors.  Consequently, they
+cannot compute at the beginning.  This is especially noticeable when
+communications are slow (on the left part of Figure~\ref{fig.results1}).
  
-\subsubsection{The \besteffort{} strategy with an initial random load
-  distribution, and larger platforms}
+%In this case, slight improvement of the max. convergence.  Average convergence time improved,
+%but more time spent idle, especially when communications are expensive.
  
-Mêmes conclusions pour line et hcube.
-Sur tore, BE se fait exploser quand les comms coutent cher.
+%\subsubsection{The \besteffort{} strategy with an initial random load
+%  distribution, and larger platforms}
  
-\FIXME{virer les 1024 ?}
+%In 
+%Same conclusions for line and hcube.
+%On the torus, BE is badly beaten when communications are expensive.
  
-\subsubsection{With the virtual load extension with an initial random load
-  distribution}
+%\FIXME{remove the 1024 ones?}
  
-Soit c'est équivalent, soit on gagne -> surtout quand les comms coutent cher et
-qu'il y a beaucoup de voisins.
+%\subsubsection{With the virtual load extension with an initial random load
+%  distribution}
+
+%Either it is equivalent, or we gain -> especially when communications are expensive and
+%there are many neighbors.
  
  \subsubsection{The $k$ parameter}
  \label{results-k}
  
-Dans le cas où les comms coutent cher et ou BE se fait avoir, on peut ameliorer
-les perfs avec le param k.
+As explained previously, when communications are slow the \besteffort{}
+strategy is less efficient.  This is because it tries to balance the load
+fairly, and consequently a significant amount of load is transferred between
+processors.  In this situation, it is possible to reduce the convergence time by
+using the leveler parameter (parameter $k$).  This
+solution is particularly effective when the initial load is randomly distributed
+on the nodes, with the torus and hypercube topologies and slow communications.  When
+the virtual load mechanism is used, the effect of this parameter is also visible
+under the same conditions.
+
+
  
  \subsubsection{With integer load, 1 ou N}