+Most of the time, the virtual load scheme has a significant impact compared to
+the simple version of the algorithm with the same configuration. %Sometimes it has no effect but, based on our observations, it has never a negative effect on the load balancing we tested.
+For instance, Figure~\ref{fig.results1} shows that, when the load is initially on one node,
+the average idle times are generally longer with the virtual load
+than with the simple version. This can be explained by the fact that, with virtual load,
+processors exchange all the load they need to exchange as soon as the
+virtual load has been balanced between all the processors. As a consequence, they
+cannot compute at the beginning. This is especially noticeable when
+communications are slow (left part of Figure~\ref{fig.results1}).
+
+\smallskip
+When the load to balance is initially randomly distributed over all nodes, Figure~\ref{fig.resultsN} shows that the effect of virtual load is not significant for the line topology. However, for both the torus and hypercube topologies with CCR = 1/10 (on the left of the figure), virtual load transfers perform significantly better. This is explained by the fact
+that, for small CCR values, the high communication costs play a significant role. The impact of
+communication becomes less important as the CCR increases, since larger CCR values result in smaller communication times. We also tested the impact of CCR values on the performance of each algorithm in terms of idle times. Figures~\ref{fig.results1} and~\ref{fig.resultsN} show that our virtual load scheme achieves
+good average idle times, which are quite close to those of both its own simple version and its direct competitor, the {\it Bertsekas and Tsitsiklis} algorithm. As expected, for coarse grain applications (CCR = 10/1), idle times are close to 0, since processors are busy computing most of the time compared to fine grain applications.
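+
+\smallskip
+To make the two regimes concrete, the CCR can be read as a ratio of per-unit costs. The symbols $t_{\mathrm{comp}}$ and $t_{\mathrm{comm}}$ below are introduced here for illustration only; they are assumed to denote the time needed to compute, respectively to transfer, one unit of load:
+\[
+  \mathrm{CCR} = \frac{t_{\mathrm{comp}}}{t_{\mathrm{comm}}}, \qquad
+  \mathrm{CCR} = \frac{1}{10} \;\Rightarrow\; t_{\mathrm{comm}} = 10\, t_{\mathrm{comp}}, \qquad
+  \mathrm{CCR} = \frac{10}{1} \;\Rightarrow\; t_{\mathrm{comm}} = \frac{t_{\mathrm{comp}}}{10}.
+\]
+Under this reading, moving one unit of load costs ten times more than computing it when CCR = 1/10, and ten times less when CCR = 10/1.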
+
+\smallskip
+Taken as a whole, the results illustrated in Figures~\ref{fig.results1} and~\ref{fig.resultsN} clearly show that our proposal outperforms the {\it Bertsekas and Tsitsiklis} algorithm.
+These results indicate that local load balancing decisions have a significant impact on the global
+convergence time achieved by the compared strategies. This is because, upon load imbalance detection, distributing load unfairly between neighbors severely increases the total number of iterations required by the algorithm before reaching the final stable distribution. The poorer performance of the {\it Bertsekas and Tsitsiklis} algorithm can be explained by the drawbacks of the iterative load balancing policy it adopts for load distribution between neighbors. Neighbors are selected in such a way that the {\it ping-pong} condition holds. As a result, loads are not assigned to neighboring processors in a way that would allow them to be fairly balanced.
+
+\smallskip
+Unlike the {\it Bertsekas and Tsitsiklis} algorithm, our approach is not very sensitive to
+the use of realistic models of computation and communication. This is due to two main features: i) the use of ``virtual load'' transfers, which allow nodes to predict the load they will receive in subsequent iteration steps, and ii) the greedy neighbor selection adopted by our algorithm at each time step of the load balancing process. The involved neighbors are selected in such a way that the load difference between the computational resources is kept as low as possible.
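+
+\smallskip
+One way to picture the first feature is the following sketch, in which the notation is assumed for illustration and is not taken from the algorithm's definition: $x_i(t)$ denotes the load actually held by node $i$ at step $t$, $N(i)$ its set of neighbors, and $s_{ji}(t)$ the load that neighbor $j$ has announced to $i$ but that $i$ has not yet received. The virtual load of node $i$ would then be
+\[
+  \tilde{x}_i(t) = x_i(t) + \sum_{j \in N(i)} s_{ji}(t),
+\]
+so that balancing decisions can be based on $\tilde{x}_i(t)$ rather than on $x_i(t)$ alone.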
+
+\smallskip
+Comparing the results of the extended version (with virtual load) to those of the simple one, we observe in Figs.~\ref{fig.results1} and~\ref{fig.resultsN} that the extended version gives the best performance. It improves the convergence time in most configurations, although, as noted above, idle times can be longer when the load is initially on a single node. This is because, with virtual load transfers, the algorithm greedily seeks to ensure a certain degree of load balancing for processors by taking into account the information about the predicted loads not yet received. Consequently, this reduces the final convergence time of the load balancing process. Similarly, the extended version achieves much better results than the simple one when considering larger platforms, as shown in Figs.~\ref{fig.results1} and~\ref{fig.resultsN}.
+
+\smallskip
+We also find in Figs.~\ref{fig.results1} and~\ref{fig.resultsN} that the performance difference between the extended version of our proposal and its simple version (without virtual load) increases when the CCR increases. This interesting result comes from the fact that larger CCR values correspond to computation-intensive applications on grid platforms. Thus, in order to reduce the convergence time of the load balancing for such applications, it is important to take suitable decisions upon local load imbalance detection. That is why we added the {\it virtual load} transfer scheme to the {\it best effort} strategy, to balance the load of processors as evenly as possible at each step of the load balancing process.
+
+\smallskip
+Finally, it is worth noting from Figures~\ref{fig.results1} and~\ref{fig.resultsN} that the algorithm's convergence time increases with the network's size. We also see that the idle time increases with the size of the network when the load is initially on a single node (Figure~\ref{fig.results1}),
+as expected. In addition, it is interesting to note that, when the number of nodes increases, there is no substantial difference in the increase of the convergence time compared to the simple version without virtual load. This is explained by the fact that the increase in the convergence time is already absorbed by the virtual load transfers between processors, which grow in line with the network's size.
+
+%For the hypercube, in any case, the effect of the virtual load is visible. It is more visible when communications have a more important role (i.e. with the mainly communicating case).
+
+
+%In this case, slight improvement of the max. convergence time. Average convergence time improved,
+%but more time spent idle, especially when communications are expensive.
+
+%\subsubsection{The \besteffort{} strategy with an initial random load
+% distribution, and larger platforms}
+
+%In
+%Same conclusions for line and hypercube.
+%On the torus, BE performs very badly when communications are expensive.
+
+%\FIXME{remove the 1024 ones?}
+
+%\subsubsection{With the virtual load extension with an initial random load
+% distribution}
+
+%Either it is equivalent, or we gain -> especially when communications are expensive and
+%there are many neighbors.