\begin{frontmatter}
-\journal{Parallel Computing}
+\journal{Journal of Computational Science}
\title{Best effort strategy and virtual load for\\
asynchronous iterative load balancing}
In order to reduce this effect, the ability to level the amount of load to send is added.
The idea here is to make as few steps as possible toward the equilibrium, such that a
potentially unsuitable decision, as pointed out above, has a lower impact on the local equilibrium.
-A weighting system parameter $k$ is introduced to orchestrate the right balance between the topology structure and the computation to communication ratios (CCR) values of the deployed application. Indeed, to speedup the convergence time of the load balancing process, one is faced with a difficult trade-off to choose an appropriate amount of load to send between node neighbors upon load imbalance detection. On the one hand, if $k$ is small, faster convergence times are expected for sparsely connected applications and large CCR values. On the other hand, for strongly connected applications and small CCR values, a large value of $k$ will enable us to better balance the load locally and therefore minimize the number of iterations toward the global equilibrium. In the experiments section (Section~\ref{sec.results}), it can be observed that choosing $k$ in 1,2 or 4, leads to good results for the considered CCR values and the targeted topology structures.
+A weighting system parameter $k$ is introduced to orchestrate the right balance between the topology structure and the computation-to-communication ratio (CCR) values of the deployed application. Indeed, to speed up the convergence of the load balancing process, one is faced with a difficult trade-off in choosing an appropriate amount of load to send between neighboring nodes upon load imbalance detection. On the one hand, if $k$ is small, faster convergence times are expected for sparsely connected applications and large CCR values. On the other hand, for strongly connected applications and small CCR values, a large value of $k$ will enable us to better balance the load locally and therefore minimize the number of iterations toward the global equilibrium. In the experiments section (Section~\ref{sec.results}), it can be observed that choosing $k$ in $\{1, 2, 4\}$ leads to good results for the considered CCR values and the targeted topology structures.
So the amount of data to send is then $s_{ij}(t) = (\bar{x} - x^i_j(t))/k$.
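As a purely illustrative example, with hypothetical values that are not taken from the experiments, assume $\bar{x} = 12$ and $x^i_j(t) = 4$, so that the difference to level is $\bar{x} - x^i_j(t) = 8$. For the three values of $k$ mentioned above, the formula gives
\[
k = 1:\ s_{ij}(t) = \frac{8}{1} = 8, \qquad
k = 2:\ s_{ij}(t) = \frac{8}{2} = 4, \qquad
k = 4:\ s_{ij}(t) = \frac{8}{4} = 2.
\]
With $k = 1$ the whole difference is sent at once, whereas larger values of $k$ send only a fraction of it, which limits the impact of a single unsuitable decision.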