corrections

diff --git a/mpi-energy2-extension/Heter_paper.tex b/mpi-energy2-extension/Heter_paper.tex
index 329d52665d9659a659f1ff1ccc978bbe67aabd6e..9108cb1280e4f0c9757839a22b1761a96d5bc9e5 100644
--- a/mpi-energy2-extension/Heter_paper.tex
+++ b/mpi-energy2-extension/Heter_paper.tex
@@ -392,7 +392,7 @@ where $N$ is the number of  clusters in the grid, $M_i$ is the number of  nodes
  and $\Tcm[hj]$ is the communication time of processor $j$ in the cluster $h$ during the 
  first  iteration.  The execution time for one iteration is equal to the sum of the maximum computation time for all nodes with the new scaling factors
  and the  communication time of the slowest node without slack time during one iteration.
- The slowest node $h$ is the node which takes the  maximum execution time to execute an iteration  before scaling down its  frequency.
+The slowest node in cluster $h$ is the node which takes the  maximum execution time to execute an iteration  before scaling down its  frequency.
  It means that only the communication time without any slack time is taken into account.
  Therefore, the execution time of the  application is equal to
  the execution time of one iteration as in Equation (\ref{eq:perf}) multiplied by the
@@ -512,7 +512,7 @@ static energies for $M_i$ processors in $N$ clusters.  It is computed as follows
   E = \sum_{i=1}^{N} \sum_{i=1}^{M_i} {(S_{ij}^{-2} \cdot \Pd[ij] \cdot  \Tcp[ij])} +  
   \sum_{i=1}^{N} \sum_{j=1}^{M_i} (\Ps[ij] \cdot {} \\
    (\mathop{\max_{i=1,\dots N}}_{j=1,\dots,M_i}({\Tcp[ij]} \cdot S_{ij}) 
-  +\mathop{\min_{j=1,\dots M_i}} (\Tcm[hj]) ))
+  +\mathop{\min_{j=1,\dots M_h}} (\Tcm[hj]) ))
  \end{multline}
  
  
@@ -596,13 +596,13 @@ computed as in (\ref{eq:eorginal}).
  While the main goal is to optimize the energy and execution time at the same
  time, the normalized energy and execution time curves do not evolve (increase/decrease) in the same way. 
  According to (\ref{eq:pnorm}) and (\ref{eq:enorm}), the
-vector of frequency scaling factors $S_1,S_2,\dots,S_N$ reduces both the energy
+vector of frequency scaling factors $S_{11},S_{12},\dots,S_{NM_i}$ reduces both the energy
  and the execution time,  but the main objective is to produce
  maximum energy reduction with minimum execution time reduction.
  
  This problem can be solved by making the optimization process for energy and
  execution time follow the same evolution according to the vector of scaling factors
-$(S_{11}, S_{12},\dots, S_{NM})$. Therefore, the equation of the
+$(S_{11}, S_{12},\dots, S_{NM_i})$. Therefore, the equation of the
  normalized execution time is inverted which gives the normalized performance
  equation, as follows:
  \begin{equation}
@@ -1033,7 +1033,7 @@ nodes when the communications occur in high speed network does not decrease the
  communication ratio. 
  
  The performance degradation percentage of the EP benchmark after applying the scaling factors selection algorithm is the highest in comparison to 
-the other benchmarks. Indeed, in the EP benchmark, there are no communication and slack times and its 
+the other benchmarks. Indeed, in the EP benchmark, there are no communication and no slack times and its 
  performance degradation percentage only depends on the frequencies values selected by the algorithm for the computing nodes.
  The rest of the benchmarks showed different performance degradation percentages which decrease
  when the communication times increase and vice versa.
@@ -1098,7 +1098,7 @@ Scenario name                          & Cluster name & Nodes per cluster &
  
  The execution times for most of  the NAS  benchmarks are higher over the multi-core per node scenario 
  than over the single core per node  scenario. Indeed,  
- the communication times  are higher in the one site multi-core scenario than in the latter scenario because all the cores of a node  share  the same node network link which can be  saturated when running communication bound applications. Moreover, the cores of a node share the memory bus which can be also saturated and become a bottleneck.    
+ the communication times  are higher in the  multi-core scenario than in the latter scenario because all the cores of a node  share  the same node network link which can be  saturated when running communication bound applications. Moreover, the cores of a node share the memory bus which can be also saturated and become a bottleneck.    
  Moreover, the energy consumptions of the NAS benchmarks are lower over the 
   one core scenario  than over the multi-core scenario because 
  the first scenario had less execution time than the latter which results in less static energy being consumed.
@@ -1266,7 +1266,8 @@ the global convergence of the iterative system. Finally, it would be interesting
  \section*{Acknowledgment}
  
  This work  has been  partially supported by  the Labex ACTION  project (contract
-``ANR-11-LABX-01-01'').  Computations  have been performed  on the Grid'5000 platform. As  a  PhD student,
+``ANR-11-LABX-01-01'').  Computations  have been performed  on the Grid'5000
+platform and on the mésocentre of Franche-Comté. As  a  PhD student,
  Mr. Ahmed  Fanfakh, would  like to  thank the University  of Babylon  (Iraq) for
  supporting his work.
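
Reviewer note on the second hunk: the energy expression in the `multline` environment can be checked numerically with a small script. The sketch below is illustrative only and is not part of the patch. It assumes the inner sum of the dynamic term runs over $j$ (the nodes of cluster $i$), and that `h` is the index of the cluster containing the slowest node. The function name `predicted_energy` and the nested-list layout of the inputs are hypothetical choices made for this example.

```python
def predicted_energy(S, Pd, Ps, Tcp, Tcm, h):
    """Evaluate the energy model of the patched multline equation.

    All inputs except h are lists of per-cluster lists, indexed [i][j]
    (cluster i, node j); this layout is an assumption of the sketch.
    S   : frequency scaling factors S_ij
    Pd  : dynamic powers Pd_ij
    Ps  : static powers Ps_ij
    Tcp : computation times Tcp_ij of the first iteration
    Tcm : communication times Tcm_ij of the first iteration
    h   : index of the cluster that contains the slowest node (assumed given)
    """
    # Dynamic energy: sum over all nodes of S_ij^-2 * Pd_ij * Tcp_ij.
    dynamic = sum(
        s ** -2 * pd * tcp
        for S_i, Pd_i, Tcp_i in zip(S, Pd, Tcp)
        for s, pd, tcp in zip(S_i, Pd_i, Tcp_i)
    )

    # Predicted iteration time: the largest scaled computation time over
    # all nodes, plus the smallest communication time inside cluster h
    # (the min runs over j = 1..M_h, as in the corrected line).
    t_iter = max(
        tcp * s
        for S_i, Tcp_i in zip(S, Tcp)
        for s, tcp in zip(S_i, Tcp_i)
    ) + min(Tcm[h])

    # Static energy: every node draws its static power for the whole iteration.
    static = sum(ps for Ps_i in Ps for ps in Ps_i) * t_iter

    return dynamic + static
```

For instance, with two clusters of sizes 2 and 1, `min(Tcm[h])` only ever reads the `M_h` entries of cluster `h`, which is why the bound of the minimum has to be `M_h` rather than `M_i`.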