Correct fax number, add telephone number.

diff --git a/paper.tex b/paper.tex
index e1a6c6aaa68cd7992f69c69d590ef40ec03f1327..ffed39e33e16d77acae71a3fbad4ee2aade0cb56 100644
--- a/paper.tex
+++ b/paper.tex
@@ -36,7 +36,8 @@
      FEMTO-ST Institute\\
      University of Franche-Comté\\
      IUT de Belfort-Montbéliard, 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France\\
-   % Fax  : +33~3~84~58~77~32\\
+    % Telephone: \mbox{+33 3 84 58 77 86}, % Raphaël
+    % Fax: \mbox{+33 3 84 58 77 81}\\      % Dept Info
      Email: \email{{jean-claude.charr,raphael.couturier,ahmed.fanfakh_badri_muslim,arnaud.giersch}@univ-fcomte.fr}
     }
    }
@@ -118,7 +119,6 @@ we conclude in Section~\ref{sec.concl} with a summary and some future works.
  \section{Related works}
  \label{sec.relwork}
  
-\AG{Consider introducing the models (sec.~\ref{sec.exe}) before related works}
  
  In this section, some heuristics to compute the scaling factor are
  presented and classified into two categories: offline and online methods.
@@ -133,7 +133,7 @@ values could be computed based on information retrieved by analyzing the code of
  the program and the computing system that will execute it. In~\cite{40},
  Azevedo et
  al. detect during compilation the dependency points between  
-tasks in a parallel program. This information is then used to lower the frequency of
+tasks in a multi-task program. This information is then used to lower the frequency of
  some processors in order to eliminate slack times. A slack time is the period of time during which a processor that has already finished its computation has to wait
  for a set of processors to finish their computations and send their results to the
  waiting processor in order to continue its task that is
@@ -156,7 +156,7 @@ To maintain the performance of the parallel program , they
  set the  processor with the biggest load to the highest gear and then compute the scaling  factor values for the rest of the processors. Although this model was built for parallel architectures, it can be adapted  to distributed architectures by taking into account the communications. 
  The primary contribution of our paper is a new online scaling factor selection method which has the following characteristics:
  \begin{enumerate}
-\item It is based on Rauber and Rünger analytical model to predict the energy consumption and the execution time of the application with different frequency gears. 
+\item It is based on Rauber and Rünger's analytical model to predict the energy consumption of the application with different frequency gears.
  \item It selects the frequency scaling factor that simultaneously optimizes energy reduction and maintains performance.
  \item It is well adapted to distributed architectures because it takes into account the communication time.
  \item It is well adapted to distributed applications with imbalanced tasks.
@@ -269,7 +269,6 @@ EQ~(\ref{eq:energy}). The optimal scaling factor is computed by minimizing the d
      \left( 1 + \sum_{i=2}^{N} \frac{T_i^3}{T_1^3} \right) }
  \end{equation}
  
-\JC{The following 2 sections can be merged easily}
  
  \section{Performance evaluation of MPI programs}
  \label{sec.mpip}
@@ -484,7 +483,7 @@ frequency by the new one see EQ~(\ref{eq:s}).
  In our cluster there are 18 available frequency states for each processor. 
  This leads to 18 run states for each program. We use seven MPI programs of the
   NAS parallel benchmarks: CG, MG, EP, FT, BT, LU
-and SP. Figure~(\ref{fig:pred}) presents plots of the real execution times and the simulated ones. The maximum normalized error between the predicted execution time and the real time (SimGrid time) for all programs is between 0.0073 to 0.031. The  better case is for CG and the worse case is for LU. 
+and SP. Figure~(\ref{fig:pred}) presents plots of the real execution times and the simulated ones. The maximum normalized error between these two execution times varies between 0.0073 and 0.031 depending on the executed benchmark. The smallest prediction error was for CG and the largest one was for LU.
  \subsection{The experimental results for the scaling algorithm }
  The proposed algorithm was applied to seven MPI programs of the NAS
  benchmarks (EP, CG, MG, FT, BT, LU and SP) which were run with three classes (A, B and