FEMTO-ST Institute\\
University of Franche-Comté\\
IUT de Belfort-Montbéliard, 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France\\
- % Fax : +33~3~84~58~77~32\\
+ % Telephone: \mbox{+33 3 84 58 77 86}, % Raphaël
+ % Fax: \mbox{+33 3 84 58 77 81}\\ % Dept Info
Email: \email{{jean-claude.charr,raphael.couturier,ahmed.fanfakh_badri_muslim,arnaud.giersch}@univ-fcomte.fr}
}
}
\section{Related works}
\label{sec.relwork}
-\AG{Consider introducing the models (sec.~\ref{sec.exe}) before related works}
In this section, some heuristics to compute the scaling factor are
presented and classified into two categories: offline and online methods.
the program and the computing system that will execute it. In~\cite{40},
Azevedo et
al. detect during compilation the dependency points between
-tasks in a parallel program. This information is then used to lower the frequency of
+tasks in a multi-task program. This information is then used to lower the frequency of
some processors in order to eliminate slack times. A slack time is the period of time during which a processor that has already finished its computation has to wait
for a set of processors to finish their computations and send their results to the
waiting processor in order to continue its task that is
set the processor with the biggest load to the highest gear and then compute the scaling factor values for the rest of the processors. Although this model was built for parallel architectures, it can be adapted to distributed architectures by taking into account the communications.
The primary contribution of our paper is a new online scaling factor selection method, which has the following characteristics:
\begin{enumerate}
-\item It is based on Rauber and Rünger analytical model to predict the energy consumption and the execution time of the application with different frequency gears.
+\item It is based on Rauber and Rünger's analytical model to predict the energy consumption of the application at different frequency gears.
\item It selects the frequency scaling factor that simultaneously optimizes energy reduction and maintains performance.
\item It is well adapted to distributed architectures because it takes into account the communication time.
\item It is well adapted to distributed applications with imbalanced tasks.
\left( 1 + \sum_{i=2}^{N} \frac{T_i^3}{T_1^3} \right) }
\end{equation}
-\JC{The following 2 sections can be merged easily}
\section{Performance evaluation of MPI programs}
\label{sec.mpip}
In our cluster there are 18 available frequency states for each processor.
This leads to 18 run states for each program. We use seven MPI programs of the
NAS parallel benchmarks: CG, MG, EP, FT, BT, LU
-and SP. Figure~(\ref{fig:pred}) presents plots of the real execution times and the simulated ones. The maximum normalized error between the predicted execution time and the real time (SimGrid time) for all programs is between 0.0073 to 0.031. The better case is for CG and the worse case is for LU.
+and SP. Figure~(\ref{fig:pred}) presents plots of the real execution times and the simulated ones. The maximum normalized error between these two execution times varies between 0.0073 and 0.031, depending on the executed benchmark. The smallest prediction error was for CG and the largest one was for LU.
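For reference, one way to formalize this error is sketched here, under the assumption that it is the largest relative gap between the two execution times over the 18 frequency states. The symbols $e_{\max}$, $T_{\mathrm{pred}}$, $T_{\mathrm{real}}$, $f_j$ and $F$ are introduced only for this illustration and are an assumed reading of the metric:
\[
  e_{\max} = \max_{1 \le j \le F}
  \frac{\left| T_{\mathrm{pred}}(f_j) - T_{\mathrm{real}}(f_j) \right|}{T_{\mathrm{real}}(f_j)},
  \qquad F = 18 .
\]
Under this reading, a value of 0.031 would mean that the two execution times differ by at most about 3.1\% at any frequency state.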
\subsection{The experimental results for the scaling algorithm}
The proposed algorithm was applied to seven MPI programs of the NAS
benchmarks (EP, CG, MG, FT, BT, LU and SP) which were run with three classes (A, B and
\section*{Acknowledgment}
-Computations have been performed on the supercomputer facilities of the
+This work has been supported by the Labex ACTION project (contract ``ANR-11-LABX-01-01''). Computations have been performed on the supercomputer facilities of the
Mésocentre de calcul de Franche-Comté. As a PhD student, Mr. Ahmed Fanfakh would like to thank the University of
Babylon (Iraq) for supporting his work.