X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/mpi-energy2.git/blobdiff_plain/8f9f451ef1d8fc5aeba735be77e4000588baffdb..f48b065ad7039e174cf0042b534ac697c719decb:/mpi-energy2-extension/Heter_paper.tex diff --git a/mpi-energy2-extension/Heter_paper.tex b/mpi-energy2-extension/Heter_paper.tex index 85f68f4..4b55dd4 100644 --- a/mpi-energy2-extension/Heter_paper.tex +++ b/mpi-energy2-extension/Heter_paper.tex @@ -208,14 +208,24 @@ reductions. All the experimental results were conducted over the SimGrid simulator \cite{SimGrid}, which offers easy tools to describe homogeneous and heterogeneous platforms, and to simulate the execution of message passing parallel applications over them. -In this paper, a new frequency selecting algorithm, adapted to grid platforms -composed of heterogeneous clusters, is presented. It is applied to the NAS + +This paper presents the following contributions : +\begin{enumerate} +\item two new energy and performance models for message passing + synchronous applications with iterations running over a heterogeneous grid platform. Both models + take into account communications and slack times. The models can predict the + required energy and the execution time of the application. + +\item a new online frequency selecting algorithm for heterogeneous grid + platforms. The algorithm has a very small overhead and does not need any + training nor profiling. It uses a new optimization function which + simultaneously maximizes the performance and minimizes the energy consumption + of a message passing synchronous application with iterations. The algorithm was applied to the NAS parallel benchmarks and evaluated over a real testbed, the Grid'5000 platform -\cite{grid5000}. It selects for a grid platform running a message passing - application with iterations the vector of frequencies that simultaneously tries to -offer the maximum energy reduction and minimum performance degradation -ratios. 
The algorithm has a very small overhead, works online and does not need -any training or profiling. +\cite{grid5000}. + +\end{enumerate} + This paper is organized as follows: Section~\ref{sec.relwork} presents some @@ -300,21 +310,7 @@ some heuristic. Chen et al.~\cite{Chen_DVFS.under.quality.of.service.requirements} used a greedy dynamic programming approach to minimize the power consumption of heterogeneous servers while respecting given time constraints. This approach had considerable -overhead. In contrast to the above described papers, this paper presents the -following contributions : -\begin{enumerate} -\item two new energy and performance models for message passing - synchronous applications with iterations running over a heterogeneous grid platform. Both models - take into account communication and slack times. The models can predict the - required energy and the execution time of the application. - -\item a new online frequency selecting algorithm for heterogeneous grid - platforms. The algorithm has a very small overhead and does not need any - training nor profiling. It uses a new optimization function which - simultaneously maximizes the performance and minimizes the energy consumption - of a message passing synchronous application with iterations. - -\end{enumerate} +overhead. @@ -388,15 +384,15 @@ vector of scaling factors can be predicted using Equation (\ref{eq:perf}). \begin{equation} \label{eq:perf} \Tnew = \mathop{\max_{i=1,\dots N}}_{j=1,\dots,M_i}({\TcpOld[ij]} \cdot S_{ij}) - +\mathop{\min_{j=1,\dots,M_i}} (\Tcm[hj]) + +\mathop{\min_{j=1,\dots,M_h}} (\Tcm[hj]) \end{equation} % where $N$ is the number of clusters in the grid, $M_i$ is the number of nodes in cluster $i$, $\TcpOld[ij]$ is the computation time of processor $j$ in the cluster $i$ and $\Tcm[hj]$ is the communication time of processor $j$ in the cluster $h$ during the first iteration. 
The execution time for one iteration is equal to the sum of the maximum computation time for all nodes with the new scaling factors -and the communication time of the slower node without slack time during one iteration. -The slower node $h$ is the node that gives the maximum execution time in all the clusters before applying DVFS. +and the communication time of the slowest node without slack time during one iteration. + The slowest node $h$ is the node which takes the maximum execution time to execute an iteration before scaling down its frequency. It means that only the communication time without any slack time is taken into account. Therefore, the execution time of the application is equal to the execution time of one iteration as in Equation (\ref{eq:perf}) multiplied by the @@ -547,8 +543,8 @@ frequency scaling factors for a homogeneous and a heterogeneous cluster respecti Both methods selects the frequencies that gives the best trade-off between energy consumption reduction and performance for message passing synchronous applications \textcolor{blue}{with iterations}. In this work we -are interested in grids that are composed of heterogeneous clusters, \textcolor{blue}{where} the nodes -have different characteristics such as dynamic power, static power, computation power, +are interested in grids that are composed of heterogeneous clusters. The nodes from distinct clusters may have + different characteristics such as dynamic power, static power, computation power, frequencies range, network latency and bandwidth. Due to the heterogeneity of the processors, a vector of scaling factors should be selected and it must give the best trade-off between energy consumption and performance. 
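As a reading aid for the corrected index in Equation (\ref{eq:perf}), the prediction can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the nested-list layout (one inner list per cluster) and the rule used to pick the cluster $h$ (the cluster holding the node with the largest computation time before scaling) are all hypothetical choices made for illustration. The point it shows is that the communication term is taken over the $M_h$ nodes of cluster $h$ only, and that the old execution time is the same expression with every scaling factor equal to 1.

```python
from typing import List, Optional

Grid = List[List[float]]  # one inner list per cluster, one value per node


def slowest_cluster(tcp_old: Grid) -> int:
    """Return the index h of the cluster holding the slowest node.

    Assumption for this sketch: the slowest node is the one with the
    largest computation time before any frequency scaling.
    """
    return max(range(len(tcp_old)), key=lambda i: max(tcp_old[i]))


def predict_iteration_time(tcp_old: Grid, scaling: Grid, tcm: Grid,
                           h: Optional[int] = None) -> float:
    """Predicted time of one iteration for a given vector of scaling factors.

    tcp_old[i][j]: computation time of node j in cluster i (first iteration)
    scaling[i][j]: frequency scaling factor S_ij of that node
    tcm[i][j]:     measured communication time of that node
    """
    if h is None:
        h = slowest_cluster(tcp_old)
    # Maximum scaled computation time over every node of every cluster.
    max_computation = max(
        t * s
        for times, factors in zip(tcp_old, scaling)
        for t, s in zip(times, factors)
    )
    # Communication time without slack: minimum over the M_h nodes of cluster h.
    return max_computation + min(tcm[h])


def old_iteration_time(tcp_old: Grid, tcm: Grid) -> float:
    """Iteration time before scaling: same formula with all S_ij = 1."""
    ones = [[1.0] * len(times) for times in tcp_old]
    return predict_iteration_time(tcp_old, ones, tcm)


if __name__ == "__main__":
    # Made-up values: two clusters with 2 and 3 nodes.
    tcp_old = [[2.0, 1.5], [1.0, 1.2, 0.8]]
    tcm = [[0.25, 0.5], [1.0, 0.75, 1.5]]
    scaling = [[1.0, 1.25], [1.5, 1.25, 2.0]]
    print(old_iteration_time(tcp_old, tcm))
    print(predict_iteration_time(tcp_old, scaling, tcm))
```

With the made-up values above, only the faster nodes are scaled down and none of them overtakes the slowest node, so the predicted time equals the old one. Scaling the slowest node as well would increase the first term while leaving the communication term unchanged.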
@@ -573,7 +569,7 @@ where $Tnew$ is computed as in (\ref{eq:perf}) and $Told$ is computed as in (\re \begin{equation} \label{eq:told} \Told = \mathop{\max_{i=1,\dots N}}_{j=1,\dots,M_i}({\TcpOld[ij]} ) - +\mathop{\min_{j=1,\dots,M_i}} (\Tcm[hj]) + +\mathop{\min_{j=1,\dots,M_h}} (\Tcm[hj]) \end{equation} } In the same way, the energy is normalized by computing the ratio between the