X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/mpi-energy2.git/blobdiff_plain/5ef38a410118e911f3dd1a8c5e3cafe44227e013..fe79318beef87d419e3952d9a1140372bd71bf4e:/mpi-energy2-extension/Heter_paper.tex?ds=sidebyside diff --git a/mpi-energy2-extension/Heter_paper.tex b/mpi-energy2-extension/Heter_paper.tex index d6ff46f..a7aa66d 100644 --- a/mpi-energy2-extension/Heter_paper.tex +++ b/mpi-energy2-extension/Heter_paper.tex @@ -842,6 +842,9 @@ which on average is up 26\%. Therefore, the tradeoff distance is related linear percentage. Finally, the best energy and performance tradeoff depends on the all of the following: 1) the computations to communications ratio when there is a communications and slack times, 2) the differences in computing powers between the computing nodes and 3) the differences in static and the dynamic powers of the nodes.} + + + \subsection{The experimental results of multicores clusters} \label{sec.res-mc} The grid'5000 clusters have different number of cores embedded in their nodes @@ -994,17 +997,175 @@ Scenario name & Cluster name & \begin{tabular}[c]{@{}c@ \label{fig:dist-mc} \end{figure} -\subsection{The results for different power consumption scenarios} -\label{sec.compare} +\subsection{The results of using different static power consumption scenarios} +\label{sec.pow_sen} +The static power consumption for one core of the computing node is the leakage power +consumption when this core is in the idle state. The node's idle state power value that measured +as in section \ref{sec.grid5000} had many power consumptions embedded such as +all cores static powers in addition to the power consumption of the other devices. So, the static power for one core +can't measured precisely. On the other hand, while the static power consumption of +one core representing the core's power when there is no any computation, thus +the majority of ratio of the total power consumption is depends on the dynamic power consumption. +Despite that, the static power consumption is becomes more important when the execution time +increased using DVFS. Therefore, the objective of this section is to verify the ability of the proposed +frequencies selecting algorithm when the static power consumption is changed. + +All the results obtained in the previous sections depend on the measured dynamic power +consumptions as in table \ref{table:grid5000}. Moreover, the static power consumption is assumed for +one core represents 20\% of the measured dynamic power of that core. +This assumption is extended in this section to taking into account others ratios for the static power consumption. +In addition to the previous ratio of the static power consumption, two other scenarios are used which +all of these scenarios can be denoted as follow: +\begin{itemize} +\item 10\% of static power scenario +\item 20\% of static power scenario +\item 30\% of static power scenario +\end{itemize} +These three scenarios represented the ratio of the static power consumption that can be computed from +the dynamic power consumption of the core. The NAS benchmarks of class D are executed over 16 nodes +in the Nancy site using three clusters: Graphite, Graphene and Griffon. As same as used before, the one site 16 nodes +platform scenario explained in the last experiments, as in table \ref{tab:sc}, is uses to run +the NAS benchmarks with these static power scenarios. + \begin{figure} + \centering + \includegraphics[scale=0.5]{fig/eng_pow.eps} + \caption{The energy saving percentages for NAS benchmarks of the three power scenario} + \label{fig:eng-pow} +\end{figure} + +\begin{figure} + \centering + \includegraphics[scale=0.5]{fig/per_pow.eps} + \caption{The performance degradation percentages for NAS benchmarks of the three power scenario} + \label{fig:per-pow} +\end{figure} -\subsection{The comparison of the proposed scaling algorithm } +\begin{figure} + \centering + \includegraphics[scale=0.5]{fig/dist_pow.eps} + \caption{The tradeoff distance for NAS benchmarks of the three power scenario} + \label{fig:dist-pow} +\end{figure} + +\begin{figure} + \centering + \includegraphics[scale=0.47]{fig/three_scenarios.pdf} + \caption{Comparing the selected frequencies of MG benchmarks for three static power scenarios} + \label{fig:fre-pow} +\end{figure} + +The energy saving percentages of NAS benchmarks with these three static power scenarios are presented +in figure \ref{fig:eng_sen}. This figure showed the 10\% of static power scenario +gives the biggest energy saving percentage comparing to 20\% and 30\% static power +scenario. When using smaller ratio of static power consumption, the proposed +frequencies selecting algorithm selects smaller frequencies, bigger scaling factors, +because the static energy consumption not increased significantly the overall energy +consumption. Therefore, more energy reduction can be achieved when the frequencies are scaled down. +For example figure \ref{fig:fre-pow}, illustrated that the proposed algorithm +proportionally scaled down the new computed frequencies with the overall predicted energy +consumption. The results of 30\% static power scenario gives the smallest energy saving percentages +because the new selected frequencies produced smaller ratio in the reduced energy consumption. +Furthermore, The proposed algorithm tries to limit selecting smaller frequencies that increased +the static energy consumption if the static power consumption is increased. +The performance degradation percentages are presented in the figure \ref{fig:per-pow}, +the 30\% of static power scenario had less performance degradation percentage, because +bigger frequencies was selected due to the big ratio in the static power consumption. +The inverse was happens in the 20\% and 30\% scenario, the algorithm was selected +biggest frequencies, smaller scaling factors, according to this increased in the static power ratios. +The tradoff distance for the NAS benchmarks with these three static powers scenarios +are presented in the figure \ref{fig:dist}. The results showed that the tradeoff +distance is the best when the 10\% of static power scenario is used, and this percentage +is decreased for the other two scenarios propositionally to their static power ratios. +In EP benchmarks, the results of energy saving, performance degradation and tradeoff +distance are showed small differences when the these static power scenarios were used, +because this benchmark not has communications. The proposed algorithm is selected +same frequencies in this benchmark when all these static power scenarios are used. +The small differences in the results are due to the static power is consumed during the computation +times side by side to the dynamic power consumption, knowing that the dynamic power consumption +representing the highest ratio in the total power consumption of the core, then any change in +the static power during these times have less affect on the overall energy consumption. While the +inverse was happens for the rest of the benchmarks which have the communications +that increased the static energy consumption linearly to the mount of communications +in these benchmarks. + + + +\subsection{The comparison of the proposed frequencies selecting algorithm } \label{sec.compare_EDP} +The tradeoff between the energy consumption and the performance of the parallel +application had significant importance in the domain of the research. +Many researchers, \cite{EDP_for_multi_processors,Energy_aware_application_scheduling,Exploring_Energy_Performance_TradeOffs}, +are optimized the tradeoff between the energy and performance using the energy and delay product, $EDP=energy \times delay$. +This model is used by Spiliopoulos et al. algorithm \cite{Spiliopoulos_Green.governors.Adaptive.DVFS}, +the objective is to selects the suitable frequencies that minimized EDP product for the multicores +architecture when DVFS is used. Moreover, their algorithm is applied online which synchronously optimized the energy consumption +and the execution time. Both energy consumption and execution time of a processor are predicted by the their algorithm. +In this section the proposed frequency selection algorithm, called Maxdist is compared with Spiliopoulos et al. algorithm, called EDP. +To make both of the algorithms follow the same direction and fairly comparing them, the same energy model, equation \ref{eq:energy} and +the execution time model, equation \ref{eq:perf}, are used in the prediction process to select the best vector of the frequencies. +In contrast, the proposed algorithm starts the search space from the lower bound computed as in equation the \ref{eq:Fint}. Also, the algorithm +stops the search process when reaching to the lower bound as mentioned before. While, the EDP algorithm is developed to start from the +same upper bound until it reach to the minimum available frequencies. Finally, resulting the algorithm is an exhaustive search algorithm that +test all possible frequencies, starting from the initial frequencies, and selecting those minimized the EDP products. + +Both algorithms were applied to NAS benchmarks class D over 16 nodes selected from grid'5000 clusters. +The participating computing nodes are distributed between two sites to had two different scenarios. +These scenarios are two sites and one site scenarios that explained previously. +The experimental results of the energy saving, performance degradation and tradeoff distance are +presented in the figures \ref{fig:edp-eng}, \ref{fig:edp-perf} and \ref{fig:edp-dist} respectively. + +In one site scenario the proposed frequencies selection algorithm outperform the EDP algorithm +in term of energy and performance for all of the benchmarks. While, the compassion results from the two sites scenario +showed that the proposed algorithm outperform EDP algorithm for all benchmarks except MG benchmark. +In case of MG benchmark the are small communications and bigger frequencies selected in EDP algorithm +decreased the performance degradation more than the frequencies selected by Maxdist algorithm. +While the energy saving percentage are higher for Maxdist algorithm. + +Generally, the proposed algorithm gives better results for all benchmarks because it +optimized the distance between the energy saving and the performance degradation. +Whereas, in EDP algorithm gives negative tradeoff for some benchmarks in the two sites scenarios. +These negative tradeoffs mean the performance degradation percentage is higher than energy saving percentage. +The higher positive value for tradeoff distance is mean the best energy and performance tradeoff is achieved synchronously, when +the energy saving percentage is much higher than the performance degradation percentage +The time complexity of the proposed algorithm is $O(N \cdot M \cdot F)$, where $N$ is the number of the clusters, +$M$ is the number of nodes and $F$ is the maximum number of available frequencies. The algorithm is selected +the best frequencies in small execution time, on average is equal to 0.01 $ms$ when it works over 32 nodes. +While the EDP algorithm was slower than Maxdist algorithm by ten times, where their execution time on average +takes 0.1 $ms$ to selects the suitable frequencies over 32 nodes. +The time complexity of this algorithm is $O(N^2 \cdot M^2 \cdot F)$. + + + + + + + +\begin{figure} + \centering + \includegraphics[scale=0.5]{fig/edp_eng} + \caption{Comparing of the energy saving for the proposed method with EDP method} + \label{fig:edp-eng} +\end{figure} + +\begin{figure} + \centering + \includegraphics[scale=0.5]{fig/edp_per} + \caption{Comparing of the performance degradation for the proposed method with EDP method} + \label{fig:edp-perf} +\end{figure} +\begin{figure} + \centering + \includegraphics[scale=0.5]{fig/edp_dist} + \caption{Comparing of the tradeoff distance for the proposed method with EDP method} + \label{fig:edp-dist} +\end{figure} + \section{Conclusion} \label{sec.concl}