adding the comparison section

[mpi-energy2.git] / mpi-energy2-extension / Heter_paper.tex
diff --git a/mpi-energy2-extension/Heter_paper.tex b/mpi-energy2-extension/Heter_paper.tex

index d6ff46fa0ce81414674294b84b953daecbe30cbf..a7aa66d049ffbc06bbbf724587b4f503edcde042 100644 (file)
--- a/mpi-energy2-extension/Heter_paper.tex
+++ b/mpi-energy2-extension/Heter_paper.tex
@@ -842,6 +842,9 @@ which on average is up 26\%.  Therefore, the tradeoff distance is related linear
  percentage. Finally, the best energy and performance tradeoff depends on the all of the following:
  1) the computations to communications ratio when there is a communications and slack times, 2) the differences in computing powers 
  between the computing nodes and 3) the differences in static and the dynamic powers of the nodes.}
+
+
+
  \subsection{The experimental results of multicores clusters}
  \label{sec.res-mc}
  The grid'5000 clusters have different number of cores embedded in their nodes
@@ -994,17 +997,175 @@ Scenario name                          & Cluster name & \begin{tabular}[c]{@{}c@
    \label{fig:dist-mc}
  \end{figure}
  
-\subsection{The results for different power consumption scenarios}
-\label{sec.compare}
+\subsection{The results of using different static power consumption scenarios}
+\label{sec.pow_sen}
+The static power consumption for one core of the computing node is the leakage power
+consumption when this core is in the idle state. The node's idle state power value that measured 
+as in section \ref{sec.grid5000} had many power consumptions embedded  such as 
+all cores static powers in addition to the power consumption of the other devices. So, the static power for one core 
+can't measured precisely. On the other hand, while the static power consumption of
+one core representing the core's power when there is no any computation,  thus 
+the majority of ratio of the total power consumption is depends on the dynamic power consumption. 
+Despite that, the static power consumption is becomes more important  when the execution time 
+increased using DVFS. Therefore, the objective of this section is to verify the ability of the proposed 
+frequencies selecting algorithm when the static power consumption is changed. 
+
+All the results obtained in the previous sections depend on the measured dynamic power 
+consumptions as in table \ref{table:grid5000}. Moreover, the static power consumption is assumed for
+one core represents  20\% of the measured dynamic power of that core. 
+This assumption is extended in this section to taking into account others ratios for the static power consumption.
+In addition to the previous ratio of the static power consumption, two other scenarios are used which 
+all of these scenarios can be denoted as follow: 
+\begin{itemize}
+\item 10\% of static power scenario 
+\item 20\% of static power scenario 
+\item 30\% of static power scenario 
+\end{itemize}
  
+These three scenarios represented the ratio of the static power consumption that can be computed from 
+the dynamic power consumption of the core. The NAS benchmarks of class D are executed over 16 nodes
+in the Nancy site using three clusters: Graphite, Graphene and Griffon. As same as used before, the one site 16 nodes
+platform scenario explained in the last experiments, as in table \ref{tab:sc}, is uses to run 
+the NAS benchmarks with these static power scenarios. 
  
+ \begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{fig/eng_pow.eps}
+  \caption{The energy saving percentages for NAS benchmarks of the three power scenario}
+  \label{fig:eng-pow}
+\end{figure}
+
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{fig/per_pow.eps}
+  \caption{The performance degradation percentages for NAS benchmarks of the three power scenario}
+  \label{fig:per-pow}
+\end{figure}
  
  
-\subsection{The comparison of the proposed scaling algorithm }
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{fig/dist_pow.eps}
+  \caption{The tradeoff distance for NAS benchmarks of the three power scenario}
+  \label{fig:dist-pow}
+\end{figure}
+
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.47]{fig/three_scenarios.pdf}
+  \caption{Comparing the selected frequencies of MG benchmarks for three static power scenarios}
+  \label{fig:fre-pow}
+\end{figure}
+
+The energy saving percentages of NAS benchmarks with these three static power scenarios are presented 
+in figure \ref{fig:eng_sen}. This figure showed the 10\% of static power scenario 
+gives the biggest energy saving percentage comparing to 20\% and 30\% static power 
+scenario. When using smaller ratio of static power consumption, the proposed 
+frequencies selecting algorithm selects smaller frequencies, bigger scaling factors, 
+because the static energy consumption not increased significantly the overall energy 
+consumption. Therefore, more energy reduction can be achieved  when the frequencies are scaled down.
+For example figure \ref{fig:fre-pow}, illustrated that the proposed algorithm  
+proportionally scaled down the new computed frequencies with the overall predicted energy 
+consumption. The results of 30\% static power scenario gives the smallest energy saving percentages 
+because the new selected frequencies produced smaller ratio in the reduced energy consumption. 
+Furthermore, The proposed algorithm tries to limit selecting smaller frequencies that increased 
+the static energy consumption if the static power consumption is increased.
+The performance degradation percentages are presented in the figure \ref{fig:per-pow},
+the 30\% of static power scenario had less performance degradation percentage, because
+bigger frequencies was selected due to the big ratio in the static power consumption.
+The inverse was happens in the 20\% and 30\% scenario, the algorithm was selected 
+biggest frequencies, smaller scaling factors, according to this increased in the static power ratios.
+The tradoff distance for the NAS benchmarks with these three static powers scenarios 
+are presented in the figure \ref{fig:dist}. The results showed that the tradeoff
+distance is the best when the  10\% of static power scenario is used, and this percentage 
+is decreased for the other two scenarios propositionally to their static power ratios.
+In EP benchmarks, the results of energy saving, performance degradation and tradeoff 
+distance are showed small differences when the these static power scenarios were used, 
+because this benchmark not has communications. The proposed algorithm is selected 
+same frequencies in this benchmark when all these static power scenarios are used. 
+The small differences in the results are due to the static power is consumed during the computation 
+times side by side to the dynamic power consumption, knowing that the dynamic power consumption 
+representing the highest ratio in the total power consumption of the core, then any change in 
+the static power during these times have less affect on the overall energy consumption. While the 
+inverse was happens for the rest of the benchmarks which have the communications 
+that increased the static energy consumption linearly to the mount of communications 
+in these benchmarks.
+
+ 
+
+\subsection{The comparison of the proposed frequencies selecting algorithm }
  \label{sec.compare_EDP}
  
+The tradeoff between the energy consumption and the performance of the parallel 
+application had significant importance in the domain of the research. 
+Many researchers, \cite{EDP_for_multi_processors,Energy_aware_application_scheduling,Exploring_Energy_Performance_TradeOffs},
+are optimized the tradeoff between the energy and performance using the energy and delay product, $EDP=energy \times delay$. 
+This model is used by Spiliopoulos et al. algorithm \cite{Spiliopoulos_Green.governors.Adaptive.DVFS},
+the  objective is to selects the suitable frequencies that minimized EDP product for the multicores 
+architecture when DVFS is used. Moreover, their algorithm is applied online which synchronously optimized the energy consumption 
+and the execution time. Both energy consumption and execution time  of a processor are predicted by the their algorithm.
+In this section the proposed frequency selection algorithm, called Maxdist is compared with Spiliopoulos et al. algorithm, called EDP.
+To make both of the algorithms follow the same direction and  fairly  comparing them, the same energy model,  equation \ref{eq:energy} and
+the execution time model, equation \ref{eq:perf}, are used in the prediction process to select the best vector of the frequencies. 
+In contrast, the proposed algorithm starts the search space from the lower bound computed as in equation the  \ref{eq:Fint}. Also, the algorithm
+stops  the search process when  reaching to the lower bound as mentioned before. While, the EDP algorithm is developed to start from the 
+same upper bound until it reach to the minimum available frequencies. Finally, resulting the algorithm is an exhaustive search algorithm that
+test all possible frequencies, starting from the initial frequencies, and selecting those minimized the EDP products.
+
+Both algorithms were applied to NAS benchmarks class D over 16 nodes selected from grid'5000 clusters.
+The participating computing nodes are distributed between two sites to had two different scenarios. 
+These scenarios are two sites and one site scenarios that explained previously. 
+The experimental results of the energy saving, performance degradation and tradeoff distance are 
+presented in the figures \ref{fig:edp-eng}, \ref{fig:edp-perf} and \ref{fig:edp-dist} respectively. 
+
+In one site scenario the proposed frequencies selection algorithm outperform the EDP algorithm 
+in term of energy and performance for all of the benchmarks. While, the compassion results from the two sites scenario 
+showed that the proposed algorithm outperform EDP algorithm for all benchmarks except MG benchmark.
+In case of MG benchmark the are small communications and bigger frequencies selected in EDP algorithm 
+decreased the performance degradation more than the frequencies selected by Maxdist algorithm. 
+While the energy saving percentage are higher for Maxdist algorithm.
+
+Generally, the proposed algorithm gives better results for all benchmarks because it
+optimized the distance between the energy saving and the performance degradation. 
+Whereas, in EDP algorithm gives negative tradeoff for some benchmarks in the two sites scenarios.
+These negative tradeoffs mean the performance degradation percentage is higher than energy saving percentage.
+The higher positive value for tradeoff distance is mean the best energy and performance tradeoff is achieved synchronously, when
+the energy saving percentage is much higher than the performance degradation percentage 
+The time complexity of the proposed algorithm is $O(N \cdot M \cdot F)$, where $N$ is the number of the clusters,
+$M$ is the number of nodes and $F$ is the maximum number of available frequencies. The algorithm is selected 
+the best frequencies in small execution time, on average is equal to  0.01 $ms$ when it works over 32 nodes.
+While the EDP algorithm was slower than Maxdist algorithm by ten times, where their execution time  on average 
+takes 0.1 $ms$  to selects the suitable frequencies over 32 nodes. 
+The time complexity of this algorithm is  $O(N^2 \cdot M^2 \cdot F)$.
+
+
+  
+ 
+
+
+
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{fig/edp_eng}
+  \caption{Comparing of the energy saving for the proposed method with EDP method}
+  \label{fig:edp-eng}
+\end{figure}
+
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{fig/edp_per}
+  \caption{Comparing of the performance degradation for the proposed method with EDP method}
+  \label{fig:edp-perf}
+\end{figure}
  
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{fig/edp_dist}
+  \caption{Comparing of the tradeoff distance for the proposed method with EDP method}
+  \label{fig:edp-dist}
+\end{figure}
  
+  
  \section{Conclusion}
  \label{sec.concl}