X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/mpi-energy2.git/blobdiff_plain/9605046971a2c8e2559533218ec4dfde7654c465..b06b270b9964583f76b38ef50528eb6e00a6881b:/Heter_paper.tex?ds=sidebyside diff --git a/Heter_paper.tex b/Heter_paper.tex index f34524c..d318912 100644 --- a/Heter_paper.tex +++ b/Heter_paper.tex @@ -8,7 +8,7 @@ \usepackage{algorithm} \usepackage{subfig} \usepackage{amsmath} - +\usepackage{multirow} \usepackage{url} \DeclareUrlCommand\email{\urlstyle{same}} @@ -207,7 +207,7 @@ task which have the highest computation time and no slack time. \begin{figure}[t] \centering - \includegraphics[scale=0.6]{fig/commtasks} + \includegraphics[scale=0.5]{fig/commtasks} \caption{Parallel tasks on a heterogeneous platform} \label{fig:heter} \end{figure} @@ -266,7 +266,7 @@ by the number of iterations of that application. This prediction model is developed from the model for predicting the execution time of message passing distributed applications for homogeneous architectures~\cite{Our_first_paper}. -The execution time prediction model is used in the method for optimizing both +The execution time prediction model is used in the method for optimizing both energy consumption and performance of iterative methods, which is presented in the following sections. 
@@ -670,16 +670,16 @@ Finally, These nodes were connected via an ethernet network with 1 Gbit/s bandwi & & GHz & GHz &GHz & & \\ \hline 1 &40 & 2.5 & 1.2 & 0.1 & 20~w &4~w \\ - & & & & & & \\ + \hline 2 &50 & 2.66 & 1.6 & 0.133 & 25~w &5~w \\ - & & & & & & \\ + \hline 3 &60 & 2.9 & 1.2 & 0.1 & 30~w &6~w \\ - & & & & & & \\ + \hline 4 &70 & 3.4 & 1.6 & 0.133 & 35~w &7~w \\ - & & & & & & \\ + \hline \end{tabular} \label{table:platform} @@ -708,7 +708,7 @@ The other benchmarks such as BT and SP should be executed on $1, 4, 9, 16, 36, 6 \centering \begin{tabular}{|*{7}{l|}} \hline - Method & Execution & Energy & Energy & Performance & Distance \\ + Program & Execution & Energy & Energy & Performance & Distance \\ name & time/s & consumption/J & saving\% & degradation\% & \\ \hline CG & 64.64 & 3560.39 &34.16 &6.72 &27.44 \\ @@ -735,7 +735,7 @@ The other benchmarks such as BT and SP should be executed on $1, 4, 9, 16, 36, 6 \centering \begin{tabular}{|*{7}{l|}} \hline - Method & Execution & Energy & Energy & Performance & Distance \\ + Program & Execution & Energy & Energy & Performance & Distance \\ name & time/s & consumption/J & saving\% & degradation\% & \\ \hline CG &36.11 &3263.49 &31.25 &7.12 &24.13 \\ @@ -762,7 +762,7 @@ The other benchmarks such as BT and SP should be executed on $1, 4, 9, 16, 36, 6 \centering \begin{tabular}{|*{7}{l|}} \hline - Method & Execution & Energy & Energy & Performance & Distance \\ + Program & Execution & Energy & Energy & Performance & Distance \\ name & time/s & consumption/J & saving\% & degradation\% & \\ \hline CG &31.74 &4373.90 &26.29 &9.57 &16.72 \\ @@ -789,7 +789,7 @@ The other benchmarks such as BT and SP should be executed on $1, 4, 9, 16, 36, 6 \centering \begin{tabular}{|*{7}{l|}} \hline - Method & Execution & Energy & Energy & Performance & Distance \\ + Program & Execution & Energy & Energy & Performance & Distance \\ name & time/s & consumption/J & saving\% & degradation\% & \\ \hline CG &32.35 &6704.21 &16.15 &5.30 
&10.85 \\ @@ -816,7 +816,7 @@ The other benchmarks such as BT and SP should be executed on $1, 4, 9, 16, 36, 6 \centering \begin{tabular}{|*{7}{l|}} \hline - Method & Execution & Energy & Energy & Performance & Distance \\ + Program & Execution & Energy & Energy & Performance & Distance \\ name & time/s & consumption/J & saving\% & degradation\% & \\ \hline CG &46.65 &17521.83 &8.13 &1.68 &6.45 \\ @@ -844,7 +844,7 @@ The other benchmarks such as BT and SP should be executed on $1, 4, 9, 16, 36, 6 \centering \begin{tabular}{|*{7}{l|}} \hline - Method & Execution & Energy & Energy & Performance & Distance \\ + Program & Execution & Energy & Energy & Performance & Distance \\ name & time/s & consumption/J & saving\% & degradation\% & \\ \hline CG &56.92 &41163.36 &4.00 &1.10 &2.90 \\ @@ -964,7 +964,7 @@ results in less energy saving but less performance degradation. \centering \begin{tabular}{|*{6}{l|}} \hline - Method & Energy & Energy & Performance & Distance \\ + Program & Energy & Energy & Performance & Distance \\ name & consumption/J & saving\% & degradation\% & \\ \hline CG &4144.21 &22.42 &7.72 &14.70 \\ @@ -993,7 +993,7 @@ results in less energy saving but less performance degradation. \centering \begin{tabular}{|*{6}{l|}} \hline - Method & Energy & Energy & Performance & Distance \\ + Program & Energy & Energy & Performance & Distance \\ name & consumption/J & saving\% & degradation\% & \\ \hline CG &2812.38 &36.36 &6.80 &29.56 \\ @@ -1017,11 +1017,11 @@ results in less energy saving but less performance degradation. 
\begin{figure} \centering - \subfloat[Comparison the average of the results on 8 nodes]{% - \includegraphics[width=.33\textwidth]{fig/sen_comp}\label{fig:sen_comp}}% + \subfloat[Comparison of the results on 8 nodes]{% + \includegraphics[width=.30\textwidth]{fig/sen_comp}\label{fig:sen_comp}}% \subfloat[Comparison the selected frequency scaling factors of MG benchmark class C running on 8 nodes]{% - \includegraphics[width=.33\textwidth]{fig/three_scenarios}\label{fig:scales_comp}} + \includegraphics[width=.34\textwidth]{fig/three_scenarios}\label{fig:scales_comp}} \label{fig:comp} \caption{The comparison of the three power scenarios} \end{figure} @@ -1029,6 +1029,59 @@ results in less energy saving but less performance degradation. +\subsection{Comparison of the proposed scaling algorithm} +\label{sec.compare_EDP} + +In this section, we compare our scaling factors selection algorithm +with the algorithm of Spiliopoulos et al.~\cite{Spiliopoulos_Green.governors.Adaptive.DVFS}. +They developed an online frequency selection algorithm running on multicore architectures. +Their algorithm predicts both the energy and the performance during the runtime of the program, then +selects the frequencies that minimize the energy-delay product (EDP), $EDP = Energy \times Delay$. +To be able to compare with this algorithm, we used our energy and execution time models in the prediction process, +equations (\ref{eq:energy}) and (\ref{eq:fnew}). Their algorithm is also adapted to take into account +the heterogeneous platform: it starts by selecting the +initial frequencies using equation (\ref{eq:Fint}). The resulting algorithm tests all possible frequencies as +a brute-force search algorithm. + +The comparison results of running the NAS benchmarks class C on 8 or 9 nodes are +presented in table \ref{table:compare_EDP}. 
The results show that our algorithm achieves the larger energy saving percentage: +on average 29.76\% against 25.75\% for their algorithm, +while the average performance degradation percentage is approximately the same: +3.89\% for our algorithm and 4.03\% for their algorithm. In general, our algorithm outperforms +the algorithm of Spiliopoulos et al. in terms of the energy and performance tradeoff (see figure \ref{fig:compare_EDP}). +This is because our algorithm maximizes the difference (the distance) between the energy saving and the performance degradation, +whereas their algorithm minimizes the EDP. It also keeps the frequency of the slowest node unchanged, +which improves the energy and performance tradeoff. + + +\begin{table}[h] + \caption{Comparison of the proposed algorithm (MaxDist) with the EDP method} + \centering +\begin{tabular}{|l|l|l|l|l|l|l|l|} +\hline +\multicolumn{2}{|l|}{\multirow{2}{*}{\begin{tabular}[c]{@{}l@{}}Program \\ name\end{tabular}}} & \multicolumn{2}{l|}{Energy saving \%} & \multicolumn{2}{l|}{Perf. 
degradation \%} & \multicolumn{2}{l|}{Distance} \\ \cline{3-8} +\multicolumn{2}{|l|}{} & EDP & MaxDist & EDP & MaxDist & EDP & MaxDist \\ \hline +\multicolumn{2}{|l|}{CG} & 27.58 & 31.25 & 5.82 & 7.12 & 21.76 & 24.13 \\ \hline +\multicolumn{2}{|l|}{MG} & 29.49 & 33.78 & 3.74 & 6.41 & 25.75 & 27.37 \\ \hline +\multicolumn{2}{|l|}{LU} & 19.55 & 28.33 & 0.0 & 0.01 & 19.55 & 28.22 \\ \hline +\multicolumn{2}{|l|}{EP} & 28.40 & 27.04 & 4.29 & 0.49 & 24.11 & 26.55 \\ \hline +\multicolumn{2}{|l|}{BT} & 27.68 & 32.32 & 6.45 & 7.87 & 21.23 & 24.43 \\ \hline +\multicolumn{2}{|l|}{SP} & 20.52 & 24.73 & 5.21 & 2.78 & 15.31 & 21.95 \\ \hline +\multicolumn{2}{|l|}{FT} & 27.03 & 31.02 & 2.75 & 2.54 & 24.28 & 28.48 \\ \hline + +\end{tabular} +\label{table:compare_EDP} +\end{table} + + + +\begin{figure}[t] + \centering + \includegraphics[scale=0.6]{fig/compare_EDP.pdf} + \caption{Tradeoff comparison for NAS benchmarks class C} + \label{fig:compare_EDP} +\end{figure} + \section{Conclusion} \label{sec.concl} @@ -1044,6 +1097,10 @@ known in advance and depends on the global convergence of the iterative system. \section*{Acknowledgment} +This work has been partially supported by the Labex +ACTION project (contract “ANR-11-LABX-01-01”). As a PhD student, +Mr. Ahmed Fanfakh would like to thank the University of +Babylon (Iraq) for supporting his work. % trigger a \newpage just before the given reference