X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/mpi-energy.git/blobdiff_plain/f52703a2c13a4e69525be6ce9883eead2f96eb6f..902f3596f3c6f21def5c1d8374302b0962ad9345:/paper.tex diff --git a/paper.tex b/paper.tex index 2838f2d..a7a269f 100644 --- a/paper.tex +++ b/paper.tex @@ -3,7 +3,7 @@ \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} \usepackage[english]{babel} -\usepackage{algorithm,algorithmicx,algpseudocode} +\usepackage{algpseudocode} \usepackage{graphicx,graphics} \usepackage{subfig} \usepackage{listings} @@ -359,12 +359,10 @@ performance as follows: \end{equation} \begin{figure} \centering - \subfloat[Converted relation.]{% - \includegraphics[width=.24\textwidth]{fig/file}\label{fig:r1}}% -% \quad% \subfloat[Real relation.]{% - \includegraphics[width=.24\textwidth]{fig/file3}\label{fig:r2}} - \label{fig:rel} + \includegraphics[width=.5\linewidth]{fig/file3}\label{fig:r2}}% + \subfloat[Converted relation.]{% + \includegraphics[width=.5\linewidth]{fig/file}\label{fig:r1}} \caption{The energy and performance relation} \end{figure} Then, we can model our objective function as finding the maximum distance @@ -375,7 +373,7 @@ performance) at the same time, see Figure~(\ref{fig:r1}). Then our objective function has the following form: \begin{equation} \label{eq:max} - Max Dist = \max_{j=1,2,\dots,F} + \textit{Max Dist} = \max_{j=1,2,\dots,F} (\overbrace{P^{-1}_\textit{Norm}(S_j)}^{\text{Maximize}} - \overbrace{E_\textit{Norm}(S_j)}^{\text{Minimize}} ) \end{equation} @@ -388,12 +386,11 @@ form over the available frequency scaling factors as shown in~\cite{15,3,19}. \section{Optimal scaling factor for performance and energy} \label{sec.optim} -Algorithm~\ref{EPSA} computes the optimal scaling factor according to the -objective function described above. -\begin{algorithm}[tp] - \caption{Scaling factor selection algorithm} - \label{EPSA} +Algorithm on Figure~\ref{EPSA} computes the optimal scaling factor according to +the objective function described above. +\begin{figure}[tp] \begin{algorithmic}[1] + % \footnotesize \State Initialize the variable $Dist=0$ \State Set dynamic and static power values. \State Set $P_{states}$ to the number of available frequencies. @@ -421,19 +418,22 @@ objective function described above. \EndFor \State Return $S_{opt}$ \end{algorithmic} -\end{algorithm} + \caption{Scaling factor selection algorithm} + \label{EPSA} +\end{figure} The proposed algorithm works online during the execution time of the MPI program. It selects the optimal scaling factor after gathering the computation and communication times from the program after one iteration. Then the program changes the new frequencies of the CPUs according to the computed scaling -factors. In our experiments over a homogeneous cluster described in section~\ref{sec.expe}, -this algorithm has a small execution time. It takes \np[$\mu$s]{1.52} on average for 4 nodes and -\np[$\mu$s]{6.65} on average for 32 nodes. The algorithm complexity is $O(F\cdot -N)$, where $F$ is the number of available frequencies and $N$ is the number of -computing nodes. The algorithm is called just once during the execution of the -program. The DVFS algorithm~(\ref{dvfs}) shows where and when the algorithm is -called in the MPI program. +factors. In our experiments over a homogeneous cluster described in +section~\ref{sec.expe}, this algorithm has a small execution time. It takes +\np[$\mu$s]{1.52} on average for 4 nodes and \np[$\mu$s]{6.65} on average for 32 +nodes. The algorithm complexity is $O(F\cdot N)$, where $F$ is the number of +available frequencies and $N$ is the number of computing nodes. The algorithm +is called just once during the execution of the program. The DVFS +algorithm~(Fig.~\ref{dvfs}) shows where and when the algorithm is called in the +MPI program. %\begin{table}[htb] % \caption{Platform file parameters} % % title of Table @@ -450,24 +450,25 @@ called in the MPI program. % \label{table:platform} %\end{table} -\begin{algorithm}[tp] - \caption{DVFS} - \label{dvfs} +\begin{figure}[tp] \begin{algorithmic}[1] + % \footnotesize \For {$k:=1$ to \textit{some iterations}} \State Computations section. \State Communications section. \If {$(k=1)$} \State Gather all times of computation and\newline\hspace*{3em}% communication from each node. - \State Call algorithm~\ref{EPSA} with these times. + \State Call algorithm from Figure~\ref{EPSA} with these times. \State Compute the new frequency from the\newline\hspace*{3em}% returned optimal scaling factor. \State Set the new frequency to the CPU. \EndIf \EndFor \end{algorithmic} -\end{algorithm} + \caption{DVFS algorithm} + \label{dvfs} +\end{figure} After obtaining the optimal scaling factor, the program calculates the new frequency $F_i$ for each task proportionally to its time value $T_i$. By substitution of EQ~(\ref{eq:s}) in EQ~(\ref{eq:si}), we can calculate the new @@ -501,10 +502,10 @@ execution time values. These scaling factors are computed by dividing the maximum frequency by the new one see EQ~(\ref{eq:s}). \begin{figure} \centering - \includegraphics[width=.24\textwidth]{fig/cg_per}\hfill% - % \includegraphics[width=.328\textwidth]{fig/mg_pre}\hfill% - % \includegraphics[width=.4\textwidth]{fig/bt_pre}\qquad% - \includegraphics[width=.24\textwidth]{fig/lu_pre}\hfill% + \includegraphics[width=.5\linewidth]{fig/cg_per}\hfill% + % \includegraphics[width=.5\linewidth]{fig/mg_pre}\hfill% + % \includegraphics[width=.5\linewidth]{fig/bt_pre}\qquad% + \includegraphics[width=.5\linewidth]{fig/lu_pre}\hfill% \caption{Comparing predicted to real execution times} \label{fig:pred} \end{figure} @@ -546,12 +547,12 @@ factors give the maximum energy saving percentage and the minimum performance degradation percentage at the same time from all available scaling factors. \begin{figure*}[t] \centering - \includegraphics[width=.33\textwidth]{fig/ep}\hfill% - \includegraphics[width=.33\textwidth]{fig/cg}\hfill% - % \includegraphics[width=.328\textwidth]{fig/sp} - % \includegraphics[width=.328\textwidth]{fig/lu}\hfill% - \includegraphics[width=.33\textwidth]{fig/bt}\hfill% - % \includegraphics[width=.328\textwidth]{fig/ft} + \includegraphics[width=.33\linewidth]{fig/ep}\hfill% + \includegraphics[width=.33\linewidth]{fig/cg}\hfill% + % \includegraphics[width=.328\linewidth]{fig/sp} + % \includegraphics[width=.328\linewidth]{fig/lu}\hfill% + \includegraphics[width=.33\linewidth]{fig/bt} + % \includegraphics[width=.328\linewidth]{fig/ft} \caption{Optimal scaling factors for the predicted energy and performance of NAS benchmarks} \label{fig:nas} \end{figure*} @@ -635,9 +636,9 @@ while Rauber and Rünger's method, ($R_{E-P}$), gives sometimes negative trade-offs such as in BT and EP. \begin{figure}[t] \centering -% \includegraphics[width=.328\textwidth]{fig/compare_class_A} -% \includegraphics[width=.328\textwidth]{fig/compare_class_B} - \includegraphics[width=.49\textwidth]{fig/compare_class_C} +% \includegraphics[width=.328\linewidth]{fig/compare_class_A} +% \includegraphics[width=.328\linewidth]{fig/compare_class_B} + \includegraphics[width=\linewidth]{fig/compare_class_C} \caption{Comparing our method to Rauber and Rünger's methods} \label{fig:compare} \end{figure}