\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
-\usepackage{algorithm,algorithmicx,algpseudocode}
+\usepackage{algpseudocode}
\usepackage{graphicx,graphics}
\usepackage{subfig}
\usepackage{listings}
\section{Optimal scaling factor for performance and energy}
\label{sec.optim}
-Algorithm~\ref{EPSA} computes the optimal scaling factor according to the
-objective function described above.
-\begin{algorithm}[tp]
- \caption{Scaling factor selection algorithm}
- \label{EPSA}
+Algorithm on Figure~\ref{EPSA} computes the optimal scaling factor according to
+the objective function described above.
+\begin{figure}[tp]
\begin{algorithmic}[1]
+ % \footnotesize
\State Initialize the variable $Dist=0$
\State Set dynamic and static power values.
\State Set $P_{states}$ to the number of available frequencies.
\EndFor
\State Return $S_{opt}$
\end{algorithmic}
-\end{algorithm}
+ \caption{Scaling factor selection algorithm}
+ \label{EPSA}
+\end{figure}
The proposed algorithm works online during the execution time of the MPI
program. It selects the optimal scaling factor after gathering the computation
and communication times from the program after one iteration. Then the program
changes the new frequencies of the CPUs according to the computed scaling
-factors. In our experiments over a homogeneous cluster described in section~\ref{sec.expe},
-this algorithm has a small execution time. It takes \np[$\mu$s]{1.52} on average for 4 nodes and
-\np[$\mu$s]{6.65} on average for 32 nodes. The algorithm complexity is $O(F\cdot
-N)$, where $F$ is the number of available frequencies and $N$ is the number of
-computing nodes. The algorithm is called just once during the execution of the
-program. The DVFS algorithm~(\ref{dvfs}) shows where and when the algorithm is
-called in the MPI program.
+factors. In our experiments over a homogeneous cluster described in
+section~\ref{sec.expe}, this algorithm has a small execution time. It takes
+\np[$\mu$s]{1.52} on average for 4 nodes and \np[$\mu$s]{6.65} on average for 32
+nodes. The algorithm complexity is $O(F\cdot N)$, where $F$ is the number of
+available frequencies and $N$ is the number of computing nodes. The algorithm
+is called just once during the execution of the program. The DVFS
+algorithm~(Fig.~\ref{dvfs}) shows where and when the algorithm is called in the
+MPI program.
%\begin{table}[htb]
% \caption{Platform file parameters}
% % title of Table
% \label{table:platform}
%\end{table}
-\begin{algorithm}[tp]
- \caption{DVFS}
- \label{dvfs}
+\begin{figure}[tp]
\begin{algorithmic}[1]
+ % \footnotesize
\For {$k:=1$ to \textit{some iterations}}
\State Computations section.
\State Communications section.
\If {$(k=1)$}
\State Gather all times of computation and\newline\hspace*{3em}%
communication from each node.
- \State Call algorithm~\ref{EPSA} with these times.
+ \State Call algorithm from Figure~\ref{EPSA} with these times.
\State Compute the new frequency from the\newline\hspace*{3em}%
returned optimal scaling factor.
\State Set the new frequency to the CPU.
\EndIf
\EndFor
\end{algorithmic}
-\end{algorithm}
+ \caption{DVFS algorithm}
+ \label{dvfs}
+\end{figure}
After obtaining the optimal scaling factor, the program calculates the new
frequency $F_i$ for each task proportionally to its time value $T_i$. By
substitution of EQ~(\ref{eq:s}) in EQ~(\ref{eq:si}), we can calculate the new