-\subsection{The algorithm details}
-In this section, algorithm \ref{HSA} is presented. It selects the frequency scaling factors
-vector that gives the best trade-off between minimizing the energy consumption and maximizing
-the performance of a message passing synchronous iterative application executed on a heterogeneous
-platform. It works online during the execution time of the iterative message passing program.
-It uses information gathered during the first iteration such as the computation time and the
-communication time in one iteration for each node. The algorithm is executed after the first
-iteration and returns a vector of optimal frequency scaling factors that satisfies the objective
-function (\ref{eq:max}). The program applies DVFS operations to change the frequencies of the CPUs
-according to the computed scaling factors. This algorithm is called just once during the execution
-of the program. Algorithm~(\ref{dvfs}) shows where and when the proposed scaling algorithm is called
-in the iterative MPI program.
-
-The nodes in a heterogeneous platform have different computing powers, thus while executing message
-passing iterative synchronous applications, fast nodes have to wait for the slower ones to finish their
-computations before being able to synchronously communicate with them as in figure (\ref{fig:heter}).
-These periods are called idle or slack times.
-The algorithm takes into account this problem and tries to reduce these slack times when selecting the
-frequency scaling factors vector. At first, it selects initial frequency scaling factors that increase
-the execution times of fast nodes and minimize the differences between the computation times of
-fast and slow nodes. The value of the initial frequency scaling factor for each node is inversely
-proportional to its computation time that was gathered from the first iteration. These initial frequency
-scaling factors are computed as a ratio between the computation time of the slowest node and the
-computation time of the node $i$ as follows:
-\begin{equation}
- \label{eq:Scp}
- Scp_{i} = \frac{\max_{i=1,2,\dots,N}(Tcp_i)}{Tcp_i}
-\end{equation}
-Using the initial frequency scaling factors computed in (\ref{eq:Scp}), the algorithm computes
-the initial frequencies for all nodes as a ratio between the maximum frequency of node $i$
-and the computation scaling factor $Scp_i$ as follows:
-\begin{equation}
- \label{eq:Fint}
- F_{i} = \frac{Fmax_i}{Scp_i},~{i=1,2,\cdots,N}
-\end{equation}
-If the computed initial frequency for a node is not available in the gears of
-that node, it is replaced by the nearest available frequency. In figure
-(\ref{fig:st_freq}), the nodes are sorted by their computing power in ascending
-order and the frequencies of the faster nodes are scaled down according to the
-computed initial frequency scaling factors. The resulting new frequencies are
-colored in blue in figure (\ref{fig:st_freq}). This set of frequencies can be
-considered as a higher bound for the search space of the optimal vector of
-frequencies because selecting frequency scaling factors higher than the higher
-bound will not improve the performance of the application and it will increase
-its overall energy consumption. Therefore the algorithm that selects the
-frequency scaling factors starts the search method from these initial
-frequencies and takes a downward search direction toward lower frequencies. The
-algorithm iterates on all left frequencies, from the higher bound until all
-nodes reach their minimum frequencies, to compute their overall energy
-consumption and performance, and select the optimal frequency scaling factors
-vector. At each iteration the algorithm determines the slowest node according to
-the equation (\ref{eq:perf}) and keeps its frequency unchanged, while it lowers
-the frequency of all other nodes by one gear. The new overall energy
-consumption and execution time are computed according to the new scaling
-factors. The optimal set of frequency scaling factors is the set that gives the
-highest distance according to the objective function (\ref{eq:max}).
-
-Figures~\ref{fig:r1} and \ref{fig:r2} illustrate the normalized performance and
-consumed energy for an application running on a homogeneous platform and a
-heterogeneous platform respectively while increasing the scaling factors. It can
-be noticed that in a homogeneous platform the search for the optimal scaling
-factor should start from the maximum frequency because the performance and the
-consumed energy decrease from the beginning of the plot. On the other hand,
-in the heterogeneous platform the performance is maintained at the beginning of
-the plot even if the frequencies of the faster nodes decrease until the
-computing power of scaled down nodes are lower than the slowest node. In other
-words, until they reach the higher bound. It can also be noticed that the higher
-the difference between the faster nodes and the slower nodes is, the bigger the
-maximum distance between the energy curve and the performance curve is while
- the scaling factors are varying which results in bigger energy savings.
-\begin{figure}[t]
- \centering
- \includegraphics[scale=0.5]{fig/start_freq}
- \caption{Selecting the initial frequencies}
- \label{fig:st_freq}
-\end{figure}
-
-
-
-