-\subsection{The algorithm details}
-In this section, algorithm \ref{HSA} is presented. It selects the frequency scaling factors
-vector that gives the best trade-off between minimizing the energy consumption and maximizing
-the performance of a message passing synchronous iterative application executed on a heterogeneous
-platform. It works online during the execution time of the iterative message passing program.
-It uses information gathered during the first iteration such as the computation time and the
-communication time in one iteration for each node. The algorithm is executed after the first
-iteration and returns a vector of optimal frequency scaling factors that satisfies the objective
-function (\ref{eq:max}). The program applies DVFS operations to change the frequencies of the CPUs
-according to the computed scaling factors. This algorithm is called just once during the execution
-of the program. Algorithm~(\ref{dvfs}) shows where and when the proposed scaling algorithm is called
-in the iterative MPI program.
-
-The nodes in a heterogeneous platform have different computing powers, thus while executing message
-passing iterative synchronous applications, fast nodes have to wait for the slower ones to finish their
-computations before being able to synchronously communicate with them as in figure (\ref{fig:heter}).
-These periods are called idle or slack times.
-The algorithm takes into account this problem and tries to reduce these slack times when selecting the
-frequency scaling factors vector. At first, it selects initial frequency scaling factors that increase
-the execution times of fast nodes and minimize the differences between the computation times of
-fast and slow nodes. The value of the initial frequency scaling factor for each node is inversely
-proportional to its computation time that was gathered from the first iteration. These initial frequency
-scaling factors are computed as a ratio between the computation time of the slowest node and the
-computation time of the node $i$ as follows:
-\begin{equation}
- \label{eq:Scp}
- Scp_{i} = \frac{\max_{i=1,2,\dots,N}(Tcp_i)}{Tcp_i}
-\end{equation}
-Using the initial frequency scaling factors computed in (\ref{eq:Scp}), the algorithm computes
-the initial frequencies for all nodes as a ratio between the maximum frequency of node $i$
-and the computation scaling factor $Scp_i$ as follows:
-\begin{equation}
- \label{eq:Fint}
- F_{i} = \frac{Fmax_i}{Scp_i},~{i=1,2,\cdots,N}
-\end{equation}
-If the computed initial frequency for a node is not available in the gears of that node, the computed
-initial frequency is replaced by the nearest available frequency. In figure (\ref{fig:st_freq}),
-the nodes are sorted by their computing powers in ascending order and the frequencies of the faster
-nodes are scaled down according to the computed initial frequency scaling factors. The resulting new
-frequencies are colored in blue in figure (\ref{fig:st_freq}). This set of frequencies can be considered
-as a higher bound for the search space of the optimal vector of frequencies because selecting frequency
-scaling factors higher than the higher bound will not improve the performance of the application and
-it will increase its overall energy consumption. Therefore the algorithm that selects the frequency
-scaling factors starts the search method from these initial frequencies and takes a downward search direction
-toward lower frequencies. The algorithm iterates on all left frequencies, from the higher bound until all
-nodes reach their minimum frequencies, to compute their overall energy consumption and performance, and select
-the optimal frequency scaling factors vector. At each iteration the algorithm determines the slowest node
-according to the equation (\ref{eq:perf}) and keeps its frequency unchanged, while it lowers the frequency of
-all other nodes by one gear.
-The new overall energy consumption and execution time are computed according to the new scaling factors.
-The optimal set of frequency scaling factors is the set that gives the highest distance according to the objective
-function (\ref{eq:max}).
-
-The plots~(\ref{fig:r1} and \ref{fig:r2}) illustrate the normalized performance and consumed energy for an
-application running on a homogeneous platform and a heterogeneous platform respectively while increasing the
-scaling factors. It can be noticed that in a homogeneous platform the search for the optimal scaling factor
-should be started from the maximum frequency because the performance and the consumed energy is decreased since
-the beginning of the plot. On the other hand, in the heterogeneous platform the performance is maintained at
-the beginning of the plot even if the frequencies of the faster nodes are decreased until the scaled down nodes
-have computing powers lower than the slowest node. In other words, until they reach the higher bound. It can
-also be noticed that the higher the difference between the faster nodes and the slower nodes is, the bigger
-the maximum distance between the energy curve and the performance curve is while varying the scaling factors
-which results in bigger energy savings.
-\begin{figure}[t]
- \centering
- \includegraphics[scale=0.5]{fig/start_freq}
- \caption{Selecting the initial frequencies}
- \label{fig:st_freq}
-\end{figure}
-
-
-
-