-DVFS is a technique used in modern processors to scale down both the voltage and
-the frequency of the CPU while computing, in order to reduce the energy
-consumption of the processor. DVFS is also available on GPUs to achieve the same
-goal. Reducing the frequency of a processor lowers its number of FLOPS and might
-degrade the performance of the application running on that processor, especially
-if it is compute bound. Therefore, selecting the appropriate frequency for a
-processor to satisfy some objectives, while taking into account all the
-constraints, is not a trivial operation. Many researchers have used different
-strategies to tackle this problem. Some of them developed online methods that
-compute the new frequency while executing the application, such
-as~\cite{Hao_Learning.based.DVFS,Spiliopoulos_Green.governors.Adaptive.DVFS}.
-Others used offline methods that might need to run the application and profile
-it before selecting the new frequency, such
-as~\cite{Rountree_Bounding.energy.consumption.in.MPI,Cochran_Pack_and_Cap_Adaptive_DVFS}.
-The methods can be heuristic, exact or brute-force, and they target
-varied objectives such as energy reduction or performance. They can also be
-adapted to the execution environment and the type of application:
-sequential, parallel or distributed architecture, homogeneous or heterogeneous
-platform, synchronous or asynchronous application, \dots{}
-
-In this paper, we are interested in reducing the energy consumption of message
-passing iterative synchronous applications running over heterogeneous platforms.
-Several works have already addressed such platforms, which can be classified
-into two types:
-\begin{itemize}
-\item the platform is composed of homogeneous GPUs and homogeneous CPUs.
-\item the platform is only composed of heterogeneous CPUs.
-\end{itemize}
-
-For the first type of platform, the compute-intensive parallel tasks are
-executed on the GPUs and the rest are executed on the CPUs. Luley et
-al.~\cite{Luley_Energy.efficiency.evaluation.and.benchmarking} proposed a
-heterogeneous cluster composed of Intel Xeon CPUs and NVIDIA GPUs. Their main
-goal was to maximize the energy efficiency of the platform during computation by
-maximizing the number of FLOPS per watt.
-In~\cite{KaiMa_Holistic.Approach.to.Energy.Efficiency.in.GPU-CPU}, Kai Ma et
-al. developed a scheduling algorithm that distributes workloads proportionally
-to the computing power of the nodes, each of which can be a GPU or a CPU. All
-the tasks must be completed at the same time.
-In~\cite{Rong_Effects.of.DVFS.on.K20.GPU}, Rong et al. showed that a
-heterogeneous cluster (GPUs and CPUs) with DVFS enabled gave better energy and
-performance efficiency than clusters composed only of CPUs.
-
-The work presented in this paper concerns the second type of platform, with
-heterogeneous CPUs. Many methods have been proposed to reduce the energy
-consumption of this type of platform. Naveen et
-al.~\cite{Naveen_Power.Efficient.Resource.Scaling} developed a method that
-minimizes the value of $\mathit{energy}\times \mathit{delay}^2$ (the delay is
-the sum of slack times that happen during synchronous communications) by
-dynamically assigning new frequencies to the CPUs of the heterogeneous cluster.
-Lizhe et al.~\cite{Lizhe_Energy.aware.parallel.task.scheduling} proposed an
-algorithm that divides the executed tasks into two types: critical and
-non-critical tasks. The algorithm scales down the frequency of non-critical
-tasks proportionally to their slack and communication times while limiting the
-performance degradation to less than \np[\%]{10}.
-In~\cite{Joshi_Blackbox.prediction.of.impact.of.DVFS}, the authors developed a
-heterogeneous cluster composed of two types of Intel and AMD processors. They
-used a gradient method to predict the impact of DVFS operations on performance.
-In~\cite{Shelepov_Scheduling.on.Heterogeneous.Multicore} and
-\cite{Li_Minimizing.Energy.Consumption.for.Frame.Based.Tasks}, the best
-frequencies for a specified heterogeneous cluster are selected offline using
-a heuristic. Chen et
-al.~\cite{Chen_DVFS.under.quality.of.service.requirements} used a greedy dynamic
-programming approach to minimize the power consumption of heterogeneous servers
-while respecting given time constraints. This approach had considerable
-overhead. In contrast to the above-described papers, this paper presents the
-following contributions:
-\begin{enumerate}
-\item two new energy and performance models for message passing iterative
- synchronous applications running over a heterogeneous platform. Both models
- take into account communication and slack times. The models can predict the
- required energy and the execution time of the application.
-
-\item a new online frequency-selection algorithm for heterogeneous
- platforms. The algorithm has a very small overhead and does not need any
- training or profiling. It uses a new optimization function which
- simultaneously maximizes the performance and minimizes the energy consumption
- of a message passing iterative synchronous application.
-
-\end{enumerate}