-Energy reduction in high performance clusters has recently been performed using the
-dynamic voltage and frequency scaling (DVFS) technique. DVFS is available
-in modern processors to scale down both the voltage and the frequency of
-the CPU while it is computing, in order to reduce its energy consumption. DVFS is
-also available in graphics processing units (GPUs) to achieve the same goal. Scaling
-the frequency down to its minimum levels to save more energy has a dramatic side effect:
-it produces a high percentage of performance degradation for
-parallel applications. Many researchers have used different strategies to solve this
-nonlinear problem, for example
-in~\cite{Hao_Learning.based.DVFS,Dhiman_Online.Learning.Power.Management}, but their methods
-add a large overhead to the algorithm that selects the suitable frequency.
-In this paper we present a method
-to find the optimal set of frequency scaling factors for a heterogeneous cluster that
-simultaneously optimizes both the energy and the execution time without adding a large
-overhead. This work extends our previous work on homogeneous clusters~\cite{Our_first_paper}.
-Therefore, we focus on works that concern heterogeneous clusters
-with DVFS enabled. In general, works on heterogeneous clusters fall into two categories:
-GPU-CPU heterogeneous clusters and CPU-CPU heterogeneous clusters. In GPU-CPU
-heterogeneous clusters, some parallel tasks are executed on GPUs and the others are executed
-on CPUs. As an example of these works, Luley et
-al.~\cite{Luley_Energy.efficiency.evaluation.and.benchmarking} proposed a heterogeneous
-cluster composed of Intel Xeon CPUs and NVIDIA GPUs. Their main goal was to determine the
-energy efficiency as a function of performance per watt; the best tradeoff is reached when the
-performance per watt function is maximized. Kai Ma et
-al.~\cite{KaiMa_Holistic.Approach.to.Energy.Efficiency.in.GPU-CPU} developed a scheduling
-algorithm that distributes workloads in proportion to the computing power of the nodes,
-to be executed on a CPU or a GPU, with the requirement that all tasks finish at the same time.
-Recently, Rong et al.~\cite{Rong_Effects.of.DVFS.on.K20.GPU} showed that
-heterogeneous clusters with DVFS enabled that use both GPUs and CPUs give better energy and performance
-efficiency than clusters composed only of CPUs.
-CPU-CPU heterogeneous clusters consist of a number of computing nodes, all of which are CPUs.
-Our work in this paper belongs to this type of cluster.
-As an example of these works, Naveen et al.~\cite{Naveen_Power.Efficient.Resource.Scaling}
-developed a policy that dynamically assigns the frequency in a heterogeneous cluster.
-Their goal is to minimize a fixed metric of $energy \times delay^2$, whereas our proposed method automatically
-optimizes the relation between the energy and the delay of iterative applications.
-In other works, such as Lizhe et al.~\cite{Lizhe_Energy.aware.parallel.task.scheduling},
-the algorithm divides the executed tasks into two types: critical and
-non-critical tasks. The algorithm scales down the frequency of the non-critical tasks
-as a function of the amount of slack and communication time that
-they have, with a maximum performance degradation of 10\%. In our method there is no
-fixed bound on the performance degradation; the bound is dynamically computed
-according to the energy and performance tradeoff relation of the executed application.
-Some approaches use a heterogeneous cluster composed of two different types
-of Intel and AMD processors, such as~\cite{Joshi_Blackbox.prediction.of.impact.of.DVFS}
-and \cite{Spiliopoulos_Green.governors.Adaptive.DVFS}. They predict both the energy
-and the performance for each frequency gear, and then select the gear that gives
-the best tradeoff. In contrast, our algorithm works over a heterogeneous platform composed of
-four different types of processors. Other approaches, such as
-\cite{Shelepov_Scheduling.on.Heterogeneous.Multicore} and \cite{Li_Minimizing.Energy.Consumption.for.Frame.Based.Tasks},
-select the best frequencies for a specified heterogeneous cluster offline using some
-heuristic methods, whereas our proposed algorithm works online during the execution of the
-iterative application. The greedy dynamic approach used by Chen et al.~\cite{Chen_DVFS.under.quality.of.service.requirements}
-minimizes the power consumption of heterogeneous servers with time/space complexity, but this approach
-has a considerable overhead. Our proposed scaling algorithm has a very small overhead and
-works without any prior analysis of the application's time complexity. The primary
-contributions of our paper are:
-\begin{enumerate}
-\item It presents a new online heterogeneous scaling algorithm which has a very small
- overhead and does not need any training or profiling.
-\item It develops a new energy model for iterative distributed applications running over
- heterogeneous clusters, taking into account the communication and slack times.
-\item The proposed scaling algorithm predicts both the energy and the execution time
- of the iterative application.
-\item It demonstrates a new optimization function which maximizes the performance and
- minimizes the energy consumption simultaneously.
-
-\end{enumerate}