+Energy reduction in high performance clusters has recently been pursued using the
+dynamic voltage and frequency scaling (DVFS) technique. DVFS is available in
+modern processors and scales down both the voltage and the frequency of
+the CPU while it is computing, in order to reduce the energy consumption. DVFS is
+also available in graphics processing units (GPUs), with the same goal. However,
+DVFS has a significant side effect when the frequency is scaled down to the minimum levels to gain more
+energy reduction: it produces a high percentage of performance degradation for
+parallel applications. Many researchers have used different strategies to solve this
+nonlinear problem, for example in~\cite{19,42}, but their methods add large overheads to
+the algorithm that selects the suitable frequency. In this paper we present a method
+to find the optimal set of frequency scaling factors for a heterogeneous cluster that
+simultaneously optimizes both the energy and the execution time without adding a large
+overhead. This work extends our previous work on homogeneous clusters~\cite{45}.
+Therefore, we focus here on works that address heterogeneous clusters
+with DVFS enabled. In general, works on heterogeneous clusters fall into two categories:
+GPUs-CPUs heterogeneous clusters and CPUs-CPUs heterogeneous clusters. In GPUs-CPUs
+heterogeneous clusters, some parallel tasks are executed on GPUs and the others are executed
+on CPUs. As an example of these works, Luley et al.~\cite{51} proposed a heterogeneous
+cluster composed of Intel Xeon CPUs and NVIDIA GPUs. Their main goal was to determine the
+energy efficiency as a function of performance per watt; the best tradeoff is obtained when the
+performance per watt function is maximized. Kia Ma et al.~\cite{49}
+developed a scheduling algorithm that distributes workloads in proportion
+to the computing power of each node, to be executed on a CPU or a GPU, with the requirement that all tasks
+finish at the same time.
+More recently, Rong et al.~\cite{50} showed that heterogeneous clusters with
+DVFS enabled that use both GPUs and CPUs give better energy and performance efficiency than clusters
+composed of only CPUs. CPUs-CPUs heterogeneous clusters consist of a number of computing
+nodes that are all of the CPU type. Our work in this paper belongs to this type of
+cluster. As an example of these works, Naveen et al.~\cite{52} developed a
+policy to dynamically assign frequencies in a heterogeneous cluster. Their goal was to
+minimize a fixed metric of $\mathit{energy} \times \mathit{delay}^2$. In contrast, our proposed method automatically
+optimizes the relation between the energy and the delay of iterative applications.
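+For reference, writing $E$ for the energy consumed and $D$ for the execution time (delay),
+which is notation assumed here for illustration and not necessarily that of~\cite{52},
+an energy-delay metric of this kind has the general form
+\[
+  M_n = E \times D^{n},
+\]
+where $n = 2$ gives the metric above. The weight given to delay relative to energy is therefore
+fixed in advance by the exponent $n$ rather than adapted to the executed application.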
+In other works, such as Lizhe et al.~\cite{53}, the algorithm divides the executed tasks into
+two types: critical and non-critical tasks. The algorithm scales down the frequency of
+the non-critical tasks as a function of the amount of slack and communication time that
+they have, with a maximum performance degradation of 10\%. In our method there is no
+fixed bound on the performance degradation percentage; the bound is computed dynamically
+according to the energy and performance tradeoff relation of the executed application.
+Some approaches use a heterogeneous cluster composed of two different types
+of processors, Intel and AMD, such as~\cite{54} and~\cite{55}. They predict both the energy
+and the performance for each frequency gear, and then the algorithm selects the gear that gives
+the best tradeoff. In contrast, our algorithm works on a heterogeneous platform composed of
+four different types of processors. Other approaches, such as~\cite{56} and~\cite{57},
+select the best frequencies for a specified heterogeneous cluster offline using
+heuristic methods, whereas our proposed algorithm works online during the execution of
+the iterative application. The greedy dynamic approach used by Chen et al.~\cite{58} minimizes
+the power consumption of heterogeneous servers taking time/space complexity into account, but this approach
+has a considerable overhead. In contrast, our proposed scaling algorithm has a very small overhead and
+works without any prior analysis of the application's time complexity.