X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/mpi-energy2.git/blobdiff_plain/c9875e5d70672da61c5cdc75fac90dad56cb1b04..dc41bd6acde81b12c7e48075d1a74dee1f30db73:/Heter_paper.tex

diff --git a/Heter_paper.tex b/Heter_paper.tex
index 2fc8549..a6d44b1 100644
--- a/Heter_paper.tex
+++ b/Heter_paper.tex
@@ -76,7 +76,19 @@
 \maketitle
 
 \begin{abstract}
-
+Green computing emphasizes the importance of energy conservation, minimizing the negative impact
+on the environment while achieving high performance and minimizing operating costs. Energy
+consumption in high performance clusters can be reduced with the dynamic voltage and frequency
+scaling (DVFS) technique, which lowers the frequency of a CPU. However, scaling the frequency down
+to low levels significantly degrades performance. Therefore, the selected frequencies
+must give the best possible tradeoff between the energy and the performance of the parallel program.
+
+In this paper, we present a new online heterogeneous scaling algorithm that selects the best vector
+of frequency scaling factors. These factors give the best tradeoff between the energy saving and the
+performance degradation. The algorithm has a small overhead and works without training or profiling.
+We also develop a new energy model for distributed iterative applications running on heterogeneous clusters.
+The proposed algorithm is evaluated on the SimGrid simulator with the NAS parallel benchmarks.
+It reduces the energy consumption by up to 35\% while limiting the performance degradation as much as possible.
 \end{abstract}
 
 \section{Introduction}
@@ -153,9 +165,9 @@ on CPUs. As an example of this works, Luley et al.
 cluster composed of Intel Xeon CPUs and NVIDIA GPUs. Their main goal is to determine the energy
 efficiency as a function of performance per watt; the best tradeoff is achieved when the
 performance per watt function is maximized. In the work of Kai Ma et al.
-~\cite{KaiMa_Holistic.Approach.to.Energy.Efficiency.in.GPU-CPU}, They developed a scheduling
+~\cite{KaiMa_Holistic.Approach.to.Energy.Efficiency.in.GPU-CPU}, they developed a scheduling
 algorithm that distributes workloads in proportion to the computing power of each node
-to be executed on a CPU or a GPU, emphasize all tasks must be finished in the same time.
+to be executed on CPU or GPU, with the emphasis that all tasks must finish at the same time.
 Recently, Rong et al.~\cite{Rong_Effects.of.DVFS.on.K20.GPU} showed that
 a DVFS-enabled heterogeneous cluster using GPUs and CPUs gave better energy and performance
 efficiency than other clusters composed of only CPUs.
@@ -910,7 +922,7 @@ down the frequencies of some nodes have less effect on the performance.
 
 \subsection{The results for different power consumption scenarios}
-
+\label{sec.compare}
 The results of the previous section were obtained while using processors that consume during computation
 an overall power composed of 80\% dynamic power and 20\% static power. In this section,
 these ratios are changed and two new power scenarios are considered in order to evaluate how the proposed
@@ -1040,15 +1052,29 @@ for a heterogeneous cluster composed of four different types of nodes having the
 table~(\ref{table:platform}), it takes on average \np[ms]{0.04} for 4 nodes and \np[ms]{0.15} for 144 nodes
 to compute the best scaling factors vector. The algorithm complexity is $O(F \cdot (N \cdot 4))$, where $F$ is the number
 of iterations and $N$ is the number of computing nodes. The algorithm needs from 12 to 20 iterations to select the best
-vector of frequency scaling factors that gives the results of the section (\ref{sec.res}).
+vector of frequency scaling factors that gives the results of sections (\ref{sec.res}) and (\ref{sec.compare}).
 \section{Conclusion}
 \label{sec.concl}
-
+In this paper, we have presented a new online heterogeneous scaling algorithm
+that selects the best possible vector of frequency scaling factors. This vector
+gives the maximum distance (optimal tradeoff) between the normalized energy and
+the performance curves. In addition, we developed a new energy model for measuring
+and predicting the energy of distributed iterative applications running over heterogeneous
+clusters. The proposed method was evaluated on the SimGrid/SMPI simulator, which was used to build a heterogeneous
+platform that executes the NAS parallel benchmarks. The results of the experiments showed the ability of
+the proposed algorithm to change its behaviour and select different scaling factors when
+the number of computing nodes and both the static and the dynamic powers are changed.
+
+In the future, we plan to improve this method so that it applies to asynchronous iterative applications,
+where each task does not wait for the other tasks to finish their work. This leads us to develop a new
+energy model for asynchronous iterative applications, where the number of iterations is not
+known in advance and depends on the global convergence of the iterative system.
 
 \section*{Acknowledgment}
 
+
 % trigger a \newpage just before the given reference
 % number - used to balance the columns on the last page
 % adjust value as needed - may need to be readjusted if