X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/mpi-energy2.git/blobdiff_plain/c0e417f74debcd978ae5229e414981fc3c45a19a..c911c7db174d7bc2cd2a2893deceb956fff0e02b:/Heter_paper.tex diff --git a/Heter_paper.tex b/Heter_paper.tex index 0cc9720..2f78db5 100644 --- a/Heter_paper.tex +++ b/Heter_paper.tex @@ -568,32 +568,52 @@ maximum distance between the energy curve and the performance curve is, which re \section{Experimental results} \label{sec.expe} +While in~\cite{mpi-energy2} the energy model and the scaling factors selection algorithm were applied to a heterogeneous cluster and evaluated over the SimGrid simulator~\cite{SimGrid.org}, +in this paper real experiments were conducted over the grid'5000 platform. \subsection{Grid'5000 architature and power consumption} \label{sec.grid5000} -The grid'5000 is a large-scale testbed found in France \cite{grid5000}. -The grid infrastructure consist of ten sites distributed over all France -metropolitan regions. Each site in the grid'5000 composed from number of heterogeneous -computing clusters, while each cluster includes a collection of homogeneous nodes. -In general, the grid'5000 had one thousand of heterogeneous nodes and eight thousand of cores. -All the sites are connected together via special long distance network called RENATER, -which is the French National Telecommunication Network for Technology. Whereas inside each site -the clusters and their nodes are connected throw high speed local area networks. -There are different types of local networks used such as Ethernet and Infiniband netwoks, -which allowed different gigabits bandwidth and latencies. On the other hand, the nodes inside each cluster -are homogeneous, while they are different from the nodes of the other clusters. Therefore, there are -a wide diversity of processors in grid'5000, that mainly had different processors families -such as Intel Xeon and AMD Opteron families. 
- -In this paper we are interested to run NAS parallel v3.3 \cite{NAS.Parallel.Benchmarks} over grid'5000. -We are used seven benchmarks, CG, MG, EP, LU, BT, SP and FT. These benchmarks used seven different types of classes. -These classes are S, W, A, B, C, D, E, where S represents the smaller problem size that used by benchmark and -E is represents the biggest class. In this work, the class D is used for all benchmarks in all the experiments that will -be showed in the coming sections. -Moreover, the NAS parallel benchmarks have different computations and communications ratios, then it is interested -to study their energy consumption and their performance on real testbed such as grid'5000. -In this work, the NAS benchmarks are executed over two sites, Lyon and Nancy sites, of grid'5000. -These two sites had seven different types of computing clusters as in figure (\ref{fig:grid5000}). +Grid'5000~\cite{grid5000} is a large-scale testbed that consists of ten sites distributed over all metropolitan France and Luxembourg. All the sites are connected together via a special long distance network called RENATER, +which is the French National Telecommunication Network for Technology. +Each site of the grid is composed of few heterogeneous +computing clusters and each cluster contains many homogeneous nodes. In total, + grid'5000 has about one thousand heterogeneous nodes and eight thousand cores. In each site, +the clusters and their nodes are connected via high speed local area networks. +Two types of local networks are used, Ethernet or Infiniband networks which have different characteristics in terms of bandwidth and latency. + +Since grid'5000 is dedicated for testing, contrary to production grids it allows a user to deploy its own customized operating system on all the booked nodes. The user could have root rights and thus apply DVFS operations while executing a distributed application. 
Moreover, the grid'5000 testbed provides at some sites a power measurement tool to capture +the power consumption for each node in those sites. The measured power is the overall consumed power by all the components of a node at a given instant, such as CPU, hard drive, main-board, memory, ... For more details refer to +\cite{Energy_measurement}. To just measure the CPU power of one core in a node $j$, + firstly, the power consumed by the node while being idle at instant $y$, noted as $\Pidle[jy]$, was measured. Then, the power was measured while running a single thread benchmark with no communication (no idle time) over the same node with its CPU scaled to the maximum available frequency. The latter power measured at time $x$ with maximum frequency for one core of node $j$ is noted $P\max[jx]$. The difference between the two measured power consumption represents the +dynamic power consumption of that core with the maximum frequency, see figure(\ref{fig:power_cons}). + +\textcolor{red}{why maximum and minimum, change peak in the equation and the figure} + +The dynamic power $\Pd[j]$ is computed as in equation (\ref{eq:pdyn}) +\begin{equation} + \label{eq:pdyn} + \Pd[j] = \max_{x=\beta_1,\dots \beta_2} (P\max[jx]) - \min_{y=\Theta_1,\dots \Theta_2} (\Pidle[jy]) +\end{equation} + +where $\Pd[j]$ is the dynamic power consumption for one core of node $j$, +$\lbrace \beta_1,\beta_2 \rbrace$ is the time interval for the measured peak power values, +$\lbrace\Theta_1,\Theta_2\rbrace$ is the time interval for the measured idle power values. +Therefore, the dynamic power of one core is computed as the difference between the maximum +measured value in peak powers vector and the minimum measured value in the idle powers vector. + +On the other hand, the static power consumption by one core is a part of the measured idle power consumption of the node. 
Since in grid'5000 there is no way to measure precisely the consumed static power and in~\cite{Our_first_paper,pdsec2015,Rauber_Analytical.Modeling.for.Energy} it was assumed that the static power represents a ratio of the dynamic power, the value of the static power is assumed as \np[\%]{20} of dynamic power consumption of the core. + +In the experiments presented in the following sections, two sites of grid'5000 were used, Lyon and Nancy sites. These two sites have in total seven different clusters as in figure (\ref{fig:grid5000}). + +Four clusters from the two sites were selected in the experiments: one cluster from +Lyon's site, Taurus cluster, and three clusters from Nancy's site, Graphene, +Griffon and Graphite. Each one of these clusters has homogeneous nodes inside, while nodes from different clusters are heterogeneous in many aspects such as: computing power, power consumption, available +frequency ranges and local network features: the bandwidth and the latency. Table \ref{table:grid5000} shows +the detailed characteristics of these four clusters. Moreover, the dynamic powers were computed using the equation (\ref{eq:pdyn}) for all the nodes in the +selected clusters and are presented in table \ref{table:grid5000}. + + + \begin{figure}[!t] \centering @@ -602,12 +622,22 @@ These two sites had seven different types of computing clusters as in figure (\r \label{fig:grid5000} \end{figure} -Four clusters from the two sites are selected in the experiments, one cluster from -Lyon site, Taurus cluster, and three clusters from Nancy site where are Graphene, -Griffon and Graphite. Each one of these clusters has homogeneous nodes inside, while their nodes are -different from the nodes of other clusters in many aspects such as: computing power, power consumption, available -frequencies ranges and the network features, the bandwidth and the latency. The Table \ref{table:grid5000} shows -the details characteristics of these four clusters. 
+ +The energy model and the scaling factors selection algorithm were applied to the NAS parallel benchmarks v3.3 \cite{NAS.Parallel.Benchmarks} and evaluated over grid'5000. +The benchmark suite contains seven applications: CG, MG, EP, LU, BT, SP and FT. These applications have different computations and communications ratios and strategies which make them good testbed applications to evaluate the proposed algorithm and energy model. +The benchmarks have seven different classes, S, W, A, B, C, D and E, that represent the size of the problem that the method solves. In this work, the class D was used for all benchmarks in all the experiments presented in the next sections. + + + + +\begin{figure}[!t] + \centering + \includegraphics[scale=0.6]{fig/power_consumption.pdf} + \caption{The power consumption by one core from Taurus cluster} + \label{fig:power_cons} +\end{figure} + + \begin{table}[!t] @@ -640,44 +670,7 @@ the details characteristics of these four clusters. \label{table:grid5000} \end{table} -The grid'5000 testbed provided some monitoring and measurements features to captured -the power consumption values for each node in any cluster of Lyon and Nancy sites. -The power consumed for each node from the selected four clusters is measured. -While the power consumed by any computing node is a collection of the powers consumed by -hard drive, main-board, memory and node's computing cores, for more detail refer to -\cite{Energy_measurement}. Therefore, the dynamic power consumed -by one core is not allowed to measured alone. To overcome this problem, firstly, -we measured the power consumed by one node when there is no computation, when -the CPU is in the idle state. The second step, we run EP benchmark, there is no communications -in this benchmarks, over one core with maximum frequency of the desired node and -capturing the power consumed by a node, this representing the peak power of the node with one core. 
-The difference between the peak power and the idle power representing the -dynamic power consumption of that core with maximum frequency, for example see figure(\ref{fig:power_cons}). -The $\Ppeak[jx]$ is the peak power value in time $x$ with maximum frequency for one core of node $j$, -and $\Pidle[jy]$ is the idle power value in time $y$ for the one core of the node $j$ . -The dynamic power $\Pd[j]$ is computed as in equation (\ref{eq:pdyn}) -\begin{equation} - \label{eq:pdyn} - \Pd[j] = \max_{x=\beta_1,\dots \beta_2} (\Ppeak[jx]) - \min_{y=\Theta_1,\dots \Theta_2} (\Pidle[jy]) -\end{equation} -where $\Pd[j]$ is the dynamic power consumption for one core of node $j$, -$\lbrace \beta_1,\beta_2 \rbrace$ is the time interval for the measured peak power values, -$\lbrace\Theta_1,\Theta_2\rbrace$ is the time interval for the measured idle power values. -Therefore, the dynamic power of one core is computed as the difference between the maximum -measured value in peak powers vector and the minimum measured value in the idle powers vector. -We are computed the dynamic powers, using the equation (\ref{eq:pdyn}), for all nodes in the -selected clusters, which is recorded in table \ref{table:grid5000}. -On the other side, the static power consumption by one core is embedded with whole idle power consumption of the node. -Indeed, the static power is represents as ratio from dynamic power. So, we supposed -the static power consumption represented as \np[\%]{20} of dynamic power consumption of the core, -the same assumption was made in \cite{Our_first_paper,pdsec2015,Rauber_Analytical.Modeling.for.Energy}. -\begin{figure}[!t] - \centering - \includegraphics[scale=0.6]{fig/power_consumption.pdf} - \caption{The power consumption by one core from Taurus cluster} - \label{fig:power_cons} -\end{figure} \subsection{The experimental results of the scaling algorithm}