- \textcolor{red}{
-The proposed scaling algorithm selects smaller frequencies in the two sites scenario,
-because the computations to communications ratio decreases when the number of nodes is increased,
-which leads to a smaller performance degradation percentage.
-In contrast, the performance degradation percentage for the benchmarks running on one site with
-16 or 32 nodes is on average equal to 3\% or 10\% respectively.
-This is the inverse of the two sites scenario, where increasing the number of computing nodes
-decreases the performance degradation percentage. So, doubling the number of computing
-nodes when the communications occur over a high speed network does not decrease the computations to
-communications ratio. Moreover, as shown in Figure \ref{fig:time_sen}, the execution time of the one site scenario with 32 nodes
-is approximately half that of the one site scenario with 16 nodes for most of the benchmarks, i.e. a linear speed-up.
-This increases the number of critical nodes, any one of which may increase the overall execution time of the benchmarks.
-The EP benchmark gives the largest performance degradation ratio, because it has no
-communications and no slack times, so its performance is governed by its computation time alone.
-The tradeoff between these scenarios can be computed using the tradeoff function \ref{eq:max}.
-Figure \ref{fig:dist} presents the tradeoff distance for all benchmarks over all
-platform scenarios. The one site scenarios with 16 and 32 nodes have the best tradeoff distances
-compared to the two sites scenarios, due to the increase or decrease in the communications mentioned before.
-The one site scenario with 16 nodes is the best scenario in terms of energy and performance tradeoff,
-which on average is up to 26\%. Therefore, the tradeoff distance is related linearly to the energy saving
-percentage. Finally, the best energy and performance tradeoff depends on all of the following:
-1) the computations to communications ratio when there are communications and slack times, 2) the differences in computing powers
-between the computing nodes and 3) the differences in the static and dynamic powers of the nodes.}
-\subsection{The experimental results over multicore clusters}
-\label{sec.res-mc}
-The Grid'5000 clusters have different numbers of cores embedded in their nodes,
-as shown in Table \ref{table:grid5000}. Moreover, the cores of each node are
-connected via a shared memory model: the data transfer between the cores' local
-memories is achieved via the global memory \cite{rauber_book}. Therefore, in
-this section the proposed scaling algorithm is applied over the Grid'5000
-clusters using multiple cores in the selected nodes, with the same
-two platform scenarios that were described in Section \ref{sec.res}.
-The two platform scenarios, the two sites and one site scenarios with 32
-nodes, are reconfigured to use multicores for each node. For example, if
-12 nodes from a certain cluster participate in the one core scenario,
-then in the multicores scenario 3 nodes are selected from that cluster and
-4 cores are used on each of them, giving 12 cores. These scenarios with one
-core and multicores are listed in Table \ref{table:sen-mc}.
-The energy consumptions and execution times of the NAS parallel
-benchmarks, class D, over these four different scenarios are presented
-in Figures \ref{fig:eng-cons-mc} and \ref{fig:time-mc} respectively.
-The execution times of the NAS benchmarks over the one site multicores scenario
-are higher than those over the one site one core scenario.
-The reason is that in the one site multicores scenario the communications increase significantly,
-and all the cores of a node share the same network link, which increases
-the communication times. In contrast, the execution times of the NAS benchmarks over
-the two sites multicores scenario are lower than those over the two
-sites one core scenario, because using multicores decreases the communications.
-As explained previously, the cores share the same node link, but the communication times between the cores
-are still lower than the communication times between nodes over the long distance
-networks, and thus the overall execution time decreases. Generally, executing
-the NAS benchmarks over the one site one core scenario gives smaller execution times
-than the other scenarios. This is because each node in this scenario has its own
-dedicated network link, used independently by one core, while in the other
-scenarios the communication times are higher due to either the long distance communication
-links or the link shared between the cores of each node.
-On the other hand, the energy consumption of the NAS benchmarks over the
-one site one core scenario is lower than over the one site multicores scenario, because
-the former has a smaller execution time, as mentioned before. Also, in the
-one site one core scenario the computations to communications ratio is
-higher, so the new scaled frequencies decrease the dynamic energy
-consumption, which decreases exponentially
-with the new frequency scaling factors. These experiments also show that the energy
-consumptions and the execution times of the EP and MG benchmarks do not change much over these four
-scenarios, because they have no or few communications
-that could increase or decrease the static power consumption.
-The other benchmarks show that their energy consumptions and execution times
-change according to the decrease or increase in the communication
-times, which differ from one scenario to another and with the amount of
-communications in each benchmark.
-
-The energy saving percentages of all NAS benchmarks running over these four scenarios
-are presented in Figure \ref{fig:eng-s-mc}. The figure
-shows that the energy saving percentages of the NAS benchmarks over the two sites multicores scenario are higher
-than over the two sites one core scenario, because the computation
-times in this scenario are higher than in the other one, so a larger reduction in the
-dynamic energy can be obtained, as mentioned previously. In contrast, in the one site one
-core and one site multicores scenarios the energy saving percentages
-are approximately equivalent: on average they are up to 25\%. In both of these scenarios there is only a small difference in the
-computations to communications ratio, which leads the proposed scaling algorithm
-to select frequencies proportional to these ratios and to keep
-the energy saving percentages as close as possible. The
-performance degradation percentages of the NAS benchmarks are presented in
-Figure \ref{fig:per-d-mc}. This figure indicates that the performance
-degradation percentage of the NAS benchmarks over the two sites
-multicores scenario, on average equal to 7\%, is higher
-than over the two sites one core scenario, on average equal to 4\%.
-Indeed, using the two sites multicores scenario increases
-the computations to communications ratio, which may increase
-the overall execution time when the proposed scaling algorithm is applied and scales down the frequencies.
-The inverse happens for the one site scenarios: the performance degradation percentages
-of the benchmarks executed over the one site one core scenario, on average
-equal to 10\%, are higher than those executed over the one site multicores scenario,
-on average equal to 7\%. In the one site
-multicores scenario the computations to communications ratio is decreased,
-as mentioned before, so selecting new frequencies does not increase
-the overall execution time. The tradeoff distances of all NAS
-benchmarks over all scenarios are presented in Figure \ref{fig:dist-mc}.
-These tradeoff distances are used to verify which scenario is the best in terms of
-energy and performance ratio. The one site multicores scenario is the best scenario in terms of
-energy and performance tradeoff, on average equal to 17.6\%, compared to the one site one core
-scenario, on average equal to 15.3\%. The one site multicores scenario
-has the same energy saving percentage as the one site one core scenario but
-with less performance degradation. The two sites multicores scenario gives a better
-energy and performance tradeoff, on average equal to 14.7\%, than the two sites
-one core scenario, on average equal to 13.3\%.
-Finally, using multicores in both scenarios increases the energy and performance tradeoff
-distance. This is generally because using multicores increases the computations to communications
-ratio in the two sites scenario, so the energy saving percentage increases more than the performance degradation percentage, whereas this ratio decreases
-in the one site scenario, so the performance degradation percentage decreases more than the energy saving percentage.
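-As a rough consistency check of these averages, and assuming that the tradeoff distance behaves
-approximately like the difference between the energy saving percentage and the performance degradation
-percentage (an assumption made here for illustration; the exact definition is the tradeoff function \ref{eq:max}),
-the one site averages reported above give
-\[
-25\% - 7\% = 18\% \approx 17.6\% \quad \mbox{(one site multicores)}, \qquad
-25\% - 10\% = 15\% \approx 15.3\% \quad \mbox{(one site one core)} .
-\]
-The average energy saving percentages of the two sites scenarios are not stated in this section, so the same check is not made for them.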
-
-
-
-
-
-\begin{table}[]
-\centering
-\caption{The multicores scenarios}
-
-\begin{tabular}{|*{4}{c|}}
-\hline
-Scenario name & Cluster name & \begin{tabular}[c]{@{}c@{}}No. of nodes\\ in each cluster\end{tabular} &
- \begin{tabular}[c]{@{}c@{}}No. of cores\\ for each node\end{tabular} \\ \hline
-\multirow{3}{*}{Two sites/ one core} & Taurus & 10 & 1 \\ \cline{2-4}
- & Graphene & 10 & 1 \\ \cline{2-4}
- & Griffon & 12 & 1 \\ \hline
-\multirow{3}{*}{Two sites/ multicores} & Taurus & 3 & 3 or 4 \\ \cline{2-4}
- & Graphene & 3 & 3 or 4 \\ \cline{2-4}
- & Griffon & 3 & 4 \\ \hline
-\multirow{3}{*}{One site/ one core} & Graphite & 4 & 1 \\ \cline{2-4}
- & Graphene & 12 & 1 \\ \cline{2-4}
- & Griffon & 12 & 1 \\ \hline
-\multirow{3}{*}{One site/ multicores} & Graphite & 3 & 3 or 4 \\ \cline{2-4}
- & Graphene & 3 & 3 or 4 \\ \cline{2-4}
- & Griffon & 3 & 4 \\ \hline
-\end{tabular}
-\label{table:sen-mc}
-\end{table}