\begin{itemize}
\item \textbf{Multi-core processors}:
The multi-core processor is a single chip component with two or more processing units.
-These processing units are called cores, which connected with each other via main memory model as in the figure \ref{fig:ch1:10}. Each individual core has its cache memory to store its data and execute different data or instructions stream in parallel. Moreover, each core can have one or more threads to execute a specific programming task as shown in the thread-level parallelism. Historically, the multi-cores of the CPU began as two-core processors, with increase in the number of cores approximately by double with each semiconductor process generation \cite{ref12}. The very quick improvements in the performance and thus the increase in the number of cores is devoted in the graphical processing unit (GPU). A current exemplar of GPU is the NVIDIA GeForce TITAN Z with 5700 cores in year of 2015 \cite{ref17}. While the general-purpose microprocessors (CPU) has less number of the cores, for example the TILE-MX processor from Tilera had 100 cores in the same year \cite{ref16}.
+These processing units are called cores, which are connected with each other via a main memory model as in Figure \ref{fig:ch1:10}. Each individual core has its own cache memory to store its data and executes different data or instruction streams in parallel. Moreover, each core can have one or more threads to execute a specific programming task as shown in thread-level parallelism. Historically, multi-core CPUs began as two-core processors, with the number of cores approximately doubling with each semiconductor process generation \cite{ref12}. The fastest improvements in performance, and thus in the number of cores, have occurred in the graphics processing unit (GPU). A current example of a GPU is the NVIDIA GeForce TITAN Z with 5700 cores in 2015 \cite{ref17}. In contrast, general-purpose microprocessors (CPUs) have fewer cores; for example, the TILE-MX processor from Tilera has 100 cores in the same year \cite{ref16}.
For more details about multi-core processors, see \cite{ref15}.
\begin{figure}[h!]
Grid'5000 is dedicated as a test-bed for grid computing, and thus users can book the required nodes from different sites. It allows users to deploy their own configured operating system image over the reserved nodes. Many software tools are available to control and manage the reservation and deployment processes from the user's local machine. For example, OAR \cite{ref22} is a batch scheduler used to manage the heterogeneous resources of Grid'5000.
-
\subsection{Parallel programming Models}
\label{ch1:2:2}
Many parallel programming languages and libraries have been developed
to exploit the computing power of parallel architectures. In this section,
the parallel programming languages are divided into two main types,
-which is the shared and the distributed programming models. Moreover, these two types are divided into two subcategories according to the support level for the number of computing units composing them.
+which are the shared and the distributed programming models. Moreover, each type is divided into two subcategories according to its level of support for the number of computing units composing the parallel platform.
Figure \ref{fig:ch1:14} presents this classification hierarchy of the parallel programming
models. It also shows three example parallel languages for each subcategory.
\section{Conclusion}
\label{ch1:5}
In this chapter, three sections have been presented for describing the parallel hardware architectures, parallel iterative applications and the energy consumption model used to measure the energies of these applications.
-In the first section, different types of parallelism levels that can be implemented in a software and hardware techniques have explained. Furthermore, the types of the parallel architectures are demonstrated and classified according to how the computing units are connected to a memory model.
+In the first section, the different levels of parallelism that can be implemented with software and hardware techniques have been explained. Furthermore, the types of parallel architectures have been presented and classified according to how their computing units are connected to a memory model.
Both the shared and distributed platforms have been presented, and the parallel programming models have been categorized according to them.
-In the second section, the two types parallel iterative methods are described as synchronous and asynchronous iterative methods. The synchronous iterative methods are well implemented over local homogeneous cluster with a high speed network link, while the asynchronous iterative methods are more conventional to implement over the distributed heterogeneous clusters.
-Finally in the third section, the energy consumption model used for measuring the energy consumption of the parallel applications from the related literature is described. This model cannot be used for all types of parallel architectures. Indeed, it assumes measuring the dynamic power during both of the communication and computation times, while the processor involved remains idle during the communication times and only consumes the static power. Moreover, it is not well adapted to heterogeneous architectures when there are different types of the processors, which are consumed different dynamic and static powers at the same time.
+In the second section, the two types of parallel iterative methods are described as synchronous and asynchronous iterative methods. The synchronous iterative methods are well suited to a local homogeneous cluster with a high-speed network link, while the asynchronous iterative methods are more suitable for distributed heterogeneous clusters.
+Finally, in the third section, an energy consumption model from the related literature, used for measuring the energy consumption of parallel applications, has been described. This model cannot be used for all types of parallel architectures. Indeed, it assumes that the dynamic power is measured during both the communication and computation times, whereas the processor involved remains idle during the communication times and only consumes the static power. Moreover, it is not well adapted to heterogeneous architectures composed of different types of processors, which consume different dynamic and static powers at the same time.
In the next chapters of this thesis, new energy consumption models are developed and
used for modeling and measuring the energy consumption of parallel iterative methods running on both homogeneous and heterogeneous architectures. Furthermore, these energy models are used in methods for optimizing both the energy consumption and the performance of iterative message passing applications.
\ No newline at end of file