From ddb7f4a3a78093372d3bfe5576b564c59f64488d Mon Sep 17 00:00:00 2001
From: afanfakh
Date: Thu, 24 Mar 2016 12:30:46 +0100
Subject: [PATCH 1/1] adding the first chapter

---
 CHAPITRE_01.tex | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/CHAPITRE_01.tex b/CHAPITRE_01.tex
index 92d38d9..38344d7 100644
--- a/CHAPITRE_01.tex
+++ b/CHAPITRE_01.tex
@@ -14,8 +14,9 @@
 Almost of the software applications are traditionally programmed as a sequential programs according to the Von Neumann report in 1993 \cite{ref50}. The structure of the program code is understandable by the human brain as a series of instructions that execute one after the other. From many years until a short time, the users of the sequential applications are moving their thinking towards that these applications must run faster with each new generation of microprocessors. This idea is no longer valid nowadays, because the recent release of the microprocessors have many computing units embedded in one chip and these programs are only run over one computing unit sequentially. Consequently, the traditional applications not have improved their performance a lot over the new architectures, whereas the new applications run faster over them in a parallel. The parallel application is executed over all the available computing units at the same time to improve its performance. Furthermore, the concurrency revolution has been referred to the drastically improvement in the performance of the new applications side by side to the new parallel architectures \cite{ref51}. Therefore, parallel applications and parallel architectures are closely tied together. It is hard to think about any of a parallel applications without thinking of the parallel hardware that executing them.
+For example, the energy consumption of the parallel system mainly depends on both of the parallel application and the parallel architecture executing this application.
 Indeed, the energy consumption model or any measurement system depends on many specifications, some of them are concerting the parallel platform such as the frequency of the processor, power consumption of the processor and communication model. The others are concerting the parallel application such as the computation and communication times of the application.
-In this work, the iterative parallel applications, which is the most popular type of the parallel applications, are interested and running them over different parallel architectures to optimize their energy consumptions is the goal.
+In this work, the iterative parallel applications, which is the most popular type of the parallel applications, are interested and running them over different parallel architectures to optimize their energy consumptions is the main goal.
 As a result, this chapter is aimed to give a brief overview for a parallel hardware architectures, parallel iterative applications and the energy model from the other authors used to measure the energy consumption of these applications.
 The reminder of this chapter is organized as follows: section \ref{ch1:2} is devoted
 to describe the types of parallelism and the types of the parallel platforms. It is also gives some information about the parallel programming models. Section \ref{ch1:3} explains both the synchronous and asynchronous parallel iterative methods and comparing them. Section \ref{ch1:4}, presents the well accepted energy model from the state of the art that can be used to measure the energy consumption of the parallel iterative applications when changing the frequency of the processor. Finally, section \ref{ch1:5} summaries this chapter.
@@ -26,7 +27,7 @@ to describe the types of parallelism and the types of the parallel platforms. I
 The process of the simultaneous application of the calculations is called the parallel computing.
 It has main principle refer to the ability of dividing the large problem into smaller sub-problems that can be solved at the same time \cite{ref2}. Mainly, solving the sub-problems of the main problem in a parallel computing are carried out on multiple parallel processors.
-Indeed, the parallel processors architecture is a computer system composed from many processing elements connected via network model in addition to the software tools required to make the processing units work together \cite{ref1}.
+Indeed, the parallel processors architecture is a computer system composed of many processing elements connected via network model in addition to the software tools required to make the processing units work together \cite{ref1}.
 Consequently, parallel computing architecture consist of software and hardware resources. The hardware resources are the processing units and the memory model in addition to the network system connecting them. The software resources include the specific operating system, the programming language and the compiler, or the runtime libraries. Furthermore, parallel computing can have different levels of parallelism, which can perform in software or hardware. There are five types of parallelism as follows:
 \begin{itemize}
@@ -239,13 +240,13 @@ Grid'5000 is dedicated as a test-bed for grid computing and thus users can book
 \subsection{Parallel programming Models}
-\label{ch1:2:2}.
+\label{ch1:2:2}
 There are many parallel programming languages and libraries have been developed to explore the computing power of the parallel architectures. In this section,
-the parallel computing programming languages are divided into two main types,
-which is the shared and the distributed models. Moreover, these two types are divided into two subcategories according to the support level to the number of computing units composing them.
+the parallel programming languages are divided into two main types,
+which is the shared and the distributed programming models.
Moreover, these two types are divided into two subcategories according to the support level for the number of computing units composing them.
 Figure \ref{fig:ch1:14} presents this classification hierarchy of the parallel programming
-paradigm. It is also show three parallel languages examples for each sub-category.
+models. It is also show three parallel languages examples for each subcategory.
 \begin{figure}[h!]
@@ -263,7 +264,8 @@ some examples for each type of the parallel programming models:
 \begin{itemize}
 \item \textbf{Local cluster programming models}
 \begin{itemize}
- \item \textbf{MPI} \cite{ref23} is the Message Passing Interface, is a standardization
+ \item \textbf{MPI} \cite{ref23} is the Message Passing Interface and it considers a
+ standardization
 dedicated for message passing in distributed memory environment. The first version of MPI designated by a group of researchers in 1991. It is a library, not a language and its subroutines
@@ -273,7 +275,7 @@ some examples for each type of the parallel programming models:
 Its library functions are not only for peer to peer operations throw send and receive messages, but it allowed many others collective operations such as gathering and reduction operations. MPI user feel
- free form the network topology, synchronization, and communication
+ free form the network topology, synchronization and communication
 functionality between group of processes. Furthermore, it has asynchronous point to point operations, which make the computations to overlap with communications. While MPI is not devoted to a grid,
@@ -365,7 +367,7 @@ some examples for each type of the parallel programming models:
 The difference between OpenMP and TBB, is the latter uses a task-based scheduling mechanism. Furthermore, TBB is more popular with C++ programming language than others languages. It is designed to work with any compiler environments, and thus
- it easily ported to a new platform.
Consequently, TBB has been ported to a
+ it is easily ported to a new platform. Consequently, TBB has been ported to a
 different types of operating systems and processors. While, it has limited support to vector processing architecture and then it connected with OpenMP and Cilk to support this platform.
@@ -380,7 +382,7 @@ some examples for each type of the parallel programming models:
 of core. According to this massively cores parallelism, the NVIDIA in 2007 developed a parallel programming language called CUDA , which is for Compute Unified Device Architecture. A CUDA program has two parts, the first one is called a host which is a
- set of threads that executed sequentially over the CPU. The second part is called the
+ set of threads that execute sequentially over the CPU. The second part is called the
 kernels, which are a set of a threads that can be executed in a parallel over the GPU.
 \item \textbf{OpenCL}\cite{ref38} is for Open Computing Language. It is a parallel
@@ -395,7 +397,7 @@ some examples for each type of the parallel programming models:
 \item \textbf{HLSL} \cite{ref39} is for High Level Shading Language, is the shader programming language for Direct3D, which is a part of Microsoft’s DirectX API. It supports the shader construction with
- C-like syntax, types, expressions, statements, and functions. It
+ C-like syntax, types, expressions, statements, and functions and it
 provides a graphical pipeline parallelism. The last version of HLSL is v5.0 for DirectX 11, which adds a new general-purpose GPU functions like CUDA. Recently, the new OpenCL version starts to replace CUDA
@@ -769,9 +771,9 @@ take into account the communication times in addition to computation times to mo
 \section{Conclusion}
 \label{ch1:5}
-In this chapter, we have presented in general different types of parallelism levels that can be implemented in a software and hardware techniques.
Furthermore, the types of the parallel architectures are demonstrated and classified according to how the computing units are connected to a memory model.
-The two parallel systems are described, which are the shared and distributed platforms. Depending on these two types, we have categorized the parallel programming models. The parallel iterative methods are explained and their two types, the synchronous and asynchronous iterative methods, are described. The synchronous iterative methods are well implemented over local homogeneous cluster with a high speed network link, while the asynchronous iterative methods are more conventional to implement over the distributed heterogeneous clusters.
-Consequently, running these two types of the parallel iterative methods over distributed platforms are interested in this work. The energy consumption model for measuring the energy consumption of the parallel applications from the related literature is described. This model cannot be used for all types of parallel architectures. It is assumed to measure the dynamic power during both communication and computation times, while the processor involved remains idle during the communication times and only consumes the static power. Moreover, it is not well adapted to the heterogeneous architectures when there are different
-types of the processors, which are consumed different dynamic and static powers.
+In this chapter, we have presented different types of parallelism levels that can be implemented in software and hardware techniques. Furthermore, the types of the parallel architectures are demonstrated and classified according to how the computing units are connected to a memory model.
+Both of the shared and distributed platforms are demonstrated and depending on them we have categorized the parallel programming models.
+The two types of parallel iterative methods, the synchronous and asynchronous iterative methods, are described.
The synchronous iterative methods are well implemented over local homogeneous cluster with a high speed network link, while the asynchronous iterative methods are more conventional to implement over the distributed heterogeneous clusters.
+The energy optimization of running these two types of the parallel iterative methods over distributed platforms is the objective of this work. Consequently, the energy consumption model used for measuring the energy consumption of the parallel applications from the related literature is described. This model cannot be used for all types of parallel architectures. Indeed, it assumes measuring the dynamic power during both communication and computation times, while the processor involved remains idle during the communication times and only consumes the static power. Moreover, it is not well adapted to the heterogeneous architectures when there are different types of the processors, which are consumed different dynamic and static powers.
-However, in the coming chapters of this thesis a new energy consumption models are developed, use for modeling and measuring the energies consumed by a parallel iterative methods running on both homogeneous and heterogeneous architectures.
+However, in the next chapters of this thesis a new energy consumption models are developed, which they use for modeling and measuring the energy consumptions by a parallel iterative methods running on both homogeneous and heterogeneous architectures.
--
2.39.5