X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/hpcc2014.git/blobdiff_plain/5b2dbb0b083f40235d97c8e8ee2b5834dd2096cc..b9756213bfa3fbf7c5667385f5070b003b9bf0b3:/hpcc.tex?ds=sidebyside diff --git a/hpcc.tex b/hpcc.tex index 0dccef3..abcf399 100644 --- a/hpcc.tex +++ b/hpcc.tex @@ -45,14 +45,14 @@ \begin{document} -\title{Simulation of Asynchronous Iterative Numerical Algorithms Using SimGrid} +\title{Simulation of Asynchronous Iterative Algorithms Using SimGrid} \author{% \IEEEauthorblockN{% Charles Emile Ramamonjisoa\IEEEauthorrefmark{1}, + Lilia Ziane Khodja\IEEEauthorrefmark{2}, David Laiymani\IEEEauthorrefmark{1}, - Arnaud Giersch\IEEEauthorrefmark{1}, - Lilia Ziane Khodja\IEEEauthorrefmark{2} and + Arnaud Giersch\IEEEauthorrefmark{1} and Raphaël Couturier\IEEEauthorrefmark{1} } \IEEEauthorblockA{\IEEEauthorrefmark{1}% @@ -71,34 +71,20 @@ \maketitle -\RC{Ordre des auteurs pas définitif.} \begin{abstract} -\AG{L'abstract est AMHA incompréhensible et ne donne pas envie de lire la suite.} -In recent years, the scalability of large-scale implementation in a -distributed environment of algorithms becoming more and more complex has -always been hampered by the limits of physical computing resources -capacity. One solution is to run the program in a virtual environment -simulating a real interconnected computers architecture. The results are -convincing and useful solutions are obtained with far fewer resources -than in a real platform. However, challenges remain for the convergence -and efficiency of a class of algorithms that concern us here, namely -numerical parallel iterative algorithms executed in asynchronous mode, -especially in a large scale level. Actually, such algorithm requires a -balance and a compromise between computation and communication time -during the execution. Two important factors determine the success of the -experimentation: the convergence of the iterative algorithm on a large -scale and the execution time reduction in asynchronous mode. Once again, -from the current work, a simulated environment like SimGrid provides -accurate results which are difficult or even impossible to obtain in a -physical platform by exploiting the flexibility of the simulator on the -computing units clusters and the network structure design. Our -experimental outputs showed a saving of up to \np[\%]{40} for the algorithm -execution time in asynchronous mode compared to the synchronous one with -a residual precision up to \np{E-11}. Such successful results open -perspectives on experimentations for running the algorithm on a -simulated large scale growing environment and with larger problem size. - -\LZK{Long\ldots} + +Synchronous iterative algorithms are often less scalable than asynchronous +iterative ones. Performing large scale experiments with different kind of +network parameters is not easy because with supercomputers such parameters are +fixed. So one solution consists in using simulations first in order to analyze +what parameters could influence or not the behaviors of an algorithm. In this +paper, we show that it is interesting to use SimGrid to simulate the behaviors +of asynchronous iterative algorithms. For that, we compare the behaviour of a +synchronous GMRES algorithm with an asynchronous multisplitting one with +simulations in which we choose some parameters. Both codes are real MPI +codes. Experiments allow us to see when the multisplitting algorithm can be more +efficient than the GMRES one to solve a 3D Poisson problem. + % no keywords for IEEE conferences % Keywords: Algorithm distributed iterative asynchronous simulation SimGrid @@ -111,7 +97,7 @@ problems raised by researchers on various scientific disciplines but also by in increasing complexity of these requested applications combined with a continuous increase of their sizes lead to write distributed and parallel algorithms requiring significant hardware resources (grid computing, clusters, broadband network, etc.) but also a non-negligible CPU execution time. We consider in this paper a class of highly efficient -parallel algorithms called \emph{numerical iterative algorithms} executed in a distributed environment. As their name +parallel algorithms called \emph{iterative algorithms} executed in a distributed environment. As their name suggests, these algorithms solve a given problem by successive iterations ($X_{n +1} = f(X_{n})$) from an initial value $X_{0}$ to find an approximate value $X^*$ of the solution with a very low residual error. Several well-known methods demonstrate the convergence of these algorithms~\cite{BT89,Bahi07}. @@ -127,34 +113,34 @@ at that time. Even if the number of iterations required before the convergence i synchronous case, AIAC algorithms can significantly reduce overall execution times by suppressing idle times due to synchronizations especially in a grid computing context (see~\cite{Bahi07} for more details). -Parallel numerical applications (synchronous or asynchronous) may have different -configuration and deployment requirements. Quantifying their resource -allocation policies and application scheduling algorithms in grid computing -environments under varying load, CPU power and network speeds is very costly, -very labor intensive and very time -consuming~\cite{Calheiros:2011:CTM:1951445.1951450}. The case of AIAC -algorithms is even more problematic since they are very sensible to the +Parallel (synchronous or asynchronous) applications may have different +configuration and deployment requirements. Quantifying their resource +allocation policies and application scheduling algorithms in grid computing +environments under varying load, CPU power and network speeds is very costly, +very labor intensive and very time +consuming~\cite{Calheiros:2011:CTM:1951445.1951450}. The case of AIAC +algorithms is even more problematic since they are very sensible to the execution environment context. For instance, variations in the network bandwidth -(intra and inter-clusters), in the number and the power of nodes, in the number -of clusters\dots{} can lead to very different number of iterations and so to -very different execution times. Then, it appears that the use of simulation -tools to explore various platform scenarios and to run large numbers of -experiments quickly can be very promising. In this way, the use of a simulation -environment to execute parallel iterative algorithms found some interests in -reducing the highly cost of access to computing resources: (1) for the -applications development life cycle and in code debugging (2) and in production -to get results in a reasonable execution time with a simulated infrastructure -not accessible with physical resources. Indeed, the launch of distributed -iterative asynchronous algorithms to solve a given problem on a large-scale -simulated environment challenges to find optimal configurations giving the best +(intra and inter-clusters), in the number and the power of nodes, in the number +of clusters\dots{} can lead to very different number of iterations and so to +very different execution times. Then, it appears that the use of simulation +tools to explore various platform scenarios and to run large numbers of +experiments quickly can be very promising. In this way, the use of a simulation +environment to execute parallel iterative algorithms found some interests in +reducing the highly cost of access to computing resources: (1) for the +applications development life cycle and in code debugging (2) and in production +to get results in a reasonable execution time with a simulated infrastructure +not accessible with physical resources. Indeed, the launch of distributed +iterative asynchronous algorithms to solve a given problem on a large-scale +simulated environment challenges to find optimal configurations giving the best results with a lowest residual error and in the best of execution time. To our knowledge, there is no existing work on the large-scale simulation of a -real AIAC application. The aim of this paper is twofold. First we give a first +real AIAC application. There are {\bf two contributions} in this paper. First we give a first approach of the simulation of AIAC algorithms using a simulation tool (i.e. the -SimGrid toolkit~\cite{SimGrid}). Second, we confirm the effectiveness of -asynchronous mode algorithms by comparing their performance with the synchronous -mode. More precisely, we had implemented a program for solving large +SimGrid toolkit~\cite{SimGrid}). Second, we confirm the effectiveness of the +asynchronous multisplitting algorithm by comparing its performance with the synchronous +GMRES. More precisely, we had implemented a program for solving large linear system of equations by numerical method GMRES (Generalized Minimal Residual) \cite{ref1}. We show, that with minor modifications of the initial MPI code, the SimGrid toolkit allows us to perform a test campaign of a