X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/hpcc2014.git/blobdiff_plain/4842d834e57f085b1a7e7afb2546af0b3ad652fb..70f82cf5e87fbbebce020a9163579a602385a7ff:/hpcc.tex?ds=sidebyside

diff --git a/hpcc.tex b/hpcc.tex
index 47d8102..a5f2768 100644
--- a/hpcc.tex
+++ b/hpcc.tex
@@ -45,26 +45,59 @@
 \author{%
 \IEEEauthorblockN{%
- Charles Emile Ramamonjisoa and
- David Laiymani and
- Arnaud Giersch and
- Lilia Ziane Khodja and
- Raphaël Couturier
+ Charles Emile Ramamonjisoa\IEEEauthorrefmark{1},
+ David Laiymani\IEEEauthorrefmark{1},
+ Arnaud Giersch\IEEEauthorrefmark{1},
+ Lilia Ziane Khodja\IEEEauthorrefmark{2} and
+ Raphaël Couturier\IEEEauthorrefmark{1}
 }
- \IEEEauthorblockA{%
- Femto-ST Institute - DISC Department\\
- Université de Franche-Comté\\
- Belfort\\
- Email: \email{{raphael.couturier,arnaud.giersch,david.laiymani,charles.ramamonjisoa}@univ-fcomte.fr}
+ \IEEEauthorblockA{\IEEEauthorrefmark{1}%
+ Femto-ST Institute -- DISC Department\\
+ Université de Franche-Comté,
+ IUT de Belfort-Montbéliard\\
+ 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France\\
+ Email: \email{{charles.ramamonjisoa,david.laiymani,arnaud.giersch,raphael.couturier}@univ-fcomte.fr}
+ }
+ \IEEEauthorblockA{\IEEEauthorrefmark{2}%
+ Inria Bordeaux Sud-Ouest\\
+ 200 avenue de la Vieille Tour, 33405 Talence cedex, France \\
+ Email: \email{lilia.ziane@inria.fr}
 }
 }
 \maketitle
 \RC{Author order not final.}
-\LZK{Lilia's address: Inria Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence Cedex, France \\ Email: lilia.ziane@inria.fr}
 \begin{abstract}
-The abstract goes here.
+ABSTRACT
+
+In recent years, the scalability of large-scale implementations of
+increasingly complex algorithms in distributed environments has
+been hampered by the limited capacity of physical computing
+resources. One solution is to run the program in a virtual environment
+simulating a real architecture of interconnected computers.
The results are
+convincing, and useful solutions are obtained with far fewer resources
+than on a real platform. However, challenges remain for the convergence
+and efficiency of the class of algorithms that concerns us here, namely
+numerical parallel iterative algorithms executed in asynchronous mode,
+especially at large scale. Indeed, such an algorithm requires a
+balance between computation and communication time
+during the execution. Two important factors determine the success of the
+experimentation: the convergence of the iterative algorithm at large
+scale and the execution time reduction in asynchronous mode. Once again,
+in the current work, a simulated environment like SimGrid provides
+accurate results which are difficult or even impossible to obtain on a
+physical platform, by exploiting the flexibility of the simulator in the
+design of the computing clusters and the network structure. Our
+experiments showed a saving of up to 40 \% in the algorithm
+execution time in asynchronous mode compared to the synchronous one, with
+a residual precision of up to \np{E-11}. Such successful results open
+perspectives for experiments running the algorithm on a
+simulated, growing large-scale environment and with larger problem sizes.
+
+Keywords: distributed algorithm, iterative, asynchronous, simulation,
+SimGrid
+
 \end{abstract}
 \section{Introduction}
@@ -171,7 +204,7 @@
 B_1 \\ \vdots\\ B_L \end{array} \right)\]
-in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$ $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, each of size $n_l$ and $\sum_{l} n_l=\sum_{m} n_m=n$.
+in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$, $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, of size $n_l$ each and $\sum_{l} n_l=\sum_{m} n_m=n$.
 The multisplitting method proceeds by iteration to solve in parallel the linear system on $L$ clusters of processors, in such a way each sub-system
 \begin{equation}
@@ -349,11 +382,41 @@
 achieved with an inter cluster of \np[Mbits/s]{10} and a latency of \np{E-1} ms. challenge an efficiency by \np[\%]{78} with a matrix size of 100 points, it was necessary to degrade the inter cluster network bandwidth from 5 to 2 Mbit/s.
-A last attempt was made for a configuration of three clusters but more power
+A last attempt was made for a configuration of three clusters but more powerful
 with 200 nodes in total. The convergence with a speedup of \np[\%]{90} was obtained with a bandwidth of \np[Mbits/s]{1} as shown in Table~\ref{tab.cluster.3x67}.
 \section{Conclusion}
+CONCLUSION
+
+We have presented experimental results on executing a parallel iterative
+algorithm in asynchronous mode in an environment simulating a
+large-scale platform of virtual computers organized in interconnected clusters.
+Our work has demonstrated that using such a simulation tool allows us to
+reach the following three objectives:
+
+\newcounter{numberedCntD}
+\begin{enumerate}
+\item to have a flexible, configurable execution platform that avoids the
+difficulty of accessing very limited and heavily solicited physical
+resources;
+\item to ensure the convergence of the algorithm within a reasonable time
+and number of iterations;
+\item and finally, and more importantly, to find the right combination
+of cluster and network specifications that saves time when
+executing the algorithm in asynchronous mode.
+\setcounter{numberedCntD}{\theenumi}
+\end{enumerate}
+Our results have shown that, under certain conditions, asynchronous mode is
+up to 40 \% faster than executing the algorithm in synchronous mode,
+which is not negligible for solving complex practical problems of
+ever-increasing size.
+
+Several studies have already addressed the execution time of
+this class of algorithms. The work presented in this paper has
+demonstrated an original solution for optimizing the use of a simulation
+tool to efficiently run an iterative parallel algorithm in asynchronous
+mode on a grid architecture.
 \section*{Acknowledgment}