X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/hpcc2014.git/blobdiff_plain/8b34ea2b1bad6b4287c588f79dab03ee382b66a8..e421bb1806cde3dfe6a238e56119fbae3025fe9d:/hpcc.tex?ds=sidebyside
diff --git a/hpcc.tex b/hpcc.tex
index 207c560..94e9ee7 100644
--- a/hpcc.tex
+++ b/hpcc.tex
@@ -25,10 +25,12 @@
 \usepackage[textsize=footnotesize]{todonotes}
 \newcommand{\AG}[2][inline]{%
  \todo[color=green!50,#1]{\sffamily\textbf{AG:} #2}\xspace}
-\newcommand{\RC}[2][inline]{%
- \todo[color=red!10,#1]{\sffamily\textbf{RC:} #2}\xspace}
+\newcommand{\DL}[2][inline]{%
+ \todo[color=yellow!50,#1]{\sffamily\textbf{DL:} #2}\xspace}
 \newcommand{\LZK}[2][inline]{%
  \todo[color=blue!10,#1]{\sffamily\textbf{LZK:} #2}\xspace}
+\newcommand{\RC}[2][inline]{%
+ \todo[color=red!10,#1]{\sffamily\textbf{RC:} #2}\xspace}
 \algnewcommand\algorithmicinput{\textbf{Input:}}
 \algnewcommand\Input{\item[\algorithmicinput]}
@@ -45,24 +47,29 @@
 \author{%
  \IEEEauthorblockN{%
- Charles Emile Ramamonjisoa and
- David Laiymani and
- Arnaud Giersch and
- Lilia Ziane Khodja and
- Raphaël Couturier
+ Charles Emile Ramamonjisoa\IEEEauthorrefmark{1},
+ David Laiymani\IEEEauthorrefmark{1},
+ Arnaud Giersch\IEEEauthorrefmark{1},
+ Lilia Ziane Khodja\IEEEauthorrefmark{2} and
+ Raphaël Couturier\IEEEauthorrefmark{1}
+ }
+ \IEEEauthorblockA{\IEEEauthorrefmark{1}%
+ Femto-ST Institute -- DISC Department\\
+ Université de Franche-Comté,
+ IUT de Belfort-Montbéliard\\
+ 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France\\
+ Email: \email{{charles.ramamonjisoa,david.laiymani,arnaud.giersch,raphael.couturier}@univ-fcomte.fr}
  }
- \IEEEauthorblockA{%
- Femto-ST Institute - DISC Department\\
- Université de Franche-Comté\\
- Belfort\\
- Email: \email{{raphael.couturier,arnaud.giersch,david.laiymani,charles.ramamonjisoa}@univ-fcomte.fr}
+ \IEEEauthorblockA{\IEEEauthorrefmark{2}%
+ Inria Bordeaux Sud-Ouest\\
+ 200 avenue de la Vieille Tour, 33405 Talence cedex, France \\
+ Email: \email{lilia.ziane@inria.fr}
  }
 }
 \maketitle
 \RC{Ordre des autheurs pas définitif.}
-\LZK{Adresse de Lilia: Inria Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence Cedex, France \\ Email: lilia.ziane@inria.fr}
 \begin{abstract}
 ABSTRACT
@@ -84,15 +91,14 @@ from the current work, a simulated environment like Simgrid provides
 accurate results which are difficult or even impossible to obtain in a
 physical platform by exploiting the flexibility of the simulator on the
 computing units clusters and the network structure design. Our
-experimental outputs showed a saving of up to 40 \% for the algorithm
+experimental outputs showed a saving of up to \np[\%]{40} for the algorithm
 execution time in asynchronous mode compared to the synchronous one with
-a residual precision up to E-11. Such successful results open
+a residual precision up to \np{E-11}. Such successful results open
 perspectives on experimentations for running the algorithm on a
 simulated large scale growing environment and with larger problem size.
-Keywords : Algorithm distributed iterative asynchronous simulation
-simgrid
-
+% no keywords for IEEE conferences
+% Keywords: Algorithm distributed iterative asynchronous simulation simgrid
 \end{abstract}
 \section{Introduction}
@@ -165,16 +171,11 @@ our future work after the results.
 \section{The asynchronous iteration model}
-Décrire le modèle asynchrone. Je m'en charge (DL)
+\DL{Décrire le modèle asynchrone.
Je m'en charge}
 \section{SimGrid}
-Décrire SimGrid~\cite{casanova+legrand+quinson.2008.simgrid} (Arnaud)
-
-
-
-
-
+\AG{Décrire SimGrid~\cite{casanova+legrand+quinson.2008.simgrid} (Arnaud)}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -199,7 +200,7 @@ B_1 \\
 \vdots\\
 B_L
 \end{array} \right)\]
-in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$ $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, each of size $n_l$ and $\sum_{l} n_l=\sum_{m} n_m=n$.
+in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$ $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, of size $n_l$ each and $\sum_{l} n_l=\sum_{m} n_m=n$.
 The multisplitting method proceeds by iteration to solve in parallel the linear system on $L$ clusters of processors, in such a way each sub-system
 \begin{equation}
@@ -327,7 +328,10 @@ lat latency, \dots{}).
  \centering
  \caption{2 clusters X 50 nodes}
  \label{tab.cluster.2x50}
- \AG{Les images manquent dans le dépôt Git. Si ce sont vraiment des tableaux, utiliser un format vectoriel (eps ou pdf), et surtout pas de jpeg!}
+ \AG{Ces tableaux (\ref{tab.cluster.2x50}, \ref{tab.cluster.3x33} et
+ \ref{tab.cluster.3x67}) sont affreux. Utiliser un format vectoriel (eps ou
+ pdf) ou, mieux, les réécrire en \LaTeX{}. Réécrire les légendes proprement
+ également (\texttt{\textbackslash{}times} au lieu de \texttt{X} par ex.)}
  \includegraphics[width=209pt]{img1.jpg}
 \end{table}
@@ -335,7 +339,7 @@ lat latency, \dots{}).
  \centering
  \caption{3 clusters X 33 nodes}
  \label{tab.cluster.3x33}
- \AG{Le fichier manque.}
+ \AG{Refaire le tableau.}
  \includegraphics[width=209pt]{img2.jpg}
 \end{table}
@@ -343,7 +347,7 @@ lat latency, \dots{}).
  \centering
  \caption{3 clusters X 67 nodes}
  \label{tab.cluster.3x67}
- \AG{Le fichier manque.}
+ \AG{Refaire le tableau.}
 % \includegraphics[width=160pt]{img3.jpg}
  \includegraphics[scale=0.5]{img3.jpg}
 \end{table}
@@ -373,7 +377,7 @@ For the 3 clusters architecture including a total of 100 hosts, Table~\ref{tab.c
 that it was difficult to have a combination which gives an efficiency of
 asynchronous below \np[\%]{80}. Indeed, for a matrix size of 62 elements, equality
 between the performance of the two modes (synchronous and asynchronous) is
-achieved with an inter cluster of \np[Mbits/s]{10} and a latency of \np{E-1} ms. To
+achieved with an inter cluster of \np[Mbits/s]{10} and a latency of \np[ms]{E-1}. To
 challenge an efficiency by \np[\%]{78} with a matrix size of 100 points, it was
 necessary to degrade the inter cluster network bandwidth from 5 to 2 Mbit/s.
@@ -403,7 +407,7 @@ executing the algorithm in asynchronous mode.
 \setcounter{numberedCntD}{\theenumi}
 \end{enumerate}
 Our results have shown that in certain conditions, asynchronous mode is
-speeder up to 40 \% than executing the algorithm in synchronous mode
+speeder up to \np[\%]{40} than executing the algorithm in synchronous mode
 which is not negligible for solving complex practical problems with more
 and more increasing size.