\usepackage[textsize=footnotesize]{todonotes}
\newcommand{\AG}[2][inline]{%
\todo[color=green!50,#1]{\sffamily\textbf{AG:} #2}\xspace}
-\newcommand{\RC}[2][inline]{%
- \todo[color=red!10,#1]{\sffamily\textbf{RC:} #2}\xspace}
+\newcommand{\DL}[2][inline]{%
+ \todo[color=yellow!50,#1]{\sffamily\textbf{DL:} #2}\xspace}
\newcommand{\LZK}[2][inline]{%
\todo[color=blue!10,#1]{\sffamily\textbf{LZK:} #2}\xspace}
+\newcommand{\RC}[2][inline]{%
+ \todo[color=red!10,#1]{\sffamily\textbf{RC:} #2}\xspace}
\algnewcommand\algorithmicinput{\textbf{Input:}}
\algnewcommand\Input{\item[\algorithmicinput]}
\author{%
\IEEEauthorblockN{%
- Charles Emile Ramamonjisoa and
- David Laiymani and
- Arnaud Giersch and
- Lilia Ziane Khodja and
- Raphaël Couturier
+ Charles Emile Ramamonjisoa\IEEEauthorrefmark{1},
+ David Laiymani\IEEEauthorrefmark{1},
+ Arnaud Giersch\IEEEauthorrefmark{1},
+ Lilia Ziane Khodja\IEEEauthorrefmark{2} and
+ Raphaël Couturier\IEEEauthorrefmark{1}
+ }
+ \IEEEauthorblockA{\IEEEauthorrefmark{1}%
+ Femto-ST Institute -- DISC Department\\
+ Université de Franche-Comté,
+ IUT de Belfort-Montbéliard\\
+ 19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France\\
+ Email: \email{{charles.ramamonjisoa,david.laiymani,arnaud.giersch,raphael.couturier}@univ-fcomte.fr}
}
- \IEEEauthorblockA{%
- Femto-ST Institute - DISC Department\\
- Université de Franche-Comté\\
- Belfort\\
- Email: \email{{raphael.couturier,arnaud.giersch,david.laiymani,charles.ramamonjisoa}@univ-fcomte.fr}
+ \IEEEauthorblockA{\IEEEauthorrefmark{2}%
+ Inria Bordeaux Sud-Ouest\\
+ 200 avenue de la Vieille Tour, 33405 Talence cedex, France \\
+ Email: \email{lilia.ziane@inria.fr}
}
}
\maketitle
\RC{Author order is not final.}
-\LZK{Adresse de Lilia: Inria Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence Cedex, France \\ Email: lilia.ziane@inria.fr}
\begin{abstract}
-The abstract goes here.
+In recent years, the scalability of large-scale distributed
+implementations of increasingly complex algorithms has been hampered by
+the limited capacity of physical computing resources. One solution is to
+run the program in a virtual environment that simulates a real
+architecture of interconnected computers. The results are convincing, and
+useful solutions are obtained with far fewer resources than on a real
+platform. However, challenges remain for the convergence and efficiency
+of the class of algorithms that concerns us here, namely numerical
+parallel iterative algorithms executed in asynchronous mode, especially
+at large scale. Such algorithms require a balance between computation and
+communication time during execution. Two important factors determine the
+success of the experimentation: the convergence of the iterative
+algorithm at large scale and the reduction of the execution time in
+asynchronous mode. In the present work, a simulated environment such as
+SimGrid provides accurate results that are difficult or even impossible
+to obtain on a physical platform, by exploiting the flexibility of the
+simulator in the design of the computing clusters and of the network
+structure. Our experiments showed a saving of up to \np[\%]{40} in the
+execution time of the algorithm in asynchronous mode compared to the
+synchronous one, with a residual precision of up to \np{E-11}. These
+results open perspectives for running the algorithm in a larger simulated
+environment and with larger problem sizes.
+
+% no keywords for IEEE conferences
+% Keywords: Algorithm distributed iterative asynchronous simulation simgrid
\end{abstract}
\section{Introduction}
\section{The asynchronous iteration model}
-Décrire le modèle asynchrone. Je m'en charge (DL)
+\DL{Describe the asynchronous model. I will take care of it.}
\section{SimGrid}
-Décrire SimGrid~\cite{casanova+legrand+quinson.2008.simgrid} (Arnaud)
-
-
-
-
-
+\AG{Describe SimGrid~\cite{casanova+legrand+quinson.2008.simgrid} (Arnaud)}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\vdots\\
B_L
\end{array} \right)\]
-in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$ $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, each of size $n_l$ and $\sum_{l} n_l=\sum_{m} n_m=n$.
+in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where, for all $l,m\in\{1,\ldots,L\}$, $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, of size $n_l$ each, and $\sum_{l} n_l=\sum_{m} n_m=n$.
The multisplitting method proceeds by iteration to solve the linear system in parallel on $L$ clusters of processors, in such a way that each sub-system
\begin{equation}
\centering
\caption{2 clusters $\times$ 50 nodes}
\label{tab.cluster.2x50}
- \AG{Les images manquent dans le dépôt Git. Si ce sont vraiment des tableaux, utiliser un format vectoriel (eps ou pdf), et surtout pas de jpeg!}
+ \AG{These tables (\ref{tab.cluster.2x50}, \ref{tab.cluster.3x33} and
+ \ref{tab.cluster.3x67}) are awful. Use a vector format (eps or
+ pdf) or, better, rewrite them in \LaTeX{}. Also rewrite the captions
+ properly (e.g.\ \texttt{\textbackslash{}times} instead of \texttt{X}).}
\includegraphics[width=209pt]{img1.jpg}
\end{table}
\centering
\caption{3 clusters $\times$ 33 nodes}
\label{tab.cluster.3x33}
- \AG{Le fichier manque.}
+ \AG{Redo the table.}
\includegraphics[width=209pt]{img2.jpg}
\end{table}
\centering
\caption{3 clusters $\times$ 67 nodes}
\label{tab.cluster.3x67}
- \AG{Le fichier manque.}
+ \AG{Redo the table.}
% \includegraphics[width=160pt]{img3.jpg}
\includegraphics[scale=0.5]{img3.jpg}
\end{table}
that it was difficult to find a combination that gives an efficiency of the
asynchronous mode below \np[\%]{80}. Indeed, for a matrix size of 62 elements, equality
between the performance of the two modes (synchronous and asynchronous) is
-achieved with an inter cluster of \np[Mbits/s]{10} and a latency of \np{E-1} ms. To
+achieved with an inter-cluster bandwidth of \np[Mbits/s]{10} and a latency of \np[ms]{E-1}. To
reach an efficiency of \np[\%]{78} with a matrix size of 100 points, it was
necessary to degrade the inter-cluster network bandwidth from 5 to \np[Mbits/s]{2}.
-A last attempt was made for a configuration of three clusters but more power
+A last attempt was made with a more powerful configuration of three clusters
with 200 nodes in total. Convergence with a speedup of \np[\%]{90} was obtained
with a bandwidth of \np[Mbits/s]{1}, as shown in Table~\ref{tab.cluster.3x67}.
\section{Conclusion}
+The experimental results of executing a parallel iterative algorithm in
+asynchronous mode on a simulated large-scale environment of virtual
+computers organized in interconnected clusters have been presented.
+Our work has demonstrated that using such a simulation tool allows us to
+reach the following three objectives:
+
+\begin{enumerate}
+\item to have a flexible, configurable execution platform, which removes
+the difficulty of accessing scarce and heavily solicited physical
+resources;
+\item to ensure the convergence of the algorithm within a reasonable time
+and number of iterations;
+\item and finally, and more importantly, to find the right combination
+of cluster and network specifications that saves time when
+executing the algorithm in asynchronous mode.
+\end{enumerate}
+Our results have shown that, under certain conditions, the asynchronous
+mode is up to \np[\%]{40} faster than the synchronous mode, which is not
+negligible when solving complex practical problems of ever-increasing
+size.
+
+Several studies have already addressed the execution time of
+this class of algorithms. The work presented in this paper has
+demonstrated an original solution to optimize the use of a simulation
+tool to efficiently run an iterative parallel algorithm in asynchronous
+mode on a grid architecture.
\section*{Acknowledgment}