X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/hpcc2014.git/blobdiff_plain/4842d834e57f085b1a7e7afb2546af0b3ad652fb..70f82cf5e87fbbebce020a9163579a602385a7ff:/hpcc.tex?ds=inline

diff --git a/hpcc.tex b/hpcc.tex
index 47d8102..a5f2768 100644
--- a/hpcc.tex
+++ b/hpcc.tex
@@ -45,26 +45,59 @@
 
 \author{%
   \IEEEauthorblockN{%
-    Charles Emile Ramamonjisoa and
-    David Laiymani and
-    Arnaud Giersch and
-    Lilia Ziane Khodja and
-    Raphaël Couturier
+    Charles Emile Ramamonjisoa\IEEEauthorrefmark{1},
+    David Laiymani\IEEEauthorrefmark{1},
+    Arnaud Giersch\IEEEauthorrefmark{1},
+    Lilia Ziane Khodja\IEEEauthorrefmark{2} and
+    Raphaël Couturier\IEEEauthorrefmark{1}
   }
-  \IEEEauthorblockA{%
-    Femto-ST Institute - DISC Department\\
-    Université de Franche-Comté\\
-    Belfort\\
-    Email: \email{{raphael.couturier,arnaud.giersch,david.laiymani,charles.ramamonjisoa}@univ-fcomte.fr}
+  \IEEEauthorblockA{\IEEEauthorrefmark{1}%
+    Femto-ST Institute -- DISC Department\\
+    Université de Franche-Comté,
+    IUT de Belfort-Montbéliard\\
+    19 avenue du Maréchal Juin, BP 527, 90016 Belfort cedex, France\\
+    Email: \email{{charles.ramamonjisoa,david.laiymani,arnaud.giersch,raphael.couturier}@univ-fcomte.fr}
+  }
+  \IEEEauthorblockA{\IEEEauthorrefmark{2}%
+    Inria Bordeaux Sud-Ouest\\
+    200 avenue de la Vieille Tour, 33405 Talence cedex, France \\
+    Email: \email{lilia.ziane@inria.fr}
   }
 }
 
 \maketitle
 
 \RC{Author order not final.}
-\LZK{Adresse de Lilia: Inria Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence Cedex, France \\ Email: lilia.ziane@inria.fr}
 \begin{abstract}
-The abstract goes here.
+In recent years, the scalability of increasingly complex algorithms
+deployed at large scale in distributed environments has been hampered
+by the limited capacity of physical computing resources. One solution is
+to run the program in a virtual environment that simulates a real
+architecture of interconnected computers. The results are convincing,
+and useful solutions are obtained with far fewer resources than on a
+real platform. However, challenges remain for the convergence and
+efficiency of the class of algorithms that concerns us here, namely
+parallel iterative numerical algorithms executed in asynchronous mode,
+especially at large scale. Such algorithms require a balance between
+computation and communication time during execution. Two factors
+determine the success of the experiments: the convergence of the
+iterative algorithm at large scale and the reduction of the execution
+time in asynchronous mode. In the present work, a simulated environment
+such as SimGrid provides accurate results that are difficult or even
+impossible to obtain on a physical platform, by exploiting the
+flexibility of the simulator in the design of the computing clusters and
+of the network structure. Our experiments showed a saving of up to
+\np[\%]{40} in execution time in asynchronous mode compared to the
+synchronous one, with a residual precision of up to \np{E-11}. These
+results open perspectives for running the algorithm on larger simulated
+environments and with larger problem sizes.
+
+Keywords: algorithm, distributed, iterative, asynchronous, simulation,
+SimGrid
+
 \end{abstract}
 
 \section{Introduction}
@@ -171,7 +204,7 @@ B_1 \\
 \vdots\\
 B_L
 \end{array} \right)\] 
-in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$ $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, each of size $n_l$ and $\sum_{l} n_l=\sum_{m} n_m=n$.
+in such a way that successive rows of matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster, where for all $l,m\in\{1,\ldots,L\}$ $A_{lm}$ is a rectangular block of $A$ of size $n_l\times n_m$, $X_l$ and $B_l$ are sub-vectors of $x$ and $b$, respectively, of size $n_l$ each and $\sum_{l} n_l=\sum_{m} n_m=n$.
 
 The multisplitting method proceeds by iteration to solve in parallel the linear system on $L$ clusters of processors, in such a way that each sub-system
 \begin{equation}
@@ -349,11 +382,41 @@ achieved with an inter cluster of \np[Mbits/s]{10} and a latency of \np{E-1} ms.
 challenge an efficiency by \np[\%]{78} with a matrix size of 100 points, it was
 necessary to degrade the inter cluster network bandwidth from 5 to 2 Mbit/s.
 
-A last attempt was made for a configuration of three clusters but more power
+A last attempt was made with a configuration of three more powerful clusters,
 with 200 nodes in total. Convergence with a speedup of \np[\%]{90} was obtained
 with a bandwidth of \np[Mbits/s]{1}, as shown in Table~\ref{tab.cluster.3x67}.
 
 \section{Conclusion}
+The experimental results of executing a parallel iterative algorithm in
+asynchronous mode on a simulated environment comprising a large number
+of virtual computers organized in interconnected clusters have been
+presented. Our work has demonstrated that using such a simulation tool
+allows us to reach the following three objectives:
+
+\newcounter{numberedCntD}
+\begin{enumerate}
+\item to have a flexible, configurable execution platform, which removes
+the difficulty of accessing scarce and heavily solicited physical
+resources;
+\item to ensure the convergence of the algorithm within a reasonable
+time and number of iterations;
+\item and finally, and more importantly, to find the right combination
+of cluster and network specifications that saves time when executing
+the algorithm in asynchronous mode.
+\setcounter{numberedCntD}{\theenumi}
+\end{enumerate}
+Our results have shown that, under certain conditions, asynchronous mode
+is up to \np[\%]{40} faster than synchronous mode, which is not
+negligible for solving complex practical problems of ever-increasing
+size.
+
+Several studies have already addressed the execution time of this class
+of algorithms. The work presented in this paper has demonstrated an
+original way of using a simulation tool to efficiently run an iterative
+parallel algorithm in asynchronous mode on a grid architecture.
 
 \section*{Acknowledgment}