-synchronizations especially in a grid computing context (see \cite{bcvc06:ij} for more details).
-
-Parallel numerical applications (synchronous or asynchronous) may have different configuration and deployment
-requirements. Quantifying their resource allocation policies and application scheduling algorithms in
-grid computing environments under varying load, CPU power and network speeds is very costly, very labor intensive and very time
-consuming \cite{BuRaCa}. The case of AIAC algorithms is even more problematic since they are very sensible to the
-execution environment context. For instance, variations in the network bandwith (intra and inter- clusters), in the
-number and the power of nodes, in the number of clusters... can lead to very different number of iterations and so to
-very different execution times. Then, it appears that the use of simulation tools to explore various platform
-scenarios and to run large numbers of experiments quickly can be very promising. In this way, the use of a simulation
-environment to execute parallel iterative algorithms found some interests in reducing the highly cost of access to
-computing resources: (1) for the applications development life cycle and in code debugging (2) and in production to get
-results in a reasonable execution time with a simulated infrastructure not accessible with physical resources. Indeed,
-the launch of distributed iterative asynchronous algorithms to solve a given problem on a large-scale simulated
-environment challenges to find optimal configurations giving the best results with a lowest residual error and in the
-best of execution time.
-
-To our knowledge, there is no existing work on the large-scale simulation of a real AIAC application. The aim of this
-paper is twofold. First we give a first approach of the simulation of AIAC algorithms using a simulation tool (i.e. the
-SimGrid toolkit \cite{SimGrid}). Second, we confirm the effectiveness of asynchronous mode algorithms by comparing their
-performance with the synchronous mode. More precisely, we had implemented a program for solving large non-symmetric
-linear system of equations by numerical method GMRES (Generalized Minimal Residual) []. We show, that with minor
-modifications of the initial MPI code, the SimGrid toolkit allows us to perform a test campain of a real AIAC
-application on different computing architectures. The simulated results we obtained are in line with real results
-exposed in ??. SimGrid had allowed us to launch the application from a modest computing infrastructure by simulating
-different distributed architectures composed by clusters nodes interconnected by variable speed networks. It has been
-permitted to show With selected parameters on the network platforms (bandwidth, latency of inter cluster network) and
-on the clusters architecture (number, capacity calculation power) in the simulated environment, the experimental results
-have demonstrated not only the algorithm convergence within a reasonable time compared with the physical environment
-performance, but also a time saving of up to \np[\%]{40} in asynchronous mode.
+synchronizations especially in a grid computing context (see~\cite{Bahi07} for more details).
+
+Parallel numerical applications (synchronous or asynchronous) may have different
+configuration and deployment requirements. Evaluating their resource
+allocation policies and application scheduling algorithms in grid computing
+environments under varying load, CPU power and network speeds is very costly,
+labor intensive and time
+consuming~\cite{Calheiros:2011:CTM:1951445.1951450}. The case of AIAC
+algorithms is even more problematic since they are very sensitive to the
+execution environment. For instance, variations in the network bandwidth
+(intra- and inter-cluster), in the number and the power of nodes, or in the number
+of clusters can lead to very different numbers of iterations and hence to
+very different execution times. It therefore appears that the use of simulation
+tools to explore various platform scenarios and to run large numbers of
+experiments quickly can be very promising. Using a simulation
+environment to execute parallel iterative algorithms is of interest because it
+reduces the high cost of access to computing resources: (1) during the
+application development life cycle and code debugging, and (2) in production,
+to get results in a reasonable execution time on a simulated infrastructure
+that is not accessible as physical resources. Indeed, when launching distributed
+iterative asynchronous algorithms to solve a given problem on a large-scale
+simulated environment, the challenge is to find the optimal configurations,
+giving the lowest residual error in the shortest execution time.
+
+To our knowledge, there is no existing work on the large-scale simulation of a
+real AIAC application. The aim of this paper is twofold. First, we give a first
+approach to the simulation of AIAC algorithms using a simulation tool (i.e.\ the
+SimGrid toolkit~\cite{SimGrid}). Second, we confirm the effectiveness of
+asynchronous mode algorithms by comparing their performance with that of the
+synchronous mode. More precisely, we have implemented a program for solving large
+non-symmetric linear systems of equations with the GMRES (Generalized
+Minimal Residual) numerical method []\AG[]{[]?}. We show that, with minor modifications of the
+initial MPI code, the SimGrid toolkit allows us to perform a test campaign of a
+real AIAC application on different computing architectures. The simulated
+results we obtained are in line with the real results presented in ??\AG[]{??}.
+SimGrid has allowed us to launch the application from a modest computing
+infrastructure by simulating different distributed architectures composed of
+clusters of nodes interconnected by variable speed networks. With selected
+parameters on the network platforms (bandwidth and latency of the inter-cluster
+network) and on the cluster architecture (number, computing power)
+in the simulated environment, the experimental results have demonstrated not
+only the algorithm convergence within a reasonable time compared with the
+physical environment performance, but also a time saving of up to \np[\%]{40} in
+asynchronous mode.