X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/hpcc2014.git/blobdiff_plain/74faf54391ee09cf05d205964b861e91ee559d74..a9757b4dc9aad25ed2dc6884c0bd638a0f101c31:/hpcc.tex diff --git a/hpcc.tex b/hpcc.tex index 1dc732c..5ab9faf 100644 --- a/hpcc.tex +++ b/hpcc.tex @@ -4,7 +4,7 @@ \usepackage[utf8]{inputenc} \usepackage{amsfonts,amssymb} \usepackage{amsmath} -\usepackage{algorithm} +%\usepackage{algorithm} \usepackage{algpseudocode} %\usepackage{amsthm} \usepackage{graphicx} @@ -169,16 +169,75 @@ our future work after the results. \section{The asynchronous iteration model} -\DL{Décrire le modèle asynchrone. Je m'en charge} +As exposed in the introduction, parallel iterative methods are now +widely used in many scientific domains. They can be classified in three main classes +depending on how iterations and communications are managed (for more +details readers can refer to \cite{bcvc02:ip}). In the +\textit{Synchronous Iterations - Synchronous Communications (SISC)} +model data are exchanged at the end of each iteration. All the +processors must begin the same iteration at the same time and +important idle times on processors are generated. The +\textit{Synchronous Iterations - Asynchronous Communications (SIAC)} +model can be compared to the previous one except that data required on +another processor are sent asynchronously i.e. without stopping +current computations. This technique allows to partially overlap +communications by computations but unfortunately, the overlapping is +only partial and important idle times remain. It is clear that, in a +grid computing context, where the number of computational nodes is large, +heterogeneous and widely distributed, the idle times generated by +synchronizations are very penalizing. One way to overcome this problem +is to use the \textit{Asynchronous Iterations - Asynchronous + Communications (AIAC)} model. Here, local computations do not need +to wait for required data. Processors can then perform their +iterations with the data present at that time. Figure \ref{fig:aiac} +illustrates this model where the grey blocks represent the computation +phases, the white spaces the idle times and the arrows the +communications. With this algorithmic model, the number of iterations +required before the convergence is generally greater than for the two +former classes. But, and as detailed in \cite{bcvc06:ij}, AIAC +algorithms can significantly reduce overall execution times by +suppressing idle times due to synchronizations especially in a grid +computing context. + +\begin{figure}[htbp] + \centering + \includegraphics[width=8cm]{AIAC.pdf} + \caption{The Asynchronous Iterations - Asynchronous Communications model } + \label{fig:aiac} +\end{figure} + + + \section{SimGrid} -\AG{Décrire SimGrid~\cite{casanova+legrand+quinson.2008.simgrid,SimGrid} (Arnaud)} +SimGrid~\cite{casanova+legrand+quinson.2008.simgrid,SimGrid} is a simulation +framework to sudy the behavior of large-scale distributed systems. As its name +says, it emanates from the grid computing community, but is nowadays used to +study grids, clouds, HPC or peer-to-peer systems. +%- open source, developped since 1999, one of the major solution in the field +% +SimGrid provides several programming interfaces: MSG to simulate Concurrent +Sequential Processes, SimDAG to simulate DAGs of (parallel) tasks, and SMPI to +run real applications written in MPI~\cite{MPI}. Apart from the native C +interface, SimGrid provides bindings for the C++, Java, Lua and Ruby programming +languages. The SMPI interface supports applications written in C or Fortran, +with little or no modifications. +%- implements most of MPI-2 \cite{ref} standard [CHECK] + +%%% explain simulation +%- simulated processes folded in one real process +%- simulates interactions on the network, fluid model +%- able to skip long-lasting computations +%- traces + visu? -%%% brief history? -%%% programming interfaces: MSG, SimDAG, SMPI %%% platforms -%%% validation? +%- describe resources and their interconnection, with their properties +%- XML files + +%%% validation + refs + +\AG{Décrire SimGrid~\cite{casanova+legrand+quinson.2008.simgrid,SimGrid} (Arnaud)} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Simulation of the multisplitting method} @@ -216,8 +275,9 @@ Y_l = B_l - \displaystyle\sum_{\substack{m=1\\ m\neq l}}^{L}A_{lm}X_m \end{equation} is solved independently by a cluster and communications are required to update the right-hand side sub-vector $Y_l$, such that the sub-vectors $X_m$ represent the data dependencies between the clusters. As each sub-system (\ref{eq:4.1}) is solved in parallel by a cluster of processors, our multisplitting method uses an iterative method as an inner solver which is easier to parallelize and more scalable than a direct method. In this work, we use the parallel algorithm of GMRES method~\cite{ref1} which is one of the most used iterative method by many researchers. -\begin{algorithm} -\caption{A multisplitting solver with GMRES method} +\begin{figure} + %%% IEEE instructions forbid to use an algorithm environment here, use figure + %%% instead \begin{algorithmic}[1] \Input $A_l$ (sparse sub-matrix), $B_l$ (right-hand side sub-vector) \Output $X_l$ (solution sub-vector)\vspace{0.2cm} @@ -238,10 +298,23 @@ is solved independently by a cluster and communications are required to update t \State \Return $X_l^k$ \EndFunction \end{algorithmic} +\caption{A multisplitting solver with GMRES method} \label{algo:01} -\end{algorithm} +\end{figure} -Algorithm~\ref{algo:01} shows the main key points of the multisplitting method to solve a large sparse linear system. This algorithm is based on an outer-inner iteration method where the parallel synchronous GMRES method is used to solve the inner iteration. It is executed in parallel by each cluster of processors. For all $l,m\in\{1,\ldots,L\}$, the matrices and vectors with the subscript $l$ represent the local data for cluster $l$, while $\{A_{lm}\}_{m\neq l}$ are off-diagonal matrices of sparse matrix $A$ and $\{X_m\}_{m\neq l}$ contain vector elements of solution $x$ shared with neighboring clusters. At every outer iteration $k$, asynchronous communications are performed between processors of the local cluster and those of distant clusters (lines $6$ and $7$ in Algorithm~\ref{algo:01}). The shared vector elements of the solution $x$ are exchanged by message passing using MPI non-blocking communication routines. +Algorithm on Figure~\ref{algo:01} shows the main key points of the +multisplitting method to solve a large sparse linear system. This algorithm is +based on an outer-inner iteration method where the parallel synchronous GMRES +method is used to solve the inner iteration. It is executed in parallel by each +cluster of processors. For all $l,m\in\{1,\ldots,L\}$, the matrices and vectors +with the subscript $l$ represent the local data for cluster $l$, while +$\{A_{lm}\}_{m\neq l}$ are off-diagonal matrices of sparse matrix $A$ and +$\{X_m\}_{m\neq l}$ contain vector elements of solution $x$ shared with +neighboring clusters. At every outer iteration $k$, asynchronous communications +are performed between processors of the local cluster and those of distant +clusters (lines $6$ and $7$ in Figure~\ref{algo:01}). The shared vector +elements of the solution $x$ are exchanged by message passing using MPI +non-blocking communication routines. \begin{figure} \centering