X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/Krylov_multi.git/blobdiff_plain/0d162d0a192a8e0e159e831ba451862a05260ee5..774e3ae76fae2fcab8efe8bf6bc2b61c0da65e09:/krylov_multi.tex diff --git a/krylov_multi.tex b/krylov_multi.tex index ea6cfe9..8685bd2 100644 --- a/krylov_multi.tex +++ b/krylov_multi.tex @@ -61,8 +61,13 @@ solvers. However, most of the good preconditionners are not sclalable when thousands of cores are used. -A completer... -On ne peut pas parler de tout...\\ +Traditionnal iterative solvers have global synchronizations that penalize the +scalability. Two possible solutions consists either in using asynchronous +iterative methods~\cite{ref18} or to use multisplitting algorithms. In this +paper, we will reconsider the use of a multisplitting method. In opposition to +traditionnal multisplitting method that suffer from slow convergence, as +proposed in~\cite{huang1993krylov}, the use of a minimization process can +drastically improve the convergence. @@ -201,7 +206,7 @@ is solved independently by a cluster of processors and communication are required to update the right-hand side vectors $Y_l$, such that the vectors $X_i$ represent the data dependencies between the clusters. In this work, we use the parallel GMRES method~\cite{ref34} -as an inner iteration method for solving the +as an inner iteration method to solve the sub-systems~(\ref{sec03:eq03}). It is a well-known iterative method which gives good performances for solving sparse linear systems in parallel on a cluster of processors. @@ -229,7 +234,10 @@ S=\{x^1,x^2,\ldots,x^s\},~s\leq n, \label{sec03:eq04} \end{equation} where for $j\in\{1,\ldots,s\}$, $x^j=[X_1^j,\ldots,X_L^j]$ is a -solution of the global linear system. The advantage of such a Krylov subspace is that we need neither an orthogonal basis nor synchronizations between the different clusters to generate this basis. +solution of the global linear system. The advantage of such a Krylov +subspace is that we need neither an orthogonal basis nor +synchronizations between the different clusters to generate this +basis. The multisplitting method is periodically restarted every $s$ iterations with a new initial guess $\tilde{x}=S\alpha$ which @@ -251,12 +259,12 @@ which is associated with the least squares problem \text{minimize}~\|b-R\alpha\|_2, \label{sec03:eq07} \end{equation} -where $R^T$ denotes the transpose of the matrix $R$. Since $R$ -(i.e. $AS$) and $b$ are split among $L$ clusters, the symmetric -positive definite system~(\ref{sec03:eq06}) is solved in -parallel. Thus, an iterative method would be more appropriate than a -direct one for solving this system. We use the parallel conjugate -gradient method for the normal equations CGNR~\cite{S96,refCGNR}. +where $R^T$ denotes the transpose of the matrix $R$. Since $R$ (i.e. +$AS$) and $b$ are split among $L$ clusters, the symmetric positive +definite system~(\ref{sec03:eq06}) is solved in parallel. Thus, an +iterative method would be more appropriate than a direct one to solve +this system. We use the parallel conjugate gradient method for the +normal equations CGNR~\cite{S96,refCGNR}. \begin{algorithm}[!t] \caption{A two-stage linear solver with inner iteration GMRES method} @@ -296,16 +304,16 @@ gradient method for the normal equations CGNR~\cite{S96,refCGNR}. \label{algo:01} \end{algorithm} -The main key points of the multisplitting method for solving large -sparse linear systems are given in Algorithm~\ref{algo:01}. This +The main key points of the multisplitting method to solve a large +sparse linear system are given in Algorithm~\ref{algo:01}. This algorithm is based on a two-stage method with a minimization using the GMRES iterative method as an inner solver. It is executed in parallel by each cluster of processors. The matrices and vectors with the subscript $l$ represent the local data for the cluster $l$, where $l\in\{1,\ldots,L\}$. The two-stage solver uses two different parallel -iterative algorithms: the GMRES method for solving each splitting on a +iterative algorithms: the GMRES method to solve each splitting on a cluster of processors, and the CGNR method executed in parallel by all -clusters for minimizing the function error over the Krylov subspace +clusters to minimize the function error over the Krylov subspace spanned by $S$. The algorithm requires two global synchronizations between the $L$ clusters. The first one is performed at line~$12$ in Algorithm~\ref{algo:01} to exchange the local values of the vector