X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/Krylov_multi.git/blobdiff_plain/3933b25f95c0d8f65bfd2778425ac43c3ed67f20..71d655a4cb613d91906ca8c723f61a78cae26b0e:/krylov_multi.tex diff --git a/krylov_multi.tex b/krylov_multi.tex index 48d9e2e..1b8b43e 100644 --- a/krylov_multi.tex +++ b/krylov_multi.tex @@ -145,12 +145,12 @@ solvers on exascale architecture with hunders of thousands of cores. \section{A two-stage method with a minimization} -Let $Ax=b$ be a given sparse and large linear system of $n$ equations -to solve in parallel on $L$ clusters, physically adjacent or geographically -distant, where $A\in\mathbb{R}^{n\times n}$ is a square and nonsingular -matrix, $x\in\mathbb{R}^{n}$ is the solution vector and $b\in\mathbb{R}^{n}$ -is the right-hand side vector. The multisplitting of this linear system -is defined as follows: +Let $Ax=b$ be a given sparse and large linear system of $n$ equations +to solve in parallel on $L$ clusters, physically adjacent or +geographically distant, where $A\in\mathbb{R}^{n\times n}$ is a square +and nonsingular matrix, $x\in\mathbb{R}^{n}$ is the solution vector +and $b\in\mathbb{R}^{n}$ is the right-hand side vector. The +multisplitting of this linear system is defined as follows: \begin{equation} \left\{ \begin{array}{lll} @@ -161,21 +161,24 @@ b & = & [B_{1}, \ldots, B_{L}] \right. \label{sec03:eq01} \end{equation} -where for $l\in\{1,\ldots,L\}$ $A_l$ is a rectangular block of size $n_l\times n$ -and $X_l$ and $B_l$ are sub-vectors of size $n_l$, such that $\sum_ln_l=n$. In this -case, we use a row-by-row splitting without overlapping in such a way that successive -rows of the sparse matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster. -So, the multisplitting format of the linear system is defined as follows: +where for $l\in\{1,\ldots,L\}$, $A_l$ is a rectangular block of size +$n_l\times n$ and $X_l$ and $B_l$ are sub-vectors of size $n_l$, such +that $\sum_ln_l=n$. In this case, we use a row-by-row splitting +without overlapping in such a way that successive rows of the sparse +matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster. +So, the multisplitting format of the linear system is defined as +follows: \begin{equation} \forall l\in\{1,\ldots,L\} \mbox{,~} \displaystyle\sum_{i=1}^{l-1}A_{li}X_i + A_{ll}X_l + \displaystyle\sum_{i=l+1}^{L}A_{li}X_i = B_l, \label{sec03:eq02} \end{equation} -where $A_{li}$ is a block of size $n_l\times n_i$ of the rectangular matrix $A_l$, $X_i\neq X_l$ -is a sub-vector of size $n_i$ of the solution vector $x$ and $\sum_{il}n_i+n_l=n$, -for all $i\in\{1,\ldots,l-1,l+1,\ldots,L\}$. +where $A_{li}$ is a block of size $n_l\times n_i$ of the rectangular +matrix $A_l$, $X_i\neq X_l$ is a sub-vector of size $n_i$ of the +solution vector $x$ and $\sum_{il}n_i+n_l=n$, for all +$i\in\{1,\ldots,l-1,l+1,\ldots,L\}$. -The multisplitting method proceeds by iteration for solving the linear system in such a -way each sub-system +The multisplitting method proceeds by iteration for solving the linear +system in such a way each sub-system \begin{equation} \left\{ \begin{array}{l} @@ -185,14 +188,69 @@ Y_l = B_l - \displaystyle\sum_{i=1,i\neq l}^{L}A_{li}X_i, \right. \label{sec03:eq03} \end{equation} -is solved independently by a cluster of processors and communication are required to -update the right-hand side vectors $Y_l$, such that the vectors $X_i$ represent the data -dependencies between the clusters. In this work, we use the GMRES method as an inner -iteration method for solving the sub-systems~(\ref{sec03:eq03}). It is a well-known -iterative method which gives good performances for solving sparse linear systems in -parallel on a cluster of processors. - - +is solved independently by a cluster of processors and communication +are required to update the right-hand side vectors $Y_l$, such that +the vectors $X_i$ represent the data dependencies between the +clusters. In this work, we use the GMRES method as an inner iteration +method for solving the sub-systems~(\ref{sec03:eq03}). It is a +well-known iterative method which gives good performances for solving +sparse linear systems in parallel on a cluster of processors. + +It should be noted that the convergence of the inner iterative solver +for the different linear sub-systems~(\ref{sec03:eq03}) does not +necessarily involve the convergence of the multisplitting method. It +strongly depends on the properties of the sparse linear system to be +solved and the computing +environment~\cite{o1985multi,ref18}. Furthermore, the multisplitting +of the linear system among several clusters of processors increases +the spectral radius of the iteration matrix, thereby slowing the +convergence. In this paper, we based on the work presented +in~\cite{huang1993krylov} to increase the convergence and improve the +scalability of the multisplitting methods. + +In order to accelerate the convergence, we implement the outer +iteration of the multisplitting solver as a Krylov subspace method +which minimizes some error function over a Krylov subspace~\cite{S96}. +The Krylov space of the method that we used is spanned by a basis +composed of successive solutions issued from solving the $L$ +splittings~(\ref{sec03:eq03}) +\begin{equation} +S=\{x^1,x^2,\ldots,x^s\},~s\leq n, +\label{sec03:eq04} +\end{equation} +where for $k\in\{1,\ldots,s\}$, $x^k=[X_1^k,\ldots,X_L^k]$ is a +solution of the global linear system.%The advantage such a method is that the Krylov subspace does not need to be spanned by an orthogonal basis. +The advantage of such a Krylov subspace is that we need neither an +orthogonal basis nor synchronizations between the different clusters +to generate this basis. + +The multisplitting method is periodically restarted every $s$ iterations +with a new initial guess $\tilde{x}=S\alpha$ which minimizes the error +function $\|b-Ax\|_2$ over the Krylov subspace spanned by the vectors of $S$. +So, $\alpha$ is defined as the solution of the large overdetermined linear system +\begin{equation} +B\alpha=b, +\label{sec03:eq05} +\end{equation} +where $B=AS$ is a dense rectangular matrix of size $n\times s$ and $s\ll n$. This leads +us to solve the system of normal equations +\begin{equation} +B^TB\alpha=B^Tb, +\label{sec03:eq06} +\end{equation} +which is associated with the least squares problem +\begin{equation} +\text{minimize}~\|b-B\alpha\|_2, +\label{sec03:eq07} +\end{equation} +where $B^T$ denotes the transpose of the matrix $B$. Since $B$ (i.e. $AS$) and +$b$ are split among $L$ clusters, the symmetric positive definite system~(\ref{sec03:eq06}) +is solved in parallel. Thus, an iterative method would be more appropriate than +a direct one for solving this system. We use the parallel conjugate gradient +method for the normal equations CGNR~\cite{S96,refCGNR}. + +%%% Ecrire l'algorithme(s) +%%% Parler des synchronisations entre proc et clusters