11-01-2014 V1

[Krylov_multi.git] / krylov_multi.tex
diff --git a/krylov_multi.tex b/krylov_multi.tex

index 5a97c5f661129fa417082716709fdfdaf879072d..1b8b43e88a023a93d971a4113ee38a6d75c81452 100644 (file)
--- a/krylov_multi.tex
+++ b/krylov_multi.tex
@@ -145,12 +145,12 @@ solvers on exascale architecture with hunders of thousands of cores.
  
  
  \section{A two-stage method with a minimization}
-Let $Ax=b$ be a given sparse and large linear system of $n$ equations
-to solve in parallel on $L$ clusters, physically adjacent or geographically
-distant, where $A\in\mathbb{R}^{n\times n}$ is a square and nonsingular
-matrix, $x\in\mathbb{R}^{n}$ is the solution vector and $b\in\mathbb{R}^{n}$
-is the right-hand side vector. The multisplitting of this linear system 
-is defined as follows:
+Let $Ax=b$ be a given sparse  and large linear system of $n$ equations
+to  solve  in  parallel   on  $L$  clusters,  physically  adjacent  or
+geographically distant, where $A\in\mathbb{R}^{n\times n}$ is a square
+and  nonsingular matrix, $x\in\mathbb{R}^{n}$  is the  solution vector
+and   $b\in\mathbb{R}^{n}$  is   the  right-hand   side   vector.  The
+multisplitting of this linear system is defined as follows:
  \begin{equation}
  \left\{
  \begin{array}{lll}
@@ -161,21 +161,24 @@ b & = & [B_{1}, \ldots, B_{L}]
  \right.
  \label{sec03:eq01}
  \end{equation}  
-where for $l\in\{1,\ldots,L\}$, $A_l$ is a rectangular block of size $n_l\times n$
-and $X_l$ and $B_l$ are sub-vectors of size $n_l$, such that $\sum_ln_l=n$. In this
-case, we use a row-by-row splitting without overlapping in such a way that successive
-rows of the sparse matrix $A$ and both vectors $x$ and $b$ are assigned to one cluster.
-So, the multisplitting format of the linear system is defined as follows:
+where for  $l\in\{1,\ldots,L\}$, $A_l$ is a rectangular  block of size
+$n_l\times n$ and $X_l$ and  $B_l$ are sub-vectors of size $n_l$, such
+that  $\sum_ln_l=n$.  In this  case,  we  use  a row-by-row  splitting
+without overlapping in  such a way that successive  rows of the sparse
+matrix $A$ and  both vectors $x$ and $b$ are  assigned to one cluster.
+So,  the multisplitting  format of  the  linear system  is defined  as
+follows:
  \begin{equation}
  \forall l\in\{1,\ldots,L\} \mbox{,~} \displaystyle\sum_{i=1}^{l-1}A_{li}X_i + A_{ll}X_l + \displaystyle\sum_{i=l+1}^{L}A_{li}X_i = B_l, 
  \label{sec03:eq02}
  \end{equation} 
-where $A_{li}$ is a block of size $n_l\times n_i$ of the rectangular matrix $A_l$, $X_i\neq X_l$
-is a sub-vector of size $n_i$ of the solution vector $x$ and $\sum_{i<l}n_i+\sum_{i>l}n_i+n_l=n$,
-for all $i\in\{1,\ldots,l-1,l+1,\ldots,L\}$. 
+where $A_{li}$ is  a block of size $n_l\times  n_i$ of the rectangular
+matrix  $A_l$, $X_i\neq  X_l$ is  a sub-vector  of size  $n_i$  of the
+solution vector  $x$ and $\sum_{i<l}n_i+\sum_{i>l}n_i+n_l=n$,  for all
+$i\in\{1,\ldots,l-1,l+1,\ldots,L\}$.
  
-The multisplitting method proceeds by iteration for solving the linear system in such a
-way each sub-system
+The multisplitting method proceeds by iteration for solving the linear
+system in such a way each sub-system
  \begin{equation}
  \left\{
  \begin{array}{l}
@@ -185,37 +188,69 @@ Y_l = B_l - \displaystyle\sum_{i=1,i\neq l}^{L}A_{li}X_i,
  \right.
  \label{sec03:eq03}
  \end{equation}
-is solved independently by a cluster of processors and communication are required to
-update the right-hand side vectors $Y_l$, such that the vectors $X_i$ represent the data
-dependencies between the clusters. In this work, we use the GMRES method as an inner
-iteration method for solving the sub-systems~(\ref{sec03:eq03}). It is a well-known
-iterative method which gives good performances for solving sparse linear systems in
-parallel on a cluster of processors. 
-
-It should be noted that the convergence of the inner iterative solver for the different
-linear sub-systems~(\ref{sec03:eq03}) does not necessarily involve the convergence of the
-multisplitting method. It strongly depends on the properties of the sparse linear system
-to be solved and the computing environment~\cite{o1985multi,ref18}. Furthermore, the multisplitting
-of the linear system among several clusters of processors increases the spectral radius
-of the iteration matrix, thereby slowing the convergence. In this paper, we based on the
-work presented in~\cite{huang1993krylov} to increase the convergence and improve the
-scalability of the multisplitting methods. 
-
-In order to accelerate the convergence, we implement the outer iteration of the multisplitting
-solver as a Krylov subspace method which minimizes some error function over a Krylov subspace~\cite{S96}.
-The Krylov space of the method that we used is spanned by a basis composed of the solutions issued from
-solving the $L$ splittings~(\ref{sec03:eq03})
+is solved  independently by a cluster of  processors and communication
+are required  to update the  right-hand side vectors $Y_l$,  such that
+the  vectors  $X_i$  represent   the  data  dependencies  between  the
+clusters. In this work, we use  the GMRES method as an inner iteration
+method  for  solving   the  sub-systems~(\ref{sec03:eq03}).  It  is  a
+well-known iterative method which  gives good performances for solving
+sparse linear systems in parallel on a cluster of processors.
+
+It should be noted that  the convergence of the inner iterative solver
+for  the  different  linear  sub-systems~(\ref{sec03:eq03})  does  not
+necessarily involve  the convergence of the  multisplitting method. It
+strongly depends on  the properties of the sparse  linear system to be
+solved                 and                the                computing
+environment~\cite{o1985multi,ref18}.  Furthermore,  the multisplitting
+of the  linear system among  several clusters of  processors increases
+the  spectral radius  of  the iteration  matrix,  thereby slowing  the
+convergence.  In   this  paper,  we   based  on  the   work  presented
+in~\cite{huang1993krylov} to increase  the convergence and improve the
+scalability of the multisplitting methods.
+
+In  order  to  accelerate  the  convergence, we  implement  the  outer
+iteration  of the multisplitting  solver as  a Krylov  subspace method
+which minimizes some error function over a Krylov subspace~\cite{S96}.
+The Krylov  space of  the method that  we used  is spanned by  a basis
+composed  of   successive  solutions  issued  from   solving  the  $L$
+splittings~(\ref{sec03:eq03})
  \begin{equation}
-\{x^1,x^2,\ldots,x^s\},~s\ll n,
+S=\{x^1,x^2,\ldots,x^s\},~s\leq n,
  \label{sec03:eq04}
  \end{equation}
-where for $k\in\{1,\ldots,s\}$, $x^k=[X_1^k,\ldots,X_L^k]$ is a solution of the global linear
-system. 
-%The advantage such a method is that the Krylov subspace does not need to be spanned by an orthogonal basis.
-The advantage of such a Krylov subspace is that we need neither an orthogonal basis nor synchronizations
-between the different clusters to generate this basis. 
-
-
+where   for  $k\in\{1,\ldots,s\}$,  $x^k=[X_1^k,\ldots,X_L^k]$   is  a
+solution of the  global linear system.%The advantage such a method is that the Krylov subspace does not need to be spanned by an orthogonal basis.
+The advantage  of such a  Krylov subspace is  that we need  neither an
+orthogonal basis  nor synchronizations between  the different clusters
+to generate this basis.
+
+The multisplitting method is periodically restarted every $s$ iterations
+with a new initial guess $\tilde{x}=S\alpha$ which minimizes the error
+function $\|b-Ax\|_2$ over the Krylov subspace spanned by the vectors of $S$.
+So, $\alpha$ is defined as the solution of the large overdetermined linear system
+\begin{equation}
+B\alpha=b,
+\label{sec03:eq05}
+\end{equation}
+where $B=AS$ is a dense rectangular matrix of size $n\times s$ and $s\ll n$. This leads
+us to solve the system of normal equations 
+\begin{equation}
+B^TB\alpha=B^Tb,
+\label{sec03:eq06}
+\end{equation}
+which is associated with the least squares problem
+\begin{equation}
+\text{minimize}~\|b-B\alpha\|_2,
+\label{sec03:eq07}
+\end{equation}  
+where $B^T$ denotes the transpose of the matrix $B$. Since $B$ (i.e. $AS$) and
+$b$ are split among $L$ clusters, the symmetric positive definite system~(\ref{sec03:eq06})
+is solved in parallel. Thus, an iterative method would be more appropriate than
+a direct one for solving this system. We use the parallel conjugate gradient 
+method for the normal equations CGNR~\cite{S96,refCGNR}.
+
+%%% Ecrire l'algorithme(s)
+%%% Parler des synchronisations entre proc et clusters