\usepackage{multirow}\r
\usepackage{graphicx}\r
\usepackage{url}\r
+\usepackage{dsfont}\r
+\r
\r
\def\newblock{\hskip .11em plus .33em minus .07em}\r
\r
\section{Convergence results}\r
\label{sec:04}\r
\r
+%%NEW\r
+\r
+\r
+We suppose in this section that GMRES($m$) is used as solver in the TSIRM algorithm applied on a complex matrix $A$.\r
+Let us denote $A^\ast$ the conjugate transpose of $A$, and let $\mathfrak{R}(A)=\dfrac{1}{2} \left( A + A^\ast\right)$, $\mathfrak{I}(A)=\dfrac{1}{2i} \left( A - A^\ast\right)$. \r
+\r
+\subsection{$\mathfrak{R}(A)$ is positive}\r
+\r
+\begin{proposition}\r
+\label{positiveConvergent}\r
+If $\mathfrak{R}(A)$ is positive, then the TSIRM algorithm is convergent.\r
+\end{proposition}\r
+\r
+\r
+\begin{proof}\r
+If $\mathfrak{R}(A)$ is positive, then even if $A$ is complex, it is possible to state that \r
+the GMRES algorithm is convergent, see, \emph{e.g.},~\cite{Huang89}. In particular, its residual norm\r
+decreases to zero.\r
+\r
+At each iterate of the TSIRM algorithm, either a GMRES iteration is realized or a least square\r
+resolution (to find the minimum of $||b-Ax||_2$ is achieved on the linear span of the iterated approximation vectors \r
+$span\left(x_{k-s+1}, x_{k-s}+2, \hdots, x_{k} \right)$\r
+of the last GMRES stage,\r
+where\r
+$\operatorname{span}(S) = \left \{ {\sum_{i=1}^k \lambda_i v_i \Big| k \in \mathbb{N}, v_i \in S, \lambda _i \in \mathbb{R}} \right \}$.\r
+\r
+Obviously, the minimum of $||b-Ax||_2$ on the set $span\left(x_{k-s+1}, x_{k-s}+2, \hdots, x_{k} \right)$ \r
+is lower than or equal to $||b-Ax_k||_2$, which is the last obtained GMRES-residual norm. So we can\r
+conclude that the intermediate stage of least square resolution inserted into the GMRES algorithm\r
+does not break the decreasing to zero of the GMRES-residual norm. \r
+\r
+In other words, the TSIRM algorithm is convergent.\r
+\end{proof}\r
+\r
\r
-We can now claim that,\r
+Regarding the convergence speed, we can claim that,\r
\begin{proposition}\r
\label{prop:saad}\r
-If $A$ is either a definite positive or a positive matrix and GMRES($m$) is used as a solver, then the TSIRM algorithm is convergent. \r
+If $A$ is a positive matrix, then the convergence of the \r
+TSIRM algorithm is linear. \r
\r
-Furthermore, let $r_k$ be the\r
-$k$-th residue of TSIRM, then\r
+Furthermore, let $r_k$ be the $k$-th residue of TSIRM, then\r
we have the following boundaries:\r
-\begin{itemize}\r
-\item when $A$ is positive:\r
\begin{equation}\r
||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0|| ,\r
\end{equation}\r
-where $M$ is the symmetric part of $A$, $\alpha = \lambda_{min}(M)^2$ and $\beta = \lambda_{max}(A^T A)$;\r
-\item when $A$ is positive definite:\r
-\begin{equation}\r
-\|r_k\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{km/2} \|r_0\|.\r
-\end{equation}\r
-\end{itemize}\r
-%In the general case, where A is not positive definite, we have\r
-%$\|r_n\| \le \inf_{p \in P_n} \|p(A)\| \le \kappa_2(V) \inf_{p \in P_n} \max_{\lambda \in \sigma(A)} |p(\lambda)| \|r_0\|, .$\r
+where $M$ is the symmetric part of $A$, $\alpha = \lambda_{min}(M)^2$ and $\beta = \lambda_{max}(A^T A)$.\r
\end{proposition}\r
\r
\begin{proof}\r
-Let us first recall that the residue is under control when considering the GMRES algorithm on a positive definite matrix, and it is bounded as follows:\r
-\begin{equation*}\r
-\|r_k\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{k/2} \|r_0\| .\r
-\end{equation*}\r
-Additionally, when $A$ is a positive real matrix with symmetric part $M$, then the residual norm provided at the $m$-th step of GMRES satisfies:\r
+Let us first recall that, when $A$ is a positive real matrix with symmetric part $M$, then the residual norm provided at the $m$-th step of GMRES satisfies:\r
\begin{equation*}\r
||r_m|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_0|| ,\r
\end{equation*}\r
-where $\alpha$ and $\beta$ are defined as in Proposition~\ref{prop:saad}, which proves \r
-the convergence of GMRES($m$) for all $m$ under such assumptions regarding $A$.\r
+where $\alpha$ and $\beta$ are defined as in Proposition~\ref{prop:saad}.\r
These well-known results can be found, \emph{e.g.}, in~\cite{Saad86}.\r
\r
We will now prove by a mathematical induction that, for each $k \in \mathbb{N}^\ast$, \r
-$||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{mk}{2}} ||r_0||$ when $A$ is positive, and $\|r_k\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{km/2} \|r_0\|$ when $A$ is positive definite.\r
+$||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{mk}{2}} ||r_0||$ when $A$ is positive.\r
\r
The base case is obvious, as for $k=1$, the TSIRM algorithm simply consists in applying GMRES($m$) once, leading to a new residual $r_1$ that follows the inductive hypothesis due to the results recalled above.\r
\r
-Suppose now that the claim holds for all $m=1, 2, \hdots, k-1$, that is, $\forall m \in \{1,2,\hdots, k-1\}$, $||r_m|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$ in the positive case, and $\|r_k\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{km/2} \|r_0\|$ in the definite positive one.\r
+Suppose now that the claim holds for all $m=1, 2, \hdots, k-1$, that is, $\forall m \in \{1,2,\hdots, k-1\}$, $||r_m|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$.\r
We will show that the statement holds too for $r_k$. Two situations can occur:\r
\begin{itemize}\r
-\item If $k \not\equiv 0 ~(\textrm{mod}\ m)$, then the TSIRM algorithm consists in executing GMRES once. In that case and by using the inductive hypothesis, we obtain either $||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$ if $A$ is positive, or $\|r_k\| \leqslant \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{m/2} \|r_{k-1}\|$ $\leqslant$ $\left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{km/2} \|r_{0}\|$ in the positive definite case.\r
+\item If $k \not\equiv 0 ~(\textrm{mod}\ m)$, then the TSIRM algorithm consists in executing GMRES once. In that case and by using the inductive hypothesis, we obtain either $||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$.\r
\item Else, the TSIRM algorithm consists in two stages: a first GMRES($m$) execution leads to a temporary $x_k$ whose residue satisfies:\r
-\begin{itemize}\r
-\item $||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$ in the positive case, \r
-\item $\|r_k\| \leqslant \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{m/2} \|r_{k-1}\|$ $\leqslant$ $\left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{km/2} \|r_{0}\|$ in the positive definite one,\r
-\end{itemize}\r
+$$||r_k|| \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{m}{2}} ||r_{k-1}||\leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||$$\r
and a least squares resolution.\r
Let $\operatorname{span}(S) = \left \{ {\sum_{i=1}^k \lambda_i v_i \Big| k \in \mathbb{N}, v_i \in S, \lambda _i \in \mathbb{R}} \right \}$ be the linear span of a set of real vectors $S$. So,\\\r
$\min_{\alpha \in \mathbb{R}^s} ||b-R\alpha ||_2 = \min_{\alpha \in \mathbb{R}^s} ||b-AS\alpha ||_2$\r
& \leqslant \min_{\lambda \in \mathbb{R}} ||b-\lambda Ax_{k} ||_2\\\r
& \leqslant ||b-Ax_{k}||_2\\\r
& = ||r_k||_2\\\r
-& \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||, \textrm{ if $A$ is positive,}\\\r
-& \leqslant \left( 1-\frac{\lambda_{\mathrm{min}}^2(1/2(A^T + A))}{ \lambda_{\mathrm{max}}(A^T A)} \right)^{km/2} \|r_{0}\|, \textrm{ if $A$ is}\\\r
-& \textrm{positive definite,} \r
+& \leqslant \left(1-\dfrac{\alpha}{\beta}\right)^{\frac{km}{2}} ||r_0||, \\\r
\end{array}$\r
\end{itemize}\r
which concludes the induction and the proof.\r
\end{proof}\r
\r
+\r
+\r
+\subsection{$\mathfrak{R}(A)$ is positive definite}\r
+\r
+\begin{proposition}\r
+\label{prop2}\r
+Convergence of the TSIRM algorithm is at least linear when $\mathfrak{R}(A)$ is \r
+positive definite. Furthermore, the rate of convergence is lower \r
+than $$\min\left( \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{m}{2}}; \r
+\left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{m}{2}}\right) ,$$\r
+where ${\lambda_{min}^{X}}$ (resp. ${\lambda_{max}^{X}}$) is the lowest (resp. largest) eigenvalue of matrix $X$.\r
+\end{proposition}\r
+\r
+\r
+\begin{proof}\r
+If $\mathfrak{R}(A)$ is positive definite, then it is positive, and so the TSIRM algorithm\r
+is convergent due to Proposition~\ref{positiveConvergent}.\r
+\r
+Furthermore, as stated in the proof of Proposition~\ref{positiveConvergent}, the GMRES residue is under control \r
+when $\mathfrak{R}(A)$ is positive. More precisely, it has been proven in the literature that the residual norm \r
+provided at the $m$-th step of GMRES satisfies:\r
+\begin{enumerate}\r
+\item $||r_m|| \leqslant \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{mk}{2}} ||r_0||$, see, \emph{e.g.},~\cite{citeulike:2951999}, \r
+\item $||r_m|| \leqslant \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{mk}{2}} ||r_0||$, see~\cite{ANU:137201},\r
+\end{enumerate}\r
+which proves the convergence of GMRES($m$) for all $m$ under such assumptions regarding $A$.\r
+\r
+We will now prove by a mathematical induction, and following the same canvas than in the proof of Prop.~\ref{positiveConvergent}, that: for each $k \in \mathbb{N}^\ast$, the TSIRM-residual norm satisfies\r
+\begin{equation}\r
+\label{induc}\r
+\begin{array}{ll}\r
+||r_k|| \leqslant & \min\left( \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{m}{2}}; \right. \\\r
+& \left. \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{m}{2}}\right) ||r_0||\r
+\end{array}\r
+\end{equation}\r
+when $A$ is positive definite.\r
+\r
+\r
+The base case is obvious, as for $k=1$, the TSIRM algorithm simply consists in applying GMRES($m$) once, leading to a new residual $r_1$ that follows the inductive hypothesis due to the results recalled in the items listed above.\r
+\r
+Suppose now that the claim holds for all $u=1, 2, \hdots, k-1$, that is, $\forall u \in \{1,2,\hdots, k-1\}$, $||r_u|| \leqslant \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{mu}{2}} ||r_0||$.\r
+We will show that the statement holds too for $r_k$. Two situations can occur:\r
+\begin{itemize}\r
+\item If $k \not\equiv 0 ~(\textrm{mod}\ m)$, then the TSIRM algorithm consists in executing GMRES once. In that case and by using the inductive hypothesis, we obtain \r
+$||r_k|| \leqslant \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{m}{2}} \leqslant \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{mk}{2}} ||r_0||$, due to~\cite{citeulike:2951999}. Furthermore, we have too that: $||r_k|| \leqslant \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{m}{2}} ||r_{k-1}|| \leqslant \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{mk}{2}} ||r_0||$, as proven in~\cite{ANU:137201} and by using the inductive hypothesis. So we can conclude that \r
+$$\begin{array}{ll}||r_k|| \leqslant & \min\left( \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{mk}{2}}; \right. \\\r
+& \left. \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{mk}{2}}\right) \times ||r_0|| \r
+\end{array}.$$\r
+\r
+\item Else, the TSIRM algorithm consists in two stages: a first GMRES($m$) execution leads to a temporary $x_k$ whose residue satisfies, following the previous item:\r
+$$\begin{array}{ll}\r
+||r_k|| & \leqslant \min\left( \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{m}{2}}; \right. \\\r
+& \left. \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{m}{2}}\right) \times ||r_{k-1}||\\\r
+ & \leqslant \min\left( \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{mk}{2}}; \right. \\\r
+& \left. \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{mk}{2}}\right) \times ||r_0|| \r
+\end{array}$$\r
+and the least squares resolution of $\min_{\alpha \in \mathbb{R}^s} ||b-R\alpha ||_2$.\r
+\r
+Let $\operatorname{span}(S) = \left \{ {\sum_{i=1}^k \lambda_i v_i \Big| k \in \mathbb{N}, v_i \in S, \lambda _i \in \mathbb{R}} \right \}$ be the linear span of a set of real vectors $S$, as defined previously. So,\\\r
+$\min_{\alpha \in \mathbb{R}^s} ||b-R\alpha ||_2 = \min_{\alpha \in \mathbb{R}^s} ||b-AS\alpha ||_2$\r
+\r
+$\begin{array}{ll}\r
+& = \min_{x \in span\left(S_{k-s+1}, S_{k-s+2}, \hdots, S_{k} \right)} ||b-AS\alpha ||_2\\\r
+& = \min_{x \in span\left(x_{k-s+1}, x_{k-s}+2, \hdots, x_{k} \right)} ||b-AS\alpha ||_2\\\r
+& \leqslant \min_{x \in span\left( x_{k} \right)} ||b-Ax ||_2\\\r
+& \leqslant \min_{\lambda \in \mathbb{R}} ||b-\lambda Ax_{k} ||_2\\\r
+& \leqslant ||b-Ax_{k}||_2\\\r
+& = ||r_k||_2\\\r
+& \leqslant \min\left( \left(1- \dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{ \lambda_{min}^{\mathfrak{R}(A)} \lambda_{max}^{\mathfrak{R}(A)} + {\lambda_{max}^{\mathfrak{I}(A)}}^2}\right)^{\frac{mk}{2}}; \right. \\\r
+& \left. \left(1-\dfrac{{\lambda_{min}^{\mathfrak{R}(A)}}^2}{||A||^2}\right)^{\frac{mk}{2}}\right) \times ||r_0|| \r
+\end{array} .$\r
+\end{itemize}\r
+due to the inductive hypothesis. \r
+So the statement of Equation~\eqref{induc} holds too for the $k$-th iterate, which concludes the induction and the proof.\r
+\end{proof}\r
+\r
+\subsection{A last linear convergence}\r
+\r
+\r
+\begin{proposition}\r
+Let us define the field of values of $A$ by \r
+$$\mathfrak{F}(A) = \left\{ \dfrac{x^\ast A x}{x^\ast x}, x \in \mathds{C}^n\setminus \{0\} \right\} .$$\r
+\r
+Then if $\mathfrak{F}(A)$ is included into a closed ball of radius $r$ and center $c$,\r
+which does not contain the origin, then the convergence of the TSIRM algorithm is at least linear. \r
+\r
+More precisely, the rate of convergence is lower \r
+than $2 \dfrac{r}{|c|}$.\r
+\end{proposition}\r
+\r
+\begin{proof}\r
+This inequality comes from the fact that, in the conditions of the proposition, the GMRES residue \r
+satisfies the inequality: $|r_k| \leqslant 2 \dfrac{r}{|c|}^k |r_0|$. An induction inspired by\r
+the proofs of Propositions~\ref{prop:saad} and~\ref{prop2} can transfer this inequality to the\r
+TSIRM residue.\r
+\end{proof}\r
+\r
+\r
+\r
Remark that a similar proposition can be formulated at each time\r
the given solver satisfies an inequality of the form $||r_n|| \leqslant \mu^n ||r_0||$,\r
with $|\mu|<1$. Furthermore, it is \emph{a priori} possible in some particular cases \r
regarding $A$, \r
that the proposed TSIRM converges while the GMRES($m$) does not.\r
\r
+%%ENDNEW\r
+\r
+\r
%%%*********************************************************\r
%%%*********************************************************\r
\section{Experiments using PETSc}\r
\r
%%NEW\r
\r
+{\bf example ex45/ksp à décrire et commenter en montrant que hypre est pourri avec cet exemple}\r
+\r
+\begin{table*}[htbp]\r
+\begin{center}\r
+\begin{tabular}{|r|r|r|r|r|r|r|r|} \r
+\hline\r
+\r
+ nb. cores & \multicolumn{2}{c|}{FGMRES/ASM} & \multicolumn{2}{c|}{TSIRM CGLS/ASM} & gain& \multicolumn{2}{c|}{FGMRES/HYPRE} \\ \r
+\cline{2-5} \cline{7-8}\r
+ & Time & \# Iter. & Time & \# Iter. & & Time & \# Iter. \\\hline \hline\r
+ 512 & 5.54 & 685 & 2.5 & 570 & 2.21 & 128.9 & 9 \\\r
+ 2048 & 14.95 & 1,560 & 4.32 & 746 & 3.48 & 335.7 & 9 \\\r
+ 4096 & 25.13 & 2,369 & 5.61 & 859 & 4.48 & >1000 & -- \\\r
+ 8192 & 44.35 & 3,197 & 7.6 & 1083 & 5.84 & >1000 & -- \\\r
+\r
+\hline\r
+\r
+\end{tabular}\r
+\caption{Comparison of FGMRES and TSIRM for ex45 of PETSc/KSP with two preconditioner (ASM and HYPRE) having 5,000 components per core on Curie ($\epsilon_{tsirm}=1e-10$, $max\_iter_{kryl}=30$, $s=12$, $max\_iter_{ls}=15$,$\epsilon_{ls}=1e-40$), time is expressed in seconds.}\r
+\label{tab:06}\r
+\end{center}\r
+\end{table*}\r
+\r
+\r
\subsection{Parallel nonlinear problems}\r
\r
With PETSc, linear solvers are used inside nonlinear solvers. The SNES\r
ex20. In ex14, the code solves the Bratu (SFI - solid fuel ignition) nonlinear\r
partial difference equations in 3 dimension. In ex20, the code solves a 3\r
dimension radiative transport test problem. For more details on these examples,\r
-interested readers are invited to see the code in the PETSc examples.\r
-\r
-In Table~\ref{tab:07} we report the result of our experiments for the example\r
-ex14. \r
+interested readers are invited to see the code in the PETSc examples. For both\r
+these examples, a weak scaling case is chosen where processors have\r
+approximately a number of components equals to 100,000.\r
+\r
+In Table~\ref{tab:07} we report the result of our experiments for the example\r
+ex14 with the block Jacobi preconditioner. For TSIRM the CGLS algorithm is used\r
+to solve the minimization step. In this table, we can see that the number of\r
+iterations used by the linear solver is smaller with TSIRM compared with FGMRES.\r
+Consequently the execution times are smaller with TSIRM. The gain between TSIRM\r
+and FGMRES is around 6 and 7. The parameters of TSIRM are expressed in the\r
+caption of the table.\r
\r
\begin{table*}[htbp]\r
\begin{center}\r
nb. cores & \multicolumn{2}{c|}{FGMRES/BJAC} & \multicolumn{2}{c|}{TSIRM CGLS/BJAC} & gain \\ \r
\cline{2-5}\r
& Time & \# Iter. & Time & \# Iter. & \\\hline \hline\r
- 1024 & 159.52 & 11,584 & 26.34 & 1,563 & 6.06 \\\r
- 2048 & 226.24 & 16,459 & 37.23 & 2,248 & 6.08\\\r
- 4096 & 391.21 & 27,794 & 50.93 & 2,911 & 7.69\\\r
- 8192 & 543.23 & 37,770 & 79.21 & 4,324 & 6.86 \\\r
+ 1,024 & 159.52 & 11,584 & 26.34 & 1,563 & 6.06 \\\r
+ 2,048 & 226.24 & 16,459 & 37.23 & 2,248 & 6.08\\\r
+ 4,096 & 391.21 & 27,794 & 50.93 & 2,911 & 7.69\\\r
+ 8,192 & 543.23 & 37,770 & 79.21 & 4,324 & 6.86 \\\r
\r
\hline\r
\r
\end{center}\r
\end{table*}\r
\r
+In Table~\cite{tab:08}, the results of the experiments with the example ex20 are\r
+reported. The block Jacobi preconditioner has also been used and CGLS to solve\r
+the minimization step for TSIRM. For this example, we can observ that the number\r
+of iterations for FMGRES increase drastically when the number of cores\r
+increases. With TSIRM, we can see that the number of iterations is initially\r
+very small compared to the FGMRES ones and when the number of cores increase,\r
+the number of iterations increases slighther with TSIRM than with FGMRES. For\r
+this example, the gain between TSIRM and FGMRES ranges between 8 with 1,024\r
+cores to more than 16 with 8,192 cores.\r
\r
\begin{table*}[htbp]\r
\begin{center}\r
nb. cores & \multicolumn{2}{c|}{FGMRES/BJAC} & \multicolumn{2}{c|}{TSIRM CGLS/BJAC} & gain \\ \r
\cline{2-5}\r
& Time & \# Iter. & Time & \# Iter. & \\\hline \hline\r
- 1024 & 667.92 & 48,732 & 81.65 & 5,087 & 8.18 \\\r
- 2048 & 966.87 & 77,177 & 90.34 & 5,716 & 10.70\\\r
- 4096 & 1,742.31 & 124,411 & 119.21 & 6,905 & 14.61\\\r
- 8192 & 2,739.21 & 187,626 & 168.9 & 9,000 & 16.22\\\r
+ 1,024 & 667.92 & 48,732 & 81.65 & 5,087 & 8.18 \\\r
+ 2,048 & 966.87 & 77,177 & 90.34 & 5,716 & 10.70\\\r
+ 4,096 & 1,742.31 & 124,411 & 119.21 & 6,905 & 14.61\\\r
+ 8,192 & 2,739.21 & 187,626 & 168.9 & 9,000 & 16.22\\\r
\r
\hline\r
\r
\r
\r
\r
+\r
+\r
+\r
+%%NEW\r
\subsection{Influence of parameters for TSIRM}\r
+In this section we present some experimental results in order to study the influence of some parameters on the TSIRM algorithm. We conducted experiments on $16$ cores to solve 3D problems of size $200,000$ components per core. We solved nonlinear problems token from examples of PETSc. We fixed some parameters of the TSIRM algorithm as follows: the nonlinear systems are solved with a precision of $10^{-8}$, block Jacobi preconditioner is used, the tolerance threshold $\epsilon_{tsirm}$ is $10^{-8}$ , the maximum number of iterations $max\_iter_{tsirm}$ is set to $10,000$ iterations, the FGMRES method is used as the inner solver with a tolerance threshold $\epsilon_{kryl}=10^{-10}$ and the least-squares problem is solved with a precision $\epsilon_{ls}=10^{-40}$ in the minimization process.\r
+\r
+%time mpirun ../ex48 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 10 \r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_cgls_iter_total}\r
+\caption{Number of total iterations using two different methods for the minimization: LSQR and CGLS.}\r
+\label{fig:cgls-iter} \r
+\end{figure}\r
+\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_cgls_time}\r
+\caption{Execution time in seconds using two different methods for the minimization: LSQR and CGLS.}\r
+\label{fig:cgls-time} \r
+\end{figure}\r
+\r
+%time mpirun ../ex35 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 38 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 10\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_inner_restarts_iter_total}\r
+\caption{Number of total iterations with variation of restarts in the inner solver FGMRES.}\r
+\label{fig:inner_restarts_iter_total} \r
+\end{figure}\r
+\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_inner_restarts_time}\r
+\caption{Execution time in seconds with variation of restarts in the inner solver FGMRES.}\r
+\label{fig:inner_restarts_time} \r
+\end{figure}\r
+\r
+%time mpirun ../ex14 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 1000 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 10\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_max_inner_iter}\r
+\caption{Number of total iterations with variation of number of inner iterations.}\r
+\label{fig:max_inner_iter} \r
+\end{figure}\r
+\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_max_inner_time}\r
+\caption{Execution time in seconds with variation of number of inner iterations.}\r
+\label{fig:max_inner_time} \r
+\end{figure}\r
+\r
+%time mpirun ../ex14 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 5 -ksp_tsirm_size_ls 10\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_maxiter_ls_iter}\r
+\caption{Number of total iterations with variation of number of iterations in the minimization process.}\r
+\label{fig:maxiter_ls_iter} \r
+\end{figure}\r
+\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_maxiter_ls_time}\r
+\caption{Execution time in seconds with variation of number of iterations in the minimization process.}\r
+\label{fig:maxiter_ls_time} \r
+\end{figure}\r
+\r
+%time mpirun ../ex14 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 2\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_size_ls_iter}\r
+\caption{Number of total iterations with variation of the size of the least-squares problem in the minimization process.}\r
+\label{fig:size_ls_iter} \r
+\end{figure}\r
+\r
+\begin{figure}[htbp]\r
+\centering\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_size_ls_time}\r
+\caption{Execution time in seconds with variation of the size of the least-squares problem in the minimization process.}\r
+\label{fig:size_ls_time} \r
+\end{figure}\r
+\r
+%%ENDNEW\r
\r
\r
\r
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\r
\r
\bibliography{biblio}\r
-\bibliographystyle{unsrt}\r
-\bibliographystyle{alpha}\r
+\bibliographystyle{plain}\r
+%\bibliographystyle{alpha}\r
\r
\end{document}\r