+{\bf example ex45/ksp to be described and commented, showing that hypre performs very poorly on this example}\r
+\begin{tabular}{|r|r|r|r|r|r|r|r|} \r
+ nb. cores & \multicolumn{2}{c|}{FGMRES/ASM} & \multicolumn{2}{c|}{TSIRM CGLS/ASM} & gain& \multicolumn{2}{c|}{FGMRES/HYPRE} \\ \r
+\cline{2-5} \cline{7-8}\r
+ & Time & \# Iter. & Time & \# Iter. & & Time & \# Iter. \\\hline \hline\r
+ 512 & 5.54 & 685 & 2.5 & 570 & 2.21 & 128.9 & 9 \\\r
+ 2048 & 14.95 & 1,560 & 4.32 & 746 & 3.48 & 335.7 & 9 \\\r
+ 4096 & 25.13 & 2,369 & 5.61 & 859 & 4.48 & $>$1000 & -- \\\r
+ 8192 & 44.35 & 3,197 & 7.6 & 1,083 & 5.84 & $>$1000 & -- \\\r
+\caption{Comparison of FGMRES and TSIRM for ex45 of PETSc/KSP with two preconditioners (ASM and HYPRE), with 5,000 components per core on Curie ($\epsilon_{tsirm}=10^{-10}$, $max\_iter_{kryl}=30$, $s=12$, $max\_iter_{ls}=15$, $\epsilon_{ls}=10^{-40}$). Time is expressed in seconds.}\r
\subsection{Parallel nonlinear problems}\r
With PETSc, linear solvers are used inside nonlinear solvers. The SNES\r
ex20. In ex14, the code solves the Bratu (SFI - solid fuel ignition) nonlinear\r
partial differential equations in 3 dimensions. In ex20, the code solves a\r
3-dimensional radiative transport test problem. For more details on these examples,\r
-interested readers are invited to see the code in the PETSc examples.\r
-In Table~\ref{tab:07} we report the result of our experiments for the example\r
-ex14. \r
+interested readers are invited to see the code in the PETSc examples. For both\r
+of these examples, a weak scaling case is chosen in which each processor holds\r
+approximately 100,000 components.\r
+In Table~\ref{tab:07} we report the results of our experiments for the example\r
+ex14 with the block Jacobi preconditioner. For TSIRM, the CGLS algorithm is used\r
+to solve the minimization step. In this table, we can see that the number of\r
+iterations used by the linear solver is smaller with TSIRM than with FGMRES.\r
+Consequently, the execution times are smaller with TSIRM. The gain of TSIRM\r
+over FGMRES lies between 6 and 7.7. The parameters of TSIRM are given in the\r
+caption of the table.\r
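+As a quick check, and assuming that the gain column is simply the ratio of the\r
+two execution times (an interpretation that is consistent with the reported\r
+values but is not stated explicitly), the row for 1,024 cores in\r
+Table~\ref{tab:07} gives the following, where $T_{FGMRES}$ and $T_{TSIRM}$ are\r
+notations introduced here only for the two execution times:\r
+\[\r
+  \mbox{gain} = \frac{T_{FGMRES}}{T_{TSIRM}} = \frac{159.52}{26.34} \approx 6.06 .\r
+\]\r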
nb. cores & \multicolumn{2}{c|}{FGMRES/BJAC} & \multicolumn{2}{c|}{TSIRM CGLS/BJAC} & gain \\ \r
& Time & \# Iter. & Time & \# Iter. & \\\hline \hline\r
- 1024 & 159.52 & 11,584 & 26.34 & 1,563 & 6.06 \\\r
- 2048 & 226.24 & 16,459 & 37.23 & 2,248 & 6.08\\\r
- 4096 & 391.21 & 27,794 & 50.93 & 2,911 & 7.69\\\r
- 8192 & 543.23 & 37,770 & 79.21 & 4,324 & 6.86 \\\r
+ 1,024 & 159.52 & 11,584 & 26.34 & 1,563 & 6.06 \\\r
+ 2,048 & 226.24 & 16,459 & 37.23 & 2,248 & 6.08\\\r
+ 4,096 & 391.21 & 27,794 & 50.93 & 2,911 & 7.69\\\r
+ 8,192 & 543.23 & 37,770 & 79.21 & 4,324 & 6.86 \\\r
+In Table~\ref{tab:08}, the results of the experiments with the example ex20 are\r
+reported. The block Jacobi preconditioner is used again, with CGLS to solve\r
+the minimization step for TSIRM. For this example, we can observe that the number\r
+of iterations for FGMRES increases drastically when the number of cores\r
+increases. With TSIRM, the number of iterations is initially\r
+very small compared to that of FGMRES and, when the number of cores increases,\r
+the number of iterations grows more slowly with TSIRM than with FGMRES. For\r
+this example, the gain of TSIRM over FGMRES ranges from about 8 with 1,024\r
+cores to more than 16 with 8,192 cores.\r
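+To quantify this difference in growth, the iteration counts reported for ex20\r
+can be compared between 1,024 and 8,192 cores. This ratio is computed here only\r
+as an illustration and is not a measure used elsewhere in the text:\r
+\[\r
+  \frac{187{,}626}{48{,}732} \approx 3.85 \quad \mbox{(FGMRES)}, \qquad\r
+  \frac{9{,}000}{5{,}087} \approx 1.77 \quad \mbox{(TSIRM)} .\r
+\]\r
+In other words, multiplying the number of cores by 8 multiplies the number of\r
+FGMRES iterations by almost 4, but the number of TSIRM iterations by less than 2.\r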
nb. cores & \multicolumn{2}{c|}{FGMRES/BJAC} & \multicolumn{2}{c|}{TSIRM CGLS/BJAC} & gain \\ \r
& Time & \# Iter. & Time & \# Iter. & \\\hline \hline\r
- 1024 & 667.92 & 48,732 & 81.65 & 5,087 & 8.18 \\\r
- 2048 & 966.87 & 77,177 & 90.34 & 5,716 & 10.70\\\r
- 4096 & 1,742.31 & 124,411 & 119.21 & 6,905 & 14.61\\\r
- 8192 & 2,739.21 & 187,626 & 168.9 & 9,000 & 16.22\\\r
+ 1,024 & 667.92 & 48,732 & 81.65 & 5,087 & 8.18 \\\r
+ 2,048 & 966.87 & 77,177 & 90.34 & 5,716 & 10.70\\\r
+ 4,096 & 1,742.31 & 124,411 & 119.21 & 6,905 & 14.61\\\r
+ 8,192 & 2,739.21 & 187,626 & 168.9 & 9,000 & 16.22\\\r
\subsection{Influence of parameters for TSIRM}\r
+In this section we present experimental results that study the influence of several parameters on the TSIRM algorithm. We conducted experiments on $16$ cores to solve 3D problems with $200{,}000$ components per core. We solved nonlinear problems taken from the PETSc examples. We fixed some parameters of the TSIRM algorithm as follows: the nonlinear systems are solved with a precision of $10^{-8}$, the block Jacobi preconditioner is used, the tolerance threshold $\epsilon_{tsirm}$ is $10^{-8}$, the maximum number of iterations $max\_iter_{tsirm}$ is set to $10{,}000$, the FGMRES method is used as the inner solver with a tolerance threshold $\epsilon_{kryl}=10^{-10}$, and the least-squares problem is solved with a precision $\epsilon_{ls}=10^{-40}$ in the minimization process.\r
+%time mpirun ../ex48 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 10 \r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_cgls_iter_total}\r
+\caption{Total number of iterations using two different methods for the minimization: LSQR and CGLS.}\r
+\label{fig:cgls-iter} \r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_cgls_time}\r
+\caption{Execution time in seconds using two different methods for the minimization: LSQR and CGLS.}\r
+\label{fig:cgls-time} \r
+%time mpirun ../ex35 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 38 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 10\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_inner_restarts_iter_total}\r
+\caption{Total number of iterations as the restart parameter of the inner FGMRES solver varies.}\r
+\label{fig:inner_restarts_iter_total} \r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_inner_restarts_time}\r
+\caption{Execution time in seconds as the restart parameter of the inner FGMRES solver varies.}\r
+\label{fig:inner_restarts_time} \r
+%time mpirun ../ex14 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 1000 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 10\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_max_inner_iter}\r
+\caption{Total number of iterations as the number of inner iterations varies.}\r
+\label{fig:max_inner_iter} \r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_max_inner_time}\r
+\caption{Execution time in seconds as the number of inner iterations varies.}\r
+\label{fig:max_inner_time} \r
+%time mpirun ../ex14 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 5 -ksp_tsirm_size_ls 10\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_maxiter_ls_iter}\r
+\caption{Total number of iterations as the number of iterations in the minimization process varies.}\r
+\label{fig:maxiter_ls_iter} \r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_maxiter_ls_time}\r
+\caption{Execution time in seconds as the number of iterations in the minimization process varies.}\r
+\label{fig:maxiter_ls_time} \r
+%time mpirun ../ex14 -da_grid_x 147 -da_grid_y 147 -da_grid_z 147 -snes_rtol 1.e-8 -snes_monitor -ksp_type tsirm -ksp_pc_type bjacobi -pc_type ksp -ksp_tsirm_tol 1e-8 -ksp_tsirm_maxiter 10000 -ksp_ksp_type fgmres -ksp_tsirm_max_inner_iter 30 -ksp_tsirm_inner_restarts 30 -ksp_tsirm_inner_tol 1e-10 -ksp_tsirm_cgls 0 -ksp_tsirm_tol_ls 1.e-40 -ksp_tsirm_maxiter_ls 15 -ksp_tsirm_size_ls 2\r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_size_ls_iter}\r
+\caption{Total number of iterations as the size of the least-squares problem in the minimization process varies.}\r
+\label{fig:size_ls_iter} \r
+ \includegraphics[angle=-90,width=0.5\textwidth]{ksp_tsirm_size_ls_time}\r
+\caption{Execution time in seconds as the size of the least-squares problem in the minimization process varies.}\r
+\label{fig:size_ls_time} \r