From 22dbef42c491b27f1b290799ec5827db3be0bb45 Mon Sep 17 00:00:00 2001
From: raphael couturier
Date: Fri, 25 Apr 2014 17:40:28 +0200
Subject: [PATCH] suite

---
 krylov_multi.tex | 56 ++++++++++++++++++++----------------------------
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/krylov_multi.tex b/krylov_multi.tex
index e51e232..a9d498f 100644
--- a/krylov_multi.tex
+++ b/krylov_multi.tex
@@ -272,23 +272,28 @@ it.
 In the following we presented some experiments we could achieved out on the
 Hector architecture, the previous UK's high-end computing resource, funded by
 the UK Research Councils, which has been stopped in the early 2014.
-In the experiments we report the size of the 3D poisson considered\LZK[]{Suite\dots ?}
-
-
-The first column shows the size of the problem The size is chosen in order to
-have approximately 50,000 components per core. The second column represents the
+Table~\ref{tab1} shows the result of the experiments. The first column shows
+the size of the 3D Poisson problem. The size is chosen in order to have
+approximately 50,000 components per core. The second column represents the
 number of cores used. In parenthesis, there is the decomposition used for the
 Krylov multisplitting. The third column and the sixth column respectively show
 the execution time for the GMRES and the Kyrlow multisplitting code. The fourth
-and the seventh column describes the number of iterations. For the
+and the seventh column describes the number of iterations. For the
 multisplitting code, the total number of inner iterations is represented in
-parenthesis.
+parenthesis. For the GMRES code (alone and in the multisplitting version) the
+restart parameter is fixed to 16. The precision of the GMRES version is fixed to
+1e-6. For the multisplitting, there are two precisions, one for the external
+solver which is fixed to 1e-6 and another one for the inner solver (GMRES) which
+is fixed to 1e-10. It should be noted that a high precision is used but we also
+fixed a maximum number of iterations for each internal step. In practise, we
+limit the number of internal step to 10. So an internal iteration is finished
+when the precision is reached or when the maximum internal number of iterations
+is reached.
+
+
- We also give the other parameters: the restart for the GRMES method....\\
-\LZK{La seule remarque que j'ai pu tirée des deux tableaux c'est le fait qu'il y a plus de procs dans un cluster pour 2x4096 et c'est pour cette configuration qu'on a un bon speedup avec préconditionnement!!! Mais je ne sais pas toujours pourquoi?}
-\begin{table}[p]
+\begin{table}[htbp]
 \begin{center}
 \begin{tabular}{|c|c||c|c|c||c|c|c||c|}
 \hline
@@ -296,7 +301,8 @@ parenthesis.
 \cline{3-8}
 & & Time (s) & nb Iter. & $\Delta$ & Time (s)& nb Iter. & $\Delta$ & \\
 \hline
-
+$468^3$ & 2048 (2x1024) & 299.7 & 41,028 & 5.02e-8 & 48.4 & 691(6,146) & 8.24e-08 & 6.19 \\
+\hline
 $590^3$ & 4096 (2x2048) & 433.1 & 55,494 & 4.92e-7 & 74.1 & 1,101(8,211) & 6.62e-08 & 5.84 \\
 \hline
 $743^3$ & 8192 (2x4096) & 704.4 & 87,822 & 4.80e-07 & 151.2 & 3,061(14,914) & 5.87e-08 & 4.65 \\
@@ -305,33 +311,17 @@ $743^3$ & 8192 (4x2048) & 704.4 & 87,822 & 4.80e-07 & 110.3 & 1
 \hline
 \end{tabular}
-\caption{Results without preconditioner}
+\caption{Results}
 \label{tab1}
 \end{center}
 \end{table}
-\begin{table}[p]
-\begin{center}
-\begin{tabular}{|c|c||c|c|c||c|c|c||c|}
-\hline
-\multirow{2}{*}{Pb size}&\multirow{2}{*}{Nb. cores} & \multicolumn{3}{c||}{GMRES} & \multicolumn{3}{c||}{Krylov Multisplitting} & \multirow{2}{*}{Ratio}\\
- \cline{3-8}
- & & Time (s) & nb Iter. & $\Delta$ & Time (s)& nb Iter. & $\Delta$ & \\
-\hline
-
-$590^3$ & 4096 (2x2048) & 433.0 & 55,494 & 4.92e-7 & 80.4 & 1,091(9,545) & 7.64e-08 & 5.39 \\
-\hline
-$743^3$ & 8192 (2x4096) & 704.4 & 87,822 & 4.80e-07 & 110.2 & 1,401(12,379) & 1.11e-07 & 6.39 \\
-\hline
-$743^3$ & 8192 (4x2048) & 704.4 & 87,822 & 4.80e-07 & 139.8 & 1,891(15,960) & 1.60e-07& 5.03 \\
-\hline
-
-\end{tabular}
-\caption{Results with preconditioner}
-\label{tab2}
-\end{center}
-\end{table}
+From these experiments, it can be observed that the multisplitting version is
+always faster than the GMRES version. The acceleration gain of the
+multisplitting version is between 4 and 6. It can be noticed that the number of
+iteration is drastically reduced with the multisplitting version even it is not
+neglectable.
 \section{Conclusion and perspectives}
--
2.39.5