X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/GMRES2stage.git/blobdiff_plain/eacf4c2eeca7315f0a4a7bc9dec99dc6778843d1..649421a46874291325940af659c9eb66d83411b4:/paper.tex
diff --git a/paper.tex b/paper.tex
index 4d59b93..f4dba97 100644
--- a/paper.tex
+++ b/paper.tex
@@ -1035,8 +1035,40 @@ the number of iterations. So, the overall benefit of using TSIRM is interesting.
 
 \end{table*}
 
-In Table~\ref{tab:04}, some experiments with example ex54 on the Curie architecture are reported.
-
+Table~\ref{tab:04} reports some experiments with example ex54 on the Curie
+architecture. For this application, we fixed $\alpha=0.6$. As can
+be seen in that table, the size of the problem has a strong influence on the
+number of iterations needed to reach convergence. That is why we have preferred to
+change the threshold: if we set it to $1e-3$, as with the previous application,
+only one iteration is necessary to reach convergence. So Table~\ref{tab:04}
+shows the results of different executions with different numbers of cores and
+different thresholds. As with the previous example, we can observe that TSIRM
+is faster than FGMRES. The ratio greatly depends on the number of iterations for
+FGMRES to reach the threshold. The greater the number of iterations needed to reach
+convergence, the better the ratio between our algorithm and FGMRES. This
+experiment is also a weak scaling one, with approximately $25,000$ components per
+core. It can also be observed that the difference between CGLS and LSQR is not
+significant. Both can perform well, but it does not seem possible to know in advance which
+one will be the best.
+
+Table~\ref{tab:05} shows a strong scaling experiment with example ex54 on the
+Curie architecture. In this case, the number of unknowns is fixed to
+$204,919,225$ and the number of cores ranges from $512$ to $8,192$ in powers
+of two. The threshold is fixed to $5e-5$ and only the $mg$ preconditioner has
+been tested. Here again we can see that TSIRM is faster than FGMRES. The efficiency
+of each algorithm is reported. It can be noticed that FGMRES is more efficient
+than TSIRM except with $8,192$ cores, and that its efficiency is greater than one
+whereas the efficiency of TSIRM is lower than one. Nevertheless, TSIRM
+with any version of the least-squares method is always faster than FGMRES. With
+$8,192$ cores, where the number of iterations becomes far larger for FGMRES, we
+can see that it is only slightly larger for TSIRM.
+
+In Figure~\ref{fig:02} we report the number of iterations per second for the
+experiments reported in Table~\ref{tab:05}. This figure highlights that the
+number of iterations per second is more or less the same for FGMRES and TSIRM,
+with a small advantage for FGMRES. This can be explained by the fact that, as
+previously mentioned, the iterations of the least-squares steps are not
+taken into account with TSIRM.
 
 \begin{table*}[htbp]
 \begin{center}