X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/kahina_paper1.git/blobdiff_plain/6c18d76246cbf2749bdea5aa9b6f21090f3394f9..19176badff9356052f1475e4fb43863027140bc2:/paper.tex?ds=inline

diff --git a/paper.tex b/paper.tex
index 91bdab7..a72d177 100644
--- a/paper.tex
+++ b/paper.tex
@@ -321,6 +321,7 @@ Applying this solution for the Aberth method we obtain the
 iteration function with logarithm:
 %%$$ \exp \bigl( \ln(p(z)_{k})-ln(\ln(p(z)_{k}^{'}))- \ln(1- \exp(\ln(p(z)_{k})-ln(\ln(p(z)_{k}^{'})+\ln\sum_{i\neq j}^{n}\frac{1}{z_{k}-z_{j}})$$
 \begin{equation}
+\label{Log_H2}
 H_{i}(z)=z_{i}^{k}-\exp \left(\ln \left(
 p(z_{k})\right)-\ln\left(p(z_{k}^{'})\right)- \ln
 \left(1-Q(z_{k})\right)\right),
@@ -329,12 +330,17 @@ p(z_{k})\right)-\ln\left(p(z_{k}^{'})\right)- \ln
 
 where:
 \begin{equation}
+\label{Log_H1}
 Q(z_{k})=\exp\left( \ln (p(z_{k}))-\ln(p(z_{k}^{'}))+\ln \left(
 \sum_{k\neq j}^{n}\frac{1}{z_{k}-z_{j}}\right)\right).
 \end{equation}
 
 This solution is applied when the root lies outside the unit circle, represented by the radius $R$ evaluated as:
-$$R = \exp( \log(DBL\_MAX) / (2*n) )$$ where $DBL\_MAX$ stands for the maximum representable double value.
+\begin{equation}
+\label{R}
+R = \exp( \log(DBL\_MAX) / (2*n) )
+\end{equation}
+ where $DBL\_MAX$ stands for the maximum representable double value.
 
 \section{The implementation of simultaneous methods in a parallel computer}
 \label{secStateofArt}
@@ -642,7 +648,7 @@ We report the execution times of the Ehrlisch-Aberth method implemented on one c
 
 \subsubsection{Influence of the number of threads on the execution times of different polynomials (sparse and full)}
 
-It is also interesting to see the influence of the number of threads per block on the execution time.
For that, we notice that the maximum number of threads per block for the Nvidia Tesla K40c GPU is 1024, so we varied the number of threads per block from 8 to 1024.we took into account the execution time for both sparse and full polynomials of size 50000 and 500000 degrees.
+It is also interesting to see the influence of the number of threads per block on the execution time. Since the maximum number of threads per block for the Nvidia Tesla K40c GPU is 1024, we varied the number of threads per block from 8 to 1024. We measured the execution time for both sparse and full polynomials of degree 50000 and 500000.
 
 \begin{figure}[H]
 \centering
@@ -651,27 +657,39 @@ It is also interesting to see the influence of the number of threads per block o
 \label{fig:01}
 \end{figure}
 
-The figure 2 show that, the best execution time for both sparse and full polynomial are given while the threads number varies between 64 and 256 threads per bloc. We notice that with small polynomials the number of threads per block is 64, Whereas, the large polynomials is 256. However,In the following experiments we specify that the number of thread by block is 256.
+Figure 2 shows that the best execution times for both sparse and full polynomials are obtained when the number of threads per block is between 64 and 256. We notice that for small polynomials the best number of threads per block is 64, whereas for large polynomials it is 256. In the following experiments, we set the number of threads per block to 256.
 
 \subsubsection{The impact of exp-log solution to compute very high degrees of polynomial}
+In this experiment we report the performance of the exp-log solution described in~\ref{sec2} for polynomials of very high degree.
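As a rough illustration of the exp-log idea referred to above, the following Python sketch (not taken from the paper's CUDA implementation) evaluates the radius $R$ and compares one classical Ehrlisch-Aberth correction with the same correction written through logarithms and exponentials, following the form of Eq.~\ref{Log_H1} and Eq.~\ref{Log_H2}. The function names `overflow_radius`, `horner`, `aberth_step_classical` and `aberth_step_log_exp` are hypothetical, and `sys.float_info.max` is assumed to play the role of $DBL\_MAX$. The sketch still evaluates $p$ and $p'$ directly with Horner's rule, so it only checks that the two forms agree on a small polynomial; it does not reproduce how the GPU kernel avoids overflow.

```python
import cmath
import math
import sys


def overflow_radius(n):
    """Radius R = exp(log(DBL_MAX) / (2 * n)) for a polynomial of degree n.

    Assumption: sys.float_info.max stands in for DBL_MAX.
    """
    return math.exp(math.log(sys.float_info.max) / (2 * n))


def horner(coeffs, z):
    """Return (p(z), p'(z)); coeffs are ordered from highest degree down."""
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def aberth_step_classical(coeffs, roots, i):
    """One classical Ehrlisch-Aberth update of roots[i]."""
    z = roots[i]
    p, dp = horner(coeffs, z)
    s = sum(1 / (z - zj) for j, zj in enumerate(roots) if j != i)
    newton = p / dp
    return z - newton / (1 - newton * s)


def aberth_step_log_exp(coeffs, roots, i):
    """The same update written with logarithms and exponentials.

    Q = exp(ln p - ln p' + ln sum), H = z - exp(ln p - ln p' - ln(1 - Q)).
    Assumes p(z), p'(z), the sum and 1 - Q are all non-zero.
    """
    z = roots[i]
    p, dp = horner(coeffs, z)
    s = sum(1 / (z - zj) for j, zj in enumerate(roots) if j != i)
    log_newton = cmath.log(p) - cmath.log(dp)
    q = cmath.exp(log_newton + cmath.log(s))
    return z - cmath.exp(log_newton - cmath.log(1 - q))


if __name__ == "__main__":
    coeffs = [1, 0, 0, -1]  # p(z) = z^3 - 1
    roots = [1.2 + 0.1j, -0.4 + 0.9j, -0.6 - 0.8j]  # rough starting guesses
    for i in range(len(roots)):
        print(aberth_step_classical(coeffs, roots, i),
              aberth_step_log_exp(coeffs, roots, i))
    print(overflow_radius(5000))
```

For a degree of 5000, `overflow_radius(5000)` is about 1.07, which suggests why roots only slightly outside the unit circle can already cause trouble at high degree.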
 
 \begin{figure}[H]
 \centering
 \includegraphics[width=0.8\textwidth]{figures/log_exp}
 \caption{The impact of exp-log solution to compute very high degrees of polynomial.}
 \label{fig:01}
 \end{figure}
 
-
-\subsubsection{A comparative study between Aberth and Durand-kerner algorithm}
+
+Figure 3 compares the execution time of the Ehrlisch-Aberth algorithm with and without the exp-log solution, for full polynomials. The execution times of the two versions are the same while the polynomial degree is below 4500. Beyond that, the classical version of the Ehrlisch-Aberth algorithm (without the exp-log solution) stops converging and cannot solve polynomials of degree above 4500, whereas the new version (with the exp-log solution) can solve full polynomials of degree exceeding 100,000.
+
+In fact, when the modulus of the roots is larger than $R$ given in~\ref{R}, the computation exceeds the largest value representable in floating point and the iterative function given in~\ref{eq:Aberth-H-GS} cannot be evaluated to obtain the roots, which explains the divergence of the classical Ehrlisch-Aberth algorithm. However, the exp-log solution given in~\ref{sec2} takes the floating-point limit into account by using the iterative function in (Eq.~\ref{Log_H1}, Eq.~\ref{Log_H2}) and allows polynomials of very high degree to be solved.
+
+%we report the performances of the exp.log for the Ehrlisch-Aberth algorithm for solving very high degree of polynomial.
+
+
+\subsubsection{A comparative study between the Ehrlisch-Aberth and Durand-Kerner algorithms}
+In this part, we compare the two simultaneous methods, Ehrlisch-Aberth and Durand-Kerner, on a parallel computer using a GPU. We took into account the execution time, the number of iterations and the polynomial size,
for both sparse and full polynomials.
+
 
 \begin{figure}[H]
 \centering
 \includegraphics[width=0.8\textwidth]{figures/EA_DK}
-\caption{The execution time of Ehrlisch-Aberth versus Durand-Kerner algorithm}
+\caption{The execution time of Ehrlisch-Aberth versus Durand-Kerner algorithm on GPU}
 \label{fig:01}
 \end{figure}
 
+This figure shows the execution time of both algorithms, EA and DK, for sparse polynomials with degrees ranging from 1000 to 1000000. We can see that the Ehrlisch-Aberth algorithm is faster than the Durand-Kerner algorithm, by a factor of 25 on average. When the polynomial degree exceeds 500000, the execution time of EA is of the order of 100 whereas that of DK is of the order of 1000. %with double precision not exceed $10^{-5}$.
+
 \begin{figure}[H]
 \centering
 \includegraphics[width=0.8\textwidth]{figures/EA_DK_nbr}