From: Kahina
Date: Mon, 11 Jan 2016 19:39:46 +0000 (+0100)
Subject: A few updates
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/kahina_paper2.git/commitdiff_plain/369ec1155d40a2375303ba29bfb493136cf7b944?hp=d76a512142871f86c35be3ebcea8c16db48a8e4a

A few updates
---

diff --git a/paper.tex b/paper.tex
index ea50906..9c918b8 100644
--- a/paper.tex
+++ b/paper.tex
@@ -597,7 +597,7 @@ CUDA (Compute Unified Device Architecture) is a parallel computing architecture
 %Here we give a second form of the iterative function used by the Ehrlich-Aberth method:
 %\begin{equation}
-%\label{Eq:EA1}
+%\label{Eq:EA-1}
 %EA: z^{k+1}_{i}=z_{i}^{k}-\frac{\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}}
 %{1-\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}\sum_{j=1,j\neq i}^{j=n}{\frac{1}{(z_{i}^{k}-z_{j}^{k})}}}, %i=1,. . . .,n
 %\end{equation}
@@ -606,7 +606,7 @@ CUDA (Compute Unified Device Architecture) is a parallel computing architecture
 %The convergence condition determines the termination of the algorithm. It consists in stopping the %iterative function when the roots are sufficiently stable. We consider that the method converges %sufficiently when:
 %\begin{equation}
-%\label{eq:Aberth-Conv-Cond}
+%\label{eq:AAberth-Conv-Cond}
 %\forall i \in [1,n];\vert\frac{z_{i}^{k}-z_{i}^{k-1}}{z_{i}^{k}}\vert<\xi
 %\end{equation}
@@ -654,9 +654,36 @@ R = exp(log(DBL\_MAX)/(2*n) );
 %R = \exp( \log(DBL\_MAX) / (2*n) )
 %\end{equation}
 where \verb=DBL_MAX= stands for the maximum representable \verb=double= value.
+
+To take into account the limited range of floating-point numbers, we propose to modify the iterative function and compute the logarithm of:
+
+\begin{equation}
+EA: z^{k+1}_{i}=z_{i}^{k}-\frac{\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}}
+{1-\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}\sum_{j=1,j\neq i}^{j=n}{\frac{1}{(z_{i}^{k}-z_{j}^{k})}}}, \quad i=1,\ldots,n
+\end{equation}
+
+This method makes it possible to go beyond polynomials of degree 100,000 and to reach degrees above 1,000,000.
+For that purpose, it is necessary to use the logarithm and the exponential of a complex number. The iterative function of the Ehrlich-Aberth method with exponential and logarithm is given as follows:
+
+\begin{equation}
+\label{Log_H2}
+EA.EL: z^{k+1}_{i}=z_{i}^{k}-\exp \left(\ln \left(
+p(z_{i}^{k})\right)-\ln\left(p'(z^{k}_{i})\right)- \ln\left(1-Q(z^{k}_{i})\right)\right),
+\end{equation}
+
+where:
+
+\begin{equation}
+\label{Log_H1}
+Q(z^{k}_{i})=\exp\left( \ln (p(z^{k}_{i}))-\ln(p'(z^{k}_{i}))+\ln \left(
+\sum_{j=1,j\neq i}^{n}\frac{1}{z^{k}_{i}-z^{k}_{j}}\right)\right), \quad i=1,\ldots,n.
+\end{equation}
+
+
+%We propose to use the logarithm and the exponential of a complex in order to compute the power at a high exponent.
+Using the logarithm and exponential operators, multiplications and divisions can be replaced with additions and subtractions. Consequently, the computations manipulate values of smaller magnitude, and the roots of polynomials of large degree can be computed successfully~\cite{Karimall98}.

-This problem was discussed earlier in~\cite{Karimall98} for the Durand-Kerner method. The authors
-propose to use the logarithm and the exponential of a complex in order to compute the power at a high exponent. Using the logarithm and the exponential operators, we can replace any multiplications and divisions with additions and subtractions. Consequently, computations manipulate lower absolute values and the roots for large polynomial degrees can be looked for successfully~\cite{Karimall98}.
+%This problem was discussed earlier in~\cite{Karimall98} for the Durand-Kerner method. The authors
+%propose to use the logarithm and the exponential of a complex in order to compute the power at a high exponent. Using the logarithm and the exponential operators, we can replace any multiplications and divisions with additions and subtractions.
+%Consequently, computations manipulate lower absolute values and the roots for large polynomial degrees can be looked for successfully~\cite{Karimall98}.

 \subsection{Ehrlich-Aberth parallel implementation on CUDA}
 We introduced three paradigms of parallel programming. Our objective is to implement a polynomial root-finding algorithm on multiple GPUs. To this end, it is essential to know how to manage the CUDA contexts of different GPUs. A direct way to control the various GPUs is to use as many threads or processes as there are GPU devices. The GPU index can then be chosen based on the identifier of the OpenMP thread or the rank of the MPI process. Both approaches will be investigated.
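
As an illustration of the logarithmic form of the Ehrlich-Aberth update added in this patch (the equations labelled Log_H2 and Log_H1), the following is a minimal CPU sketch in Python, not the paper's CUDA code. The function names `horner` and `ea_step_log` are hypothetical and chosen only for this example. The sketch evaluates the polynomial with a plain Horner scheme in double precision, so it only shows that the log/exp update is algebraically the same step as the standard one; by itself it does not extend the representable range for very high degrees.

```python
import cmath


def horner(coeffs, z):
    """Return (p(z), p'(z)); coeffs are ordered from highest degree to constant."""
    p = complex(coeffs[0])
    dp = 0j
    for c in coeffs[1:]:
        dp = dp * z + p   # derivative accumulates the previous value of p
        p = p * z + c
    return p, dp


def ea_step_log(coeffs, roots):
    """One Jacobi-style Ehrlich-Aberth sweep using the log/exp form of the update.

    Assumption for this sketch: p'(z_i), the sum over j != i, and 1 - Q are all
    nonzero, since cmath.log raises ValueError on a zero argument.
    """
    new_roots = []
    for i, zi in enumerate(roots):
        p, dp = horner(coeffs, zi)
        if p == 0:
            # zi is already an exact root; ln(0) is undefined, so keep it.
            new_roots.append(zi)
            continue
        s = sum(1.0 / (zi - zj) for j, zj in enumerate(roots) if j != i)
        log_ratio = cmath.log(p) - cmath.log(dp)       # ln p - ln p'
        q = cmath.exp(log_ratio + cmath.log(s))        # Q(z_i)
        new_roots.append(zi - cmath.exp(log_ratio - cmath.log(1 - q)))
    return new_roots


if __name__ == "__main__":
    # p(z) = z^3 - 1, starting from three points on a circle of radius 2.
    coeffs = [1, 0, 0, -1]
    roots = [2 * cmath.exp(1j * (2 * cmath.pi * k / 3 + 0.4)) for k in range(3)]
    for _ in range(20):
        roots = ea_step_log(coeffs, roots)
    print(roots)
```

The branch chosen by the complex logarithm does not affect the result, because differences of multiples of $2\pi i$ vanish under the final exponential. In a range-extending implementation, $\ln p$ and $\ln p'$ would themselves have to be accumulated in logarithmic form rather than obtained from an already evaluated value of $p$.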