From: Kahina Date: Thu, 29 Oct 2015 12:54:04 +0000 (+0100) Subject: correction des equations X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/kahina_paper1.git/commitdiff_plain/6d04681fa87aadf0e6303a76c9c319fb79948d10 correction des equations --- diff --git a/paper.tex b/paper.tex index 0ff4207..9fee850 100644 --- a/paper.tex +++ b/paper.tex @@ -115,14 +115,14 @@ The root finding problem consists in finding the values of all the $n$ values of \end{equation} The problem of finding a root is equivalent to that of solving a fixed-point problem. To see this, consider the fixed-point problem of finding the $n$-dimensional -vector $x$ such that +vector $x$ such that : \begin{center} $x=g(x)$ \end{center} where $g : C^{n}\longrightarrow C^{n}$. Usually, we can easily rewrite this fixed-point problem as a root-finding problem by setting $f(x) = x-g(x)$ and likewise we can recast the -root-finding problem into a fixed-point problem by setting +root-finding problem into a fixed-point problem by setting : \begin{center} $g(x)= f(x)-x$. \end{center} @@ -147,10 +147,10 @@ approximation of all the roots, starting with the Durand-Kerner (DK) method: %%\begin{center} \begin{equation} - Z_i^{k+1}=Z_{i}^k-\frac{P(Z_i^k)}{\prod_{i\neq j}(Z_i^k-Z_j^k)} + z_i^{k+1}=z_{i}^k-\frac{P(z_i^k)}{\prod_{i\neq j}(z_i^k-z_j^k)} \end{equation} %%\end{center} -where $Z_i^k$ is the $i^{th}$ root of the polynomial $P$ at the +where $z_i^k$ is the $i^{th}$ root of the polynomial $P$ at the iteration $k$. @@ -164,11 +164,11 @@ in the following form by Ehrlich~\cite{Ehrlich67} and Aberth~\cite{Aberth73} uses a different iteration formula given as fellows : %%\begin{center} \begin{equation} - Z_i^{k+1}=Z_i^k-\frac{1}{{\frac {P'(Z_i^k)} {P(Z_i^k)}}-{\sum_{i\neq j}(Z_i^k-Z_j^k)}}. + z_i^{k+1}=z_i^k-\frac{1}{{\frac {P'(z_i^k)} {P(z_i^k)}}-{\sum_{i\neq j}\frac{1}{(z_i^k-z_j^k)}}}. \end{equation} %%\end{center} -where $P'(Z)$ is the polynomial derivative of $P$ evaluated in the -point $Z$. +where $P'(z)$ is the polynomial derivative of $P$ evaluated in the +point $z$. Aberth, Ehrlich and Farmer-Loizou~\cite{Loizon83} have proved that the Ehrlich-Aberth method (EA) has a cubic order of convergence for simple roots whereas the Durand-Kerner has a quadratic order of convergence. @@ -201,8 +201,7 @@ where it takes up to 3300 seconds to obtain the same results, the authors show a Very few works had been since this last work until the appearing of the Compute Unified Device Architecture (CUDA)~\cite{CUDA10}, a parallel computing platform and a programming model invented by -NVIDIA. The computing power of GPUs (Graphics Processing Unit) has exceeded that of -of CPUs. However, CUDA adopts a totally new computing architecture to use the +NVIDIA. The computing power of GPUs (Graphics Processing Unit) has exceeded that of CPUs. However, CUDA adopts a totally new computing architecture to use the hardware resources provided by GPU in order to offer a stronger computing ability to the massive data computing. @@ -214,9 +213,9 @@ the sequential implementation on a single CPU for high degree polynomials of about 48000. To our knowledge, it is the first time such high degree polynomials are numerically solved. -In this paper, we focus on the implementation of the Aberth method for -high degree polynomials on GPU. The paper is organised as fellows. Initially, we recall the Aberth method in Section.\ref{sec1}. Improvements for the Aberth method are proposed in Section.\ref{sec2}. Related work to the implementation of simultaneous methods using a parallel approach is presented in Section.\ref{secStateofArt}. -In Section.4 we propose a parallel implementation of the Aberth method on GPU and discuss it. Section 5 presents and investigates our implementation and experimental study results. Finally, Section 6 concludes this paper and gives some hints for future research directions in this topic. +In this paper, we focus on the implementation of the Ehrlich-Aberth method for +high degree polynomials on GPU. The paper is organized as fellows. Initially, we recall the Ehrlich-Aberth method in Section \ref{sec1}. Improvements for the Ehrlich-Aberth method are proposed in Section \ref{sec2}. Related work to the implementation of simultaneous methods using a parallel approach is presented in Section \ref{secStateofArt}. +In Section \ref{sec5} we propose a parallel implementation of the Ehrlich-Aberth method on GPU and discuss it. Section \ref{sec6} presents and investigates our implementation and experimental study results. Finally, Section\ref{sec7} 6 concludes this paper and gives some hints for future research directions in this topic. \section{The Sequential Aberth method} \label{sec1} @@ -240,10 +239,10 @@ Differentiating the rational function $R_{i}(z)$ and applying the Newton method, we have: \begin{equation} -\frac{R_{i}(z)}{R_{i}^{'}(z)}= \frac{p(z)}{p^{'}(z)-p(z)\frac{w_{i}(z)}{w_{i}^{'}(z)}}= \frac{p(z)}{p^{'}(z)-p(z) \sum _{j=1,j \neq i}^{n}\frac{1}{z-x_{i}}}, i=1,2,...,n +\frac{R_{i}(z)}{R_{i}^{'}(z)}= \frac{p(z)}{p^{'}(z)-p(z)\frac{w_{i}(z)}{w_{i}^{'}(z)}}= \frac{p(z)}{p^{'}(z)-p(z) \sum _{j=1,j \neq i}^{n}\frac{1}{z-x_{j}}}, i=1,2,...,n \end{equation} - -Substituting $x_{j}$ for z we obtain the Aberth iteration method. +where R_{i}^{'}(z)is the rational function derivative of F evaluated in the point z +Substituting $x_{j}$ for $z_{j}$ we obtain the Aberth iteration method. In the fellowing we present the main stages of the running of the Aberth method. @@ -264,7 +263,7 @@ The initial guess is very important since the number of steps needed by the iter a given approximation strongly depends on it. In~\cite{Aberth73} the Aberth iteration is started by selecting $n$ equi-spaced points on a circle of center 0 and radius r, where r is -an upper bound to the moduli of the zeros. Later, Bini et al.~\cite{Bini96} +an upper bound to the moduli of the zeros. Later, Bini and al.~\cite{Bini96} performed this choice by selecting complex numbers along different circles and relies on the result of~\cite{Ostrowski41}. @@ -281,14 +280,14 @@ u_{i}=2.|a_{i}|^{\frac{1}{i}}; v_{i}=\frac{|\frac{a_{n}}{a_{i}}|^{\frac{1}{n-i}}}{2}. \end{equation} -\subsection{Iterative Function $H_{i}$} +\subsection{Iterative Function $H_{i}(z^{k})$} The operator used by the Aberth method is corresponding to the following equation which will enable the convergence towards polynomial solutions, provided all the roots are distinct. \begin{equation} -H_{i}(z)=z_{i}-\frac{1}{\frac{p^{'}(z_{i})}{p(z_{i})}-\sum_{j\neq -i}{\frac{1}{z_{i}-z_{j}}}} +H_{i}(z^{k+1})=z_{i}^{k}-\frac{1}{\frac{p^{'}(z_{i}^{k})}{p(z_{i}^{k})}-\sum_{j=1,j\neq +i}^{j=n}{\frac{1}{z_{i}^{k}-z_{j}^{k}}}} \end{equation} \subsection{Convergence Condition} @@ -299,7 +298,7 @@ converges sufficiently when : \begin{equation} \label{eq:Aberth-Conv-Cond} \forall i \in -[1,n];\frac{z_{i}^{(k)}-z_{i}^{(k-1)}}{z_{i}^{(k)}}<\xi +[1,n];\frac{z_{i}^{k}-z_{i}^{k-1}}{z_{i}^{k}}<\xi \end{equation} @@ -332,22 +331,22 @@ propose to use the logarithm and the exponential of a complex in order to comput Using the logarithm (eq.~\ref{deflncomplex}) and the exponential (eq.~\ref{defexpcomplex}) operators, we can replace any multiplications and divisions with additions and subtractions. Consequently, computations manipulate lower absolute values and the roots for large polynomial's degrees can be looked for successfully~\cite{Karimall98}. -Applying this solution for the Aberth method we obtain the +Applying this solution for the Ehrlich-Aberth method we obtain the iteration function with logarithm: %%$$ \exp \bigl( \ln(p(z)_{k})-ln(\ln(p(z)_{k}^{'}))- \ln(1- \exp(\ln(p(z)_{k})-ln(\ln(p(z)_{k}^{'})+\ln\sum_{i\neq j}^{n}\frac{1}{z_{k}-z_{j}})$$ \begin{equation} \label{Log_H2} -H_{i}(z)=z_{i}^{k}-\exp \left(\ln \left( -p(z_{k})\right)-\ln\left(p(z_{k}^{'})\right)- \ln -\left(1-Q(z_{k})\right)\right), +H_{i}(z^{k+1})=z_{i}^{k}-\exp \left(\ln \left( +p(z_{i}^{k})\right)-\ln\left(p'(z^{k}_{i})\right)- \ln +\left(1-Q(z^{k}_{i})\right)\right), \end{equation} where: \begin{equation} \label{Log_H1} -Q(z_{k})=\exp\left( \ln (p(z_{k}))-\ln(p(z_{k}^{'}))+\ln \left( -\sum_{k\neq j}^{n}\frac{1}{z_{k}-z_{j}}\right)\right). +Q(z^{k}_{i})=\exp\left( \ln (p(z^{k}_{i}))-\ln(p'(z^{k}^{i}))+\ln \left( +\sum_{k\neq j}^{n}\frac{1}{z^{k}_{i}-z^{k}_{j}}\right)\right). \end{equation} This solution is applied when the root except the circle unit, represented by the radius $R$ evaluated as: @@ -466,7 +465,8 @@ provides two read-only memory spaces, the constant space and the texture space, which reside in external DRAM, and are accessed via read-only caches. -\subsection{ The implementation of Aberth method on GPU} +\section{ The implementation of Aberth method on GPU} +\label{sec5} %%\subsection{A CUDA implementation of the Aberth's method } %%\subsection{A GPU implementation of the Aberth's method } @@ -510,15 +510,15 @@ In this sequential algorithm, one CPU thread executes all the steps. Let us loo There exists two ways to execute the iterative function that we call a Jacobi one and a Gauss-Seidel one. With the Jacobi iteration, at iteration $k+1$ we need all the previous values $z^{(k)}_{i}$ to compute the new values $z^{(k+1)}_{i}$, that is : \begin{equation} -H(i,z^{k+1})=\frac{p(z^{(k)}_{i})}{p'(z^{(k)}_{i})-p(z^{(k)}_{i})\sum^{n}_{j=1 j\neq i}\frac{1}{z^{(k)}_{i}-z^{(k)}_{j}}}, i=1,...,n. +H_{i}(z^{k+1})=\frac{p(z^{k}_{i})}{p'(z^{k}_{i})-p(z^{k}_{i})\sum^{n}_{j=1 j\neq i}\frac{1}{z^{k}_{i}-z^{k}_{j}}}, i=1,...,n. \end{equation} With the Gauss-seidel iteration, we have: \begin{equation} \label{eq:Aberth-H-GS} -H(i,z^{k+1})=\frac{p(z^{(k)}_{i})}{p'(z^{(k)}_{i})-p(z^{(k)}_{i})(\sum^{i-1}_{j=1}\frac{1}{z^{(k)}_{i}-z^{(k+1)}_{j}}+\sum^{n}_{j=i+1}\frac{1}{z^{(k)}_{i}-z^{(k)}_{j}})}, i=1,...,n. +H_{i}(z^{k+1})=\frac{p(z^{k}_{i})}{p'(z^{k}_{i})-p(z^{k}_{i})(\sum^{i-1}_{j=1}\frac{1}{z^{k}_{i}-z^{k+1}_{j}}+\sum^{n}_{j=i+1}\frac{1}{z^{k}_{i}-z^{k}_{j}})}, i=1,...,n. \end{equation} - +%%Here a finiched my revision %% Using Equation.~\ref{eq:Aberth-H-GS} for the update sub-step of $H(i,z^{k+1})$, we expect the Gauss-Seidel iteration to converge more quickly because, just as its ancestor (for solving linear systems of equations), it uses the most fresh computed roots $z^{k+1}_{i}$. The $4^{th}$ step of the algorithm checks the convergence condition using Equation.~\ref{eq:Aberth-Conv-Cond}. @@ -609,7 +609,7 @@ The kernels terminate it computations when all the roots converge. Finally, the or from GPU memory to CPU memory \verb=(cudaMemcpyDeviceToHost))=. %%HIER END MY REVISIONS (SIDER) \section{Experimental study} - +\label{sec6} \subsection{Definition of the used polynomials } We study two categories of polynomials : the sparse polynomials and the full polynomials. \paragraph{A sparse polynomial}: is a polynomial for which only some coefficients are not null. We use in the following polonymial for which the roots are distributed on 2 distinct circles : @@ -731,7 +731,7 @@ This figure show the execution time of the both algorithm EA and DK with sparse \section{Conclusion and perspective} - +\label{sec7} \bibliography{mybibfile} \end{document}