1 \documentclass[11pt,a4paper]{article}
\r
2 \usepackage[latin1]{inputenc}
\r
4 \usepackage{amsfonts}
\r
7 \title{Paper1_kahina}
\r
9 \section{Root finding problem}
\r
10 we consider a polynomial of degree \textit{n} having coefficients
\r
11 in the complex \textit{C} and zeros $\alpha
\r
12 _{i},\textit{i=1,...,n}$. \\
\r
14 {\Large$p(x)=\sum{a_{i}x^{i}}=a_{n}\prod(x-\alpha_{i}),
\r
18 the root finding problem consist to find
\r
19 all n root of \textit{p(x)}. the problem of finding a root is
\r
20 equivalent to the problem of finding a fixed-point. To see this
\r
21 consider the fixed-point problem of finding the n-dimensional
\r
26 where $g : C^{n}\longrightarrow C^{n}$. Note that we can easily
\r
27 rewrite this fixed-point problem as a root-finding problem by
\r
28 setting $f (x) = x-g(x)$ and likewise we can recast the
\r
29 root-finding problem into a fixed-point problem by setting
\r
33 Often it will not be possible to solve such nonlinear equation
\r
34 root-finding problems analytically. When this occurs we turn to
\r
35 numerical methods to approximate the solution. Generally speaking,
\r
36 algorithms for solving problems numerically can be divided into
\r
37 two main groups: direct methods and iterative methods.
\r
39 Direct methods exist only for $n \leqslant4$,solved in closed form by G. Cardano
\r
40 in the mid-16th century. However, N.H. Abel in the early 19th
\r
41 century showed that polynomials of degree five or more could not
\r
42 be solved by directs methods. Since then researchers have
\r
43 concentrated on numerical (iterative) methods such as the famous
\r
44 Newton s method, Bernoulli s method of the 18th, and Graeffe s.
\r
45 With the advent of electronic computers, different methods has
\r
46 been developed such as the Jenkins-Traub method, Larkin s method,
\r
47 Muller s method, and several methods for simultaneous
\r
48 approximation of all the roots, starting with the Durand-Kerner
\r
52 $ Z_{i}=Z_{i}-\frac{P(Z_{i})}{\prod_{i\neq j}(z_{i}-z_{j})} $
\r
55 This formula is mentioned for the first time from Weiestrass [12]
\r
56 as part of the fundamental theorem of Algebra and is rediscovered
\r
57 from Ilieff~\cite{Ilief50} [2], Docev [3], Durand [4], Kerner [5].
\r
58 Another method discovered from Borsch-Supan [6] and also described
\r
59 and brought in the following form from Ehrlich [7] and
\r
60 Aberth~\cite{Aberth73}.
\r
63 $ Z_{i}=Z_{i}-\frac{1}{{\frac {P'(Z_{i})}
\r
64 {P(Z_{i})}}-{\sum_{i\neq j}(z_{i}-z_{j})}} $
\r
67 Aberth, Ehrlich and Farmer-Loizou [10] have proved that the above
\r
68 method has cubic order of convergence for simple roots.
\r
71 Iterative methods raise several problem when implemented e.g.
\r
72 specific sizes of numbers must be used to deal with this
\r
73 difficulty.Moreover,the convergence time of iterative methods
\r
74 drastically increase like the degrees of high polynomials. The
\r
75 parallelization of these algorithms will improve the convergence
\r
78 Many authors have treated the problem of parallelization of
\r
79 simultaneous methods. Freeman [13] has tested the DK method, EA
\r
80 method and another method of the fourth order proposed from Farmer
\r
81 and Loizou [10],on a 8- processor linear chain, for polynomial of
\r
82 degree up to 8. The third method often diverges, but the first two
\r
83 methods have speed-up 5.5 (speed-up=(Time on one processor)/(Time
\r
84 on p processors)). Later Freeman and Bane [14] consider
\r
85 asynchronous algorithms, in which each processor continues to
\r
86 update its approximations even although the latest values of other
\r
87 $z_i((k))$ have not received from the other processors, in
\r
88 difference with the synchronous version where it would wait. in
\r
89 [15]proposed two methods of parallelization for architecture with
\r
90 shared memory and distributed memory,it able to compute the root
\r
91 of polynomial degree 10000 on 430 s with only 8 pc and 2
\r
92 communications per iteration. Compare to the sequential it take
\r
93 3300 s to obtain the same results.
\r
95 After this few works discuses this problem until the apparition of
\r
96 the Compute Unified Device Architecture (CUDA) [19],a parallel
\r
97 computing platform and a programming model invented by NVIDIA. the
\r
98 computing ability of GPU has exceeded the counterpart of CPU. It
\r
99 is a waste of resource to be just a graphics card for GPU. CUDA
\r
100 adopts a totally new computing architecture to use the hardware
\r
101 resources provided by GPU in order to offer a stronger computing
\r
102 ability to the massive data computing.
\r
105 Indeed [16]proposed the implementation of the Durand-Kerner method
\r
106 on GPU (Graphics Processing Unit). The main result prove that a
\r
107 parallel implementation is 10 times as fast as the sequential
\r
108 implementation on a single CPU for high degree polynomials that is
\r
109 greater than about 48000.
\r
111 The mean part of our work is to implement the Aberth method on GPU
\r
112 and compare it with the Durand Kerner
\r
113 implementation.................To be continued..................
\r
116 \section{Aberth method and difficulties}
\r
117 A cubically convergent iteration method for finding zeros of
\r
118 polynomials was proposed by O.Aberth[?].The Aberth method is a
\r
119 purely algebraic derivation.To illustrate the derivation, we let
\r
120 $w_{i}(z)$ be the product of linear factor $ w_{i}(z)=\prod_{j=1,j
\r
121 \neq i}^{n} (z-x_{j})$
\r
123 and rational function $R_{i}(z)$ be the correction term of
\r
124 Weistrass method (?)
\r
125 $$R_{i}(z)=\dfrac{p(z)}{w_{i}(Z)} , i=1,2,...,n. $$
\r
127 Differentiating the rational function $R_{i}(z)$ and applying the
\r
128 Newton method, we have
\r
129 $$\dfrac{R_{i}(z)}{R_{i}^{'}(z)}= \dfrac{p(z)}{p^{'}(z)-p(z)\dfrac{w_{i}(z)}{w_{i}^{'}(z)}}= \dfrac{p(z)}{p^{'}(z)-p(z) \sum _{j=1,j \neq i}^{n}\dfrac{1}{z-x_{i}}}, i=1,2,...,n $$
\r
130 Substituting $x_{j}$ for z we obtain the Aberth iteration method
\r
132 Let present the means stages of Aberth's method.
\r
134 \subsection{Polynomials Initialization}
\r
135 The initialization of polynomial P(z) with complex coefficients
\r
137 $$ p(z)=\sum{a_{i}z^{n-i}}. where a_{n} \neq 0,a_{0}=1, a_{i}\subset C $$
\r
140 \subsection{Vector $Z^{0)}$ Initialization}
\r
142 The choice of the initial points $z^{(0)}_{i} , i = 1, . . . , n,$
\r
143 from which starting the iteration (2) or (3), is rather delicate
\r
144 since the number of steps needed by the iterative method to reach
\r
145 a given approximation strongly depends on it. In [1] the Aberth
\r
146 iteration is started by selecting n equispaced points on a circle
\r
147 of center 0 and radius r, where r is an upper bound to the moduli
\r
148 of the zeros. After[18] performs this choice by selecting complex
\r
149 numbers along different circles and relies on the result of [19].
\r
151 $$\sigma_{0}=\frac{u+v}{2};u=\frac{\sum_{i=1}^{n}u_{i}}{n.max_{i=1}^{n}u_{i}}; v=\frac{\sum_{i=0}^{n-1}v_{i}}{n.min_{i=0}^{n-1}v_{i}};u_{i}=2.|a_{i}|^{\frac{1}{i}}; v_{i}=\frac{|\frac{a_{n}}{a_{i}}|^{\frac{1}{n-i}}}{2} $$
\r
153 \subsection{Iterative Function Hi}
\r
154 The operator used with Aberth's method is corresponding to the
\r
155 following equation which will enable the convergence towards
\r
156 polynomial solutions, provided all the roots are distinct.
\r
158 $$ H_{i}(z)=z_{i}-\frac{1}{\frac{P^{'}(z_{i})}{P(z_{i})}-\sum_{j\neq i}{\frac{1}{z_{i}-z_{j}}}} $$
\r
160 \subsection{Convergence condition}
\r
161 determines the success of the termination. It consists in stopping
\r
162 the iterative function $H_{i}(z)$ when the are stable,the method
\r
163 converge sufficiently:
\r
164 $$ \forall i \in [1,n]; \frac{z_{i}^{(k)}-z_{i}^{(k-1)}}{z_{i}^{(k)}}< \xi$$
\r
166 \section{Difficulties and amelioration}
\r
167 the Aberth method implementation suffer of overflow problems. This
\r
168 situation occurs, for instance, in the case where a polynomial
\r
169 having positive coefficients and large degree is computed at a
\r
170 point $\xi$ where $|\xi| > 1$.Indeed the limited number in the
\r
171 mantissa of floating takings the computation of P(z) wrong when z
\r
172 is large. for example $(10^{50}) +1+ (- 10_{50})$ will give result
\r
173 0 instead of 1 in reality.consequently we can't compute the roots
\r
174 for large polynomial's degree. This problem was discuss in [17]
\r
175 for the Durand-Kerner method, the authors propose to use the
\r
176 logratihm and the exponential of a complex:
\r
178 $$ \forall(x,y)\in R^{*2}; \ln (x+i.y)=\ln(x^{2}+y^{2}) 2+i.\arcsin(y\sqrt{x^{2}+y^{2}})_{\left] -\pi, \pi\right[ } $$
\r
179 $$ \forall(x,y)\in R^{*2}; \exp(x+i.y)= \exp(x).\exp(i.y)$$
\r
180 $$ =\exp(x).\cos(y)+i.\exp(x).\sin(y)$$
\r
183 The application of logarithm can replace any multiplications and
\r
184 divisions with additions and subtractions; consequently it
\r
185 manipulates lower absolute values and can be compute the roots for
\r
186 large polynomial's degree exceed 1000[17].
\r
188 Applying this solution for the Aberth method we obtain the
\r
189 iteration function with logarithm:
\r
190 %%$$ \exp \bigl( \ln(p(z)_{k})-ln(\ln(p(z)_{k}^{'}))- \ln(1- \exp(\ln(p(z)_{k})-ln(\ln(p(z)_{k}^{'})+\ln\sum_{i\neq j}^{n}\frac{1}{z_{k}-z_{j}})$$
\r
192 $$ H_{i}(z)=z_{i}^{k}-\exp \left(\ln \left( p(z_{k})\right)-\ln\left(p(z_{k}^{'})\right)- \ln\left(1- \exp\left( \ln (p(z_{k}))-\ln(p(z_{k}^{'}))+\ln \left( \sum_{k\neq j}^{n}\frac{1}{z_{k}-z_{j}}\right)\right) \right) \right)$$
\r
196 this solution is applying when it is necessary
\r
198 \section{The implementation of simultaneous methods in a parallel computer}
\r
201 \bibliographystyle{plain}
\r
202 \bibliography{biblio}
\r
203 %% \begin{thebibliography}{2}
\r
205 %% \bibitem [1] {1} O. Aberth, Iteration Methods for Finding
\r
208 %% all Zeros of a Polynomial Simultaneously, Math. Comput. 27, 122
\r
209 %% (1973) 339
\96344.
\r
211 %% \bibitem [2] {2} Ilieff, L. (1948-50), On the approximations of Newton, Annual
\r
212 %% Sofia Univ. 46, 167-171.
\r
214 %% \bibitem [3] {3} Docev, K. (1962), An alternative method of Newton for
\r
215 %% simultaneous calculation of all the roots of a given algebraic
\r
216 %% equation, Phys. Math. J., Bulg. Acad. Sci. 5, 136-139.
\r
218 %% \bibitem [4]{4} Durand, E. (1960), Solution Numerique des Equations
\r
219 %% Algebriques, Vol. 1, Equations du Type F(x)=0, Racines d'une
\r
220 %% Polynome. Masson, Paris.
\r
222 %% \bibitem [4] {4} Aberth, O. (1973), Iterative methods for finding all zeros of
\r
223 %% a polynomial simultaneously, Math. Comp. 27, 339-344.
\r
225 %% \bibitem [5] {5} Kerner, I.O. (1966), Ein Gesamtschritteverfahren zur
\r
226 %% Berechnung der Nullstellen von Polynomen, Numer. Math. 8, 290-294.
\r
228 %% \bibitem [6]{6} Borch-Supan, W. (1963), A posteriori error for the zeros of
\r
229 %% polynomials, Numer. Math. 5, 380-398.
\r
231 %% \bibitem [7] {7} Ehrlich, L. W. (1967), A modified Newton method for
\r
232 %% polynomials, Comm. Ass. Comput. Mach. 10, 107-108.
\r
236 %% \bibitem [10] {10}Loizon, G. (1983), Higher-order iteration functions for
\r
237 %% simultaneously approximating polynomial zeros, Intern. J. Computer
\r
238 %% Math. 14, 45-58.
\r
240 %% \bibitem [11]{11} E. Durand, Solutions numŽeriques des Žequations algŽebriques,
\r
241 %% Tome 1: Equations du type F(X) = 0; Racines d
\92un polyn
\88ome,
\r
242 %% Masson, Paris 1960.
\r
244 %% \bibitem [12] {12} Weierstrass, K. (1903), Neuer Beweis des Satzes, dass
\r
245 %% jede ganze rationale function einer veranderlichen dagestellt
\r
246 %% werden kann als ein product aus linearen functionen derselben
\r
247 %% veranderlichen, Ges. Werke 3, 251-269.
\r
248 %% \bibitem [13] {13} Freeman, T. L. (1989), Calculating polynomial zeros on a
\r
249 %% local memory parallel computer, Parallel Computing 12, 351-358.
\r
251 %% \bibitem [14] {14} Freeman, T. L., Brankin, R. K. (1990), Asynchronous
\r
252 %% polynomial zero-finding algorithms, Parallel Computing 17,
\r
255 %% \bibitem [15] {15} Raphaël,C. François,S. (2001), Extraction de racines dans des
\r
256 %% polynômes creux de degré élevé. RSRCP (Réseaux et Systèmes
\r
257 %% Répartis, Calculateurs Parallèles), Numéro thématique :
\r
258 %% Algorithmes itératifs parallèles et distribués, 13(1):67--81.
\r
260 %% \bibitem [16]{16} Kahina, G. Raphaël, C. Abderrahmane, S. A
\r
261 %% parallel implementation of the Durand-Kerner algorithm for
\r
262 %% polynomial root-finding on GPU. In INDS 2014, Int. Conf. on
\r
263 %% advanced Networking, Distributed Systems and Applications, Bejaia,
\r
264 %% Algeria, pages 53--57, June 2014. IEEE
\r
266 %% \bibitem [17] {17} Karim Rhofir, François Spies, and Jean-Claude Miellou.
\r
267 %%Perfectionnements de la méthode asynchrone de Durand-Kerner pour
\r
268 %%les polynômes complexes. Calculateurs Parallèles, 10(4):449-- 458,
\r
270 %% \bibitem [18] {18} Bini, D. A. Numerical computation of polynomial zeros by
\r
271 %%means of Aberth
\92s method. Numerical Algorithms 13 (1996), 179
\96\r
273 %% \bibitem [19] {19} A. Ostrowski, On a Theorem by J.L. Walsh Concerning the Moduli of Roots of Algebraic Equations,
\r
274 %%Bull. A.M.S., 47 (1941) 742
\96746.
\r
275 %% \end{thebibliography}
\r