X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/kahina_paper2.git/blobdiff_plain/a5593293fa55ad05530b93cc0550f7c0d9a9545b..c4fa5c419203fe392a24d237e1723e373ba37073:/paper.tex?ds=sidebyside diff --git a/paper.tex b/paper.tex index 7d4fe8a..308e42a 100644 --- a/paper.tex +++ b/paper.tex @@ -75,28 +75,35 @@ where $\{\alpha_i\}_{0\leq i\leq n}$ are complex coefficients and $n$ is a high Most of the numerical methods that deal with the polynomial root-finding problem are simultaneous methods, \textit{i.e.} the iterative methods to find simultaneous approximations of the $n$ polynomial roots. These methods start from the initial approximations of all $n$ polynomial roots and give a sequence of approximations that converge to the roots of the polynomial. Two examples of well-known simultaneous methods for root-finding problem of polynomials are Durand-Kerner method~\cite{Durand60,Kerner66} and Ehrlich-Aberth method~\cite{Ehrlich67,Aberth73}. -The main problem of the simultaneous methods is that the necessary -time needed for the convergence increases with the increasing of the -polynomial's degree. Many authors have treated the problem of -implementing simultaneous methods in -parallel. Freeman~\cite{Freeman89} implemented and compared -Durand-Kerner method, Ehrlich-Aberth method and another method of the -fourth order of convergence proposed by Farmer and -Loizou~\cite{Loizou83} on a 8-processor linear chain, for polynomials -of degree up-to 8. The method of Farmer and Loizou~\cite{Loizou83} -often diverges, but the first two methods (Durand-Kerner and -Ehrlich-Aberth methods) have a speed-up equals to 5.5. Later, Freeman -and Bane~\cite{Freemanall90} considered asynchronous algorithms in -which each processor continues to update its approximations even -though the latest values of other approximations $z^{k}_{i}$ have not -been received from the other processors, in contrast with synchronous -algorithms where it would wait those values before making a new -iteration. Couturier and al.~\cite{cs01:nj} proposed two methods -of parallelization for a shared memory architecture with OpenMP and -for a distributed memory one with MPI. They are able to compute the -roots of sparse polynomials of degree 10,000. The authors showed an interesting -speedup that is 20 times as fast as the sequential implementation. +The convergence time of simultaneous methods drastically increases with the increasing of the polynomial's degree. the great challenges with simultaneous methods is the primordial need to parallelize it and will improve the convergence time. Many authors have treated to +implement simultaneous methods in parallel. +Many authors have treated to implement simultaneous methods in parallel(~\cite{Freeman89},~\cite{Loizou83}, ~\cite{Freemanall90}, ~\cite{Raphaelall01},~\cite{Couturier02}), using several paradigms of parallelization (synchronous or asynchronous calculus, mechanism of shared or distributed memory,...). +%but they can not able to solve very high polynomial's degree exceed to 100,000. +They are able to solve only polynomial's degree not exceed 20,000. + +%The main problem of the simultaneous methods is that the necessary +%time needed for the convergence increases with the increasing of the +%polynomial's degree. Many authors have treated the problem of +%implementing simultaneous methods in +%parallel. Freeman~\cite{Freeman89} implemented and compared +%Durand-Kerner method, Ehrlich-Aberth method and another method of the +%fourth order of convergence proposed by Farmer and +%Loizou~\cite{Loizou83} on a 8-processor linear chain, for polynomials +%of degree up-to 8. The method of Farmer and Loizou~\cite{Loizou83} +%often diverges, but the first two methods (Durand-Kerner and +%Ehrlich-Aberth methods) have a speed-up equals to 5.5. Later, Freeman +%and Bane~\cite{Freemanall90} considered asynchronous algorithms in +%which each processor continues to update its approximations even +%though the latest values of other approximations $z^{k}_{i}$ have not +%been received from the other processors, in contrast with synchronous +%algorithms where it would wait those values before making a new +%iteration. Couturier and al.~\cite{cs01:nj} proposed two methods +%of parallelization for a shared memory architecture with OpenMP and +%for a distributed memory one with MPI. They are able to compute the +%roots of sparse polynomials of degree 10,000. The authors showed an interesting +%speedup that is 20 times as fast as the sequential implementation. +\KG{je viens de modifier le paragraphe ci-dessus pour eviter des similarité avec d'autre article, merci de le relire....} Very few work had been performed since then until the appearing of the Compute Unified Device Architecture (CUDA)~\cite{CUDA15}, a parallel computing platform and a programming model invented by NVIDIA. The computing power of GPUs (Graphics Processing Units) has exceeded that of traditional processors CPUs. However, CUDA adopts a totally new computing architecture to use the hardware resources provided by the GPU in order to offer a stronger computing ability to the massive data computing. Ghidouche et al.~\cite{Kahinall14} proposed an implementation of the Durand-Kerner method on a single GPU. Their main results showed that a parallel CUDA implementation is about 10 times faster than the sequential implementation on a single CPU for sparse polynomials of degree 48,000. %\LZK{Y a pas d'autres travaux pour la résolution de polynômes sur GPUs?}