Couturier and al.~\cite{Raphaelall01} proposed two methods of parallelization for
a shared memory architecture and for distributed memory one. They were able to
compute the roots of sparse polynomials of degree 10,000 in 430 seconds with only 8
-personal computers and 2 communications per iteration. Comparing to the sequential implementation
-where it takes up to 3,300 seconds to obtain the same results, the authors show an interesting speedup.
+personal computers and 2 communications per iteration. Compared to sequential implementations
+where it takes up to 3,300 seconds to obtain the same results, the
+authors' work experiment show an interesting speedup.
-Very few works had been performed since this last work until the appearing of
+Few works have been conducted after those works until the appearance of
the Compute Unified Device Architecture (CUDA)~\cite{CUDA10}, a
parallel computing platform and a programming model invented by
NVIDIA. The computing power of GPUs (Graphics Processing Unit) has exceeded that of CPUs. However, CUDA adopts a totally new computing architecture to use the
\section{Ehrlich-Aberth method}
\label{sec1}
-A cubically convergent iteration method for finding zeros of
-polynomials was proposed by O. Aberth~\cite{Aberth73}. The Ehrlich-Aberth method contain 4 main steps, presented in the following.
+A cubically convergent iteration method to find zeros of
+polynomials was proposed by O. Aberth~\cite{Aberth73}. The
+Ehrlich-Aberth method contains 4 main steps, presented in what
+follows.
+
%The Aberth method is a purely algebraic derivation.
%To illustrate the derivation, we let $w_{i}(z)$ be the product of linear factors