\begin{equation}
\label{Eq:Hi}
EA2: z^{k+1}_{i}=z_{i}^{k}-\frac{\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}}
-{1-\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}\sum_{j=1,j\neq i}^{j=n}{\frac{1}{(z_{i}^{k}-z_{j}^{k})}}}, i=0,. . . .,n
+{1-\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}\sum_{j=1,j\neq i}^{j=n}{\frac{1}{(z_{i}^{k}-z_{j}^{k})}}}, i=1,. . . .,n
\end{equation}
It can be noticed that this equation is equivalent to Eq.~\ref{Eq:EA},
but we prefer the latter one because we can use it to improve the
finding algorithms. The first approach aims at reducing the total number of
iterations as by Miranker
~\cite{Mirankar68,Mirankar71}, Schedler~\cite{Schedler72} and
-Winogard~\cite{Winogard72}. The second approach aims at reducing the
+Winograd~\cite{Winogard72}. The second approach aims at reducing the
computation time per iteration, as reported
in~\cite{Benall68,Jana06,Janall99,Riceall06}.
in~\cite{Mirankar71} proposed two versions of parallel algorithm
for the Durand-Kerner method, and Ehrlich-Aberth method on a model of
Optoelectronic Transpose Interconnection System (OTIS).The
-algorithms are mapped on an OTIS-2D torus using N processors. This
-solution needs N processors to compute N roots, which is not
+algorithms are mapped on an OTIS-2D torus using $N$ processors. This
+solution needs $N$ processors to compute $N$ roots, which is not
practical for solving polynomials with large degrees.
%Until very recently, the literature did not mention implementations
%able to compute the roots of large degree polynomials (higher then
of a root finding method on GPUs, that of the Durand-Kerner method. The main result showed
that a parallel CUDA implementation is 10 times as fast as the
sequential implementation on a single CPU for high degree
-polynomials of 48000.
+polynomials of 48,000.
%In this paper we present a parallel implementation of Ehrlich-Aberth
%method on GPUs for sparse and full polynomials with high degree (up
%to $1,000,000$).
In order to implement the Ehrlich-Aberth method in CUDA, it is
possible to use the Jacobi scheme or the Gauss Seidel one. With the
Jacobi iteration, at iteration $k+1$ we need all the previous values
-$z^{(k)}_{i}$ to compute the new values $z^{(k+1)}_{i}$, that is :
+$z^{k}_{i}$ to compute the new values $z^{k+1}_{i}$, that is :
\begin{equation}
-EAJ: z^{k+1}_{i}=\frac{p(z^{k}_{i})}{p'(z^{k}_{i})-p(z^{k}_{i})\sum^{n}_{j=1 j\neq i}\frac{1}{z^{k}_{i}-z^{k}_{j}}}, i=1,...,n.
+EAJ: z^{k+1}_{i}=z_{i}^{k}-\frac{\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}}
+{1-\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}\sum_{j=1,j\neq i}^{j=n}{\frac{1}{(z_{i}^{k}-z_{j}^{k})}}}, i=1,. . . .,n.
\end{equation}
With the Gauss-Seidel iteration, we have:
+%\begin{equation}
+%\label{eq:Aberth-H-GS}
+%EAGS: z^{k+1}_{i}=\frac{p(z^{k}_{i})}{p'(z^{k}_{i})-p(z^{k}_{i})(\sum^{i-1}_{j=1}\frac{1}{z^{k}_{i}-z^{k+1}_{j}}+\sum^{n}_{j=i+1}\frac{1}{z^{k}_{i}-z^{k}_{j}})}, i=1,...,n.
+%\end{equation}
+
\begin{equation}
\label{eq:Aberth-H-GS}
-EAGS: z^{k+1}_{i}=\frac{p(z^{k}_{i})}{p'(z^{k}_{i})-p(z^{k}_{i})(\sum^{i-1}_{j=1}\frac{1}{z^{k}_{i}-z^{k+1}_{j}}+\sum^{n}_{j=i+1}\frac{1}{z^{k}_{i}-z^{k}_{j}})}, i=1,...,n.
+EAGS: z^{k+1}_{i}=z_{i}^{k}-\frac{\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}}
+{1-\frac{p(z_{i}^{k})}{p'(z_{i}^{k})}(\sum^{i-1}_{j=1}\frac{1}{z^{k}_{i}-z^{k+1}_{j}}+\sum_{j=1,j\neq i}^{j=n}{\frac{1}{(z_{i}^{k}-z_{j}^{k})}})}, i=1,. . . .,n
\end{equation}
-%%Here a finiched my revision %%
+
Using Eq.~\ref{eq:Aberth-H-GS} to update the vector solution
\textit{Z}, we expect the Gauss-Seidel iteration to converge more
quickly because, just as any Jacobi algorithm (for solving linear systems of equations), it uses the most fresh computed roots $z^{k+1}_{i}$.
In future works, we plan to investigate the possibility of using
several multiple GPUs simultaneously, either with multi-GPU machine or
-with cluster of GPUs.
+with cluster of GPUs. It may also be interesting to study the
+implementation of other root finding polynomial methods on GPU.