\sum_{i\neq j}^{n}\frac{1}{z^{k}_{i}-z^{k}_{j}}\right)\right)i=1,...,n,
\end{equation}
-This solution is applied when the root except the circle unit, represented by the radius $R$ evaluated in C language as:
-
+This solution is applied when the modulus of a root exceeds the radius $R$, which is evaluated in C as:
+\label{R.EL}
+\begin{center}
\begin{verbatim}
R = exp(log(DBL_MAX)/(2*n) );
-\end{verbatim}
+\end{verbatim}
+\end{center}
%\begin{equation}
%In CUDA programming, all the instructions of the \verb=for= loop are executed by the GPU as a kernel. A kernel is a function written in CUDA and defined by the \verb=__global__= qualifier added before a usual \verb=C= function, which instructs the compiler to generate appropriate code to pass it to the CUDA runtime in order to be executed on the GPU.
-Algorithm~\ref{alg2-cuda} shows a sketch of the Ehrlich-Aberth algorithm using CUDA.
+Algorithm~\ref{alg2-cuda} shows the steps of the Ehrlich-Aberth algorithm using CUDA.
+\begin{enumerate}
\begin{algorithm}[H]
\label{alg2-cuda}
%\LinesNumbered
\BlankLine
-Initialization of the of P\;
-Initialization of the of Pu\;
-Initialization of the solution vector $Z^{0}$\;
-Allocate and copy initial data to the GPU global memory\;
-k=0\;
+\item Initialization of P\;
+\item Initialization of Pu\;
+\item Initialization of the solution vector $Z^{0}$\;
+\item Allocate and copy initial data to the GPU global memory\;
+\item k=0\;
\While {$\Delta z_{max} > \epsilon$}{
- Let $\Delta z_{max}=0$\;
-$ kernel\_save(ZPrec,Z)$\;
-k=k+1\;
-$ kernel\_update(Z,P,Pu)$\;
-$kernel\_testConverge(\Delta z_{max},Z,ZPrec)$\;
+\item Let $\Delta z_{max}=0$\;
+\item $ kernel\_save(ZPrec,Z)$\;
+\item k=k+1\;
+\item $ kernel\_update(Z,P,Pu)$\;
+\item $kernel\_testConverge(\Delta z_{max},Z,ZPrec)$\;
}
-Copy results from GPU memory to CPU memory\;
+\item Copy results from GPU memory to CPU memory\;
\end{algorithm}
+\end{enumerate}
~\\
After the initialization step, all data of the root finding problem
order to check the convergence of the roots after each iteration (line
8, Algorithm~\ref{alg2-cuda}).
-The second kernel executes the iterative function $H$ and updates
-Z, according to Algorithm~\ref{alg3-update}. We notice that the
+The second kernel executes the iterative function and updates
+$Z$, according to Algorithm~\ref{alg3-update}. We notice that the
update kernel is called in two forms, according to the value
\emph{R} which determines the radius beyond which we apply the
exponential logarithm algorithm.
of the current complex number is at most a certain value called the
radius, i.e. ($|z^{k}_{i}| \leq R$); otherwise the kernel executes the EA.EL
function Eq.~\ref{Log_H2}
-(with Eq.~\ref{deflncomplex}, Eq.~\ref{defexpcomplex}). The radius $R$ is evaluated as :
+(with Eq.~\ref{deflncomplex}, Eq.~\ref{defexpcomplex}). The radius $R$ is evaluated as in~\ref{R.EL}:
$$R = \exp( \log(DBL\_MAX) / (2*n) )$$ where $DBL\_MAX$ stands for the maximum representable double value.