%% In this sequential algorithm, one CPU thread executes all the steps. Let us look to the $3^{rd}$ step i.e. the execution of the iterative function, 2 sub-steps are needed. The first sub-step \textit{save}s the solution vector of the previous iteration, the second sub-step \textit{update}s or computes the new values of the roots vector.
\subsection{Parallel implementation with CUDA }
%In order to implement the Ehrlich-Aberth method in CUDA, it is
%possible to use the Jacobi scheme or the Gauss-Seidel one. With the
%Jacobi iteration, at iteration $k+1$ we need all the previous values
If the modulus of the current root approximation is less than or equal to a given value called the
radius, i.e. $|z^{k}_{i}| \leq R$, then the classical form of the EA
function Eq.~\ref{Eq:Hi} is executed; otherwise, the EA.EL function Eq.~\ref{Log_H2} is executed
(using Eq.~\ref{deflncomplex} and Eq.~\ref{defexpcomplex}). The radius $R$ is evaluated as in Eq.~\ref{R.EL}. Experimentally, it is difficult to solve high degree polynomials with the classical Ehrlich-Aberth iteration, because the values it manipulates exceed the limited range of the floating-point numbers handled by the processors; in that case the \textit{update} kernel evaluates the EA.EL function Eq.~\ref{Log_H2} (with Eq.~\ref{deflncomplex} and Eq.~\ref{defexpcomplex}), which makes it possible to compute the roots of much higher degree polynomials. We use the \verb=cuDoubleComplex= type to handle complex numbers in CUDA, and the functions of the CUBLAS library to implement some vector operations on the GPU. We use the following functions:
\begin{itemize}
\item \verb=cublasIdamax()= to get the index of the element with the largest absolute value in a vector, which we use in the convergence test.
\end{itemize}
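To make this per-thread selection concrete, the sketch below shows one possible form of the \textit{update} kernel. It is only an illustration under our own naming: \verb=EA_classical= and \verb=EA_explog= are hypothetical device functions standing for Eq.~\ref{Eq:Hi} and Eq.~\ref{Log_H2}, shown here with placeholder bodies.
\begin{verbatim}
#include <cuComplex.h>

// Placeholders standing for the evaluation of Eq. (Hi) and Eq. (Log_H2);
// the real bodies compute the Ehrlich-Aberth correction of root i.
__device__ cuDoubleComplex EA_classical(int i, const cuDoubleComplex *z,
                                        const cuDoubleComplex *a, int n)
{ return z[i]; /* placeholder */ }

__device__ cuDoubleComplex EA_explog(int i, const cuDoubleComplex *z,
                                     const cuDoubleComplex *a, int n)
{ return z[i]; /* placeholder */ }

// One thread updates one root approximation (a sketch, not the exact kernel).
__global__ void update(cuDoubleComplex *zNew, const cuDoubleComplex *zOld,
                       const cuDoubleComplex *a, int n, double R)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (cuCabs(zOld[i]) <= R)                  // |z_i^k| <= R: classical EA
        zNew[i] = EA_classical(i, zOld, a, n);
    else                                       // otherwise: exp-log form EA.EL
        zNew[i] = EA_explog(i, zOld, a, n);
}
\end{verbatim}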
Each GPU thread in the grid computes one root in parallel. If the polynomial degree exceeds the capacity of the grid, the Gauss-Seidel scheme is applied: since a grid launched with \verb=<<<Blocks, Threads>>>= can only update $Blocks \times Threads$ roots at the same time, the roots already updated in the current iteration are reused to compute the remaining ones, as illustrated in Figure~\ref{fig:08}.
\begin{figure}[htbp]
\centering
  \includegraphics[width=0.8\textwidth]{figures/GS}
\caption{Gauss-Seidel iteration}
\label{fig:08}
\end{figure}
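A minimal sketch of this coverage strategy, under our own simplifying assumptions, is given below: the root vector \verb=z= is updated in place, so the roots rewritten earlier in the sweep are reused by the threads that update the remaining ones, which is what gives the Gauss-Seidel flavour; \verb=EA_update= is a hypothetical helper standing for the correction of Eq.~\ref{Eq:Hi} or Eq.~\ref{Log_H2}.
\begin{verbatim}
#include <cuComplex.h>

// Hypothetical correction function (placeholder for Eq. (Hi) / Eq. (Log_H2)).
__device__ cuDoubleComplex EA_update(int i, const cuDoubleComplex *z,
                                     const cuDoubleComplex *a, int n, double R)
{ return z[i]; /* placeholder */ }

// Each thread walks over the roots with a grid-sized stride, so a grid of
// Blocks*Threads threads can cover a polynomial of any degree n. Updating z
// in place lets later strides reuse the roots already updated in the same
// iteration (Gauss-Seidel-like behaviour).
__global__ void updateGS(cuDoubleComplex *z, const cuDoubleComplex *a,
                         int n, double R)
{
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        z[i] = EA_update(i, z, a, n, R);
}

// Host side: updateGS<<<Blocks, Threads>>>(d_z, d_a, n, R);
\end{verbatim}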
The last kernel checks the convergence of the roots after each update
of $Z^{k}$, according to Eq.~\ref{eq:Aberth-Conv-Cond}. We use functions of the CUBLAS library (CUDA Basic Linear Algebra Subroutines) to implement this kernel.
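As an illustration, the convergence test can be organised as in the sketch below. This is only our own sketch, not the exact implementation: it assumes that a previous kernel (not shown) has stored the per-root error of the current iteration in the device vector \verb=d_err=, and it then uses \verb=cublasIdamax()= to locate the largest error before comparing it with the convergence threshold.
\begin{verbatim}
#include <cublas_v2.h>
#include <cuda_runtime.h>

// Host-side convergence test (sketch). d_err[i] holds the error of root i
// for the current iteration, as required by Eq. (Aberth-Conv-Cond).
int hasConverged(cublasHandle_t handle, const double *d_err, int n, double eps)
{
    int idx = 0;
    // 1-based index of the entry with the largest absolute value.
    cublasIdamax(handle, n, d_err, 1, &idx);

    double maxErr = 0.0;
    cudaMemcpy(&maxErr, d_err + (idx - 1), sizeof(double),
               cudaMemcpyDeviceToHost);

    return maxErr < eps;   // converged if the largest error is below eps
}
\end{verbatim}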