From 713ebab6db0c8eb9899206e1f6354a4777c6feaf Mon Sep 17 00:00:00 2001
From: asider
Date: Wed, 21 Oct 2015 08:57:10 +0100
Subject: [PATCH 1/1] Proofread everything except the Experimental study section
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

---
 paper.tex | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/paper.tex b/paper.tex
index 104bcbe..c68c068 100644
--- a/paper.tex
+++ b/paper.tex
@@ -232,6 +232,7 @@ The initialization of a polynomial p(z) is done by setting each of the $n$ compl
 :
 \begin{equation}
+\label{eq:SimplePolynome}
 p(z)=\sum{a_{i}z^{n-i}} , a_{n} \neq 0,a_{0}=1, a_{i}\subset C
 \end{equation}
@@ -248,6 +249,7 @@ performed this choice by selecting complex numbers along different circles
 and relies on the result of~\cite{Ostrowski41}.
 \begin{equation}
+\label{eq:radiusR}
 %%\begin{align}
 \sigma_{0}=\frac{u+v}{2};u=\frac{\sum_{i=1}^{n}u_{i}}{n.max_{i=1}^{n}u_{i}}; v=\frac{\sum_{i=0}^{n-1}v_{i}}{n.min_{i=0}^{n-1}v_{i}};\\
@@ -568,14 +570,14 @@ $kernel\_update\_Log(d\_z^{k})$\;
 }
 \end{algorithm}
-The first form executes the formula (8) if the modulus is of the current complex is less than the radius i.e. ($ |z^{k}_{i}|<= R$), else the kernel executes formulas (13,14). The radius R is evaluated as :
+The first form executes formula~\ref{eq:SimplePolynome} if the modulus of the current complex number does not exceed a certain value called the radius, i.e. ($|z^{k}_{i}| \leq R$); otherwise the kernel executes formulas (Eq.~\ref{deflncomplex}, Eq.~\ref{defexpcomplex}). The radius $R$ is evaluated as:
 $$R = \exp( \log(DBL\_MAX) / (2*n) )$$ where $DBL\_MAX$ stands for the maximum representable double value. The last kernel verifies the convergence of the roots after each update of $Z^{(k)}$, according to formula. We used the functions of the CUBLAS Library (CUDA Basic Linear Algebra Subroutines) to implement this kernel.
-The kernels terminates it computations when all the root are converged. Finally, the solution of the root finding problem is copied back from the GPU global memory to the CPU memory. We use the communication functions of CUDA for the memory allocations in the GPU \verb=(cudaMalloc())= and the data transfers from the CPU memory to the GPU memory \verb=(cudaMemcpyHostToDevice)=
-or from the GPU memory to the CPU memory \verb=(cudaMemcpyDeviceToHost))=.
+The kernels terminate their computations once all the roots have converged. Finally, the solution of the root-finding problem is copied back from GPU global memory to CPU memory. We use the CUDA communication functions for memory allocation on the GPU \verb=(cudaMalloc())= and for data transfers from CPU memory to GPU memory \verb=(cudaMemcpyHostToDevice)=
+or from GPU memory to CPU memory \verb=(cudaMemcpyDeviceToHost)=.
 %%HIER END MY REVISIONS (SIDER)
 \subsection{Experimental study}
-- 
2.39.5
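
Note on the radius test touched by the last hunk: the rule $R = \exp(\log(DBL\_MAX)/(2n))$ can be checked on the host with a few lines of Python. This is only a minimal sketch and not the paper's CUDA kernel. The names `switch_radius` and `use_direct_form` are hypothetical, and `sys.float_info.max` is assumed to play the role of `DBL_MAX`. By construction $R^{2n}$ equals `DBL_MAX` up to rounding, which suggests the test keeps powers of $|z|$ up to $2n$ representable, though the patch itself does not state that motivation.

```python
import math
import sys


def switch_radius(n: int) -> float:
    """Return R = exp(log(DBL_MAX) / (2*n)) for a polynomial of degree n.

    Hypothetical helper: sys.float_info.max is assumed to stand in for
    the C constant DBL_MAX mentioned in the text.
    """
    if n < 1:
        raise ValueError("the degree n must be at least 1")
    return math.exp(math.log(sys.float_info.max) / (2 * n))


def use_direct_form(z: complex, n: int) -> bool:
    """True when |z| <= R, i.e. when the first (direct) form would be chosen.

    When this returns False, the text says the kernel switches to the
    logarithm/exponential formulas instead.
    """
    return abs(z) <= switch_radius(n)
```

For example, `use_direct_form(1.0 + 0.0j, 1000)` selects the direct form, while `use_direct_form(2.0 + 0.0j, 1000)` does not, because the radius shrinks toward 1 as the degree grows.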