From: zianekhodja <zianekhodja.lilia@gmail.com>
Date: Sat, 16 Jan 2016 15:47:58 +0000 (+0100)
Subject: new
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/kahina_paper2.git/commitdiff_plain/254d68992f882593d5924b2cd3e97de2fa251051?ds=sidebyside;hp=-c

new
---

254d68992f882593d5924b2cd3e97de2fa251051
diff --git a/paper.tex b/paper.tex
index 41c5960..00671de 100644
--- a/paper.tex
+++ b/paper.tex
@@ -736,9 +736,7 @@ investigated. \LZK{Repetition! The same text is already written as an intro in
 
 Like any parallel code, a GPU parallel implementation first requires identifying the sequential code and the data-parallel operations of an algorithm. In fact, all the operations that are easy to execute in parallel must be performed by the GPU to accelerate the execution, such as steps 3 and 4. On the other hand, all the sequential operations and the operations that have data dependencies between CUDA threads or recursive computations must be executed by only one CUDA thread or a CPU thread (steps 1 and 2).\LZK{The method is already poorly presented, so in this case it is even harder to understand what these different steps represent!} Initially, we specify the organization of the parallel threads by specifying the dimension of the grid \verb+Dimgrid+ (the number of blocks per grid) and the dimension of each block \verb+DimBlock+ (the number of threads per block).
 
-The code is organized kernels which are part of code that are run on
-GPU devices. For step 3, there are two kernels, the first named
-\textit{save} is used to save vector $Z^{K-1}$ and the second one is
+The code is organized as kernels, which are parts of the code that run on GPU devices. For step 3, there are two kernels: the first, named \textit{save}, is used to save the vector $Z^{K-1}$, and the second one is
 named \textit{update} and is used to update the $Z^{K}$ vector. For
 step 4, a kernel tests the convergence of the method. In order to
 compute the function H, we have two possibilities: either to use the
@@ -757,6 +755,7 @@ comes in particular from the fact that it is very difficult to debug
 CUDA running threads like threads on a CPU host. In the following
 paragraph Algorithm~\ref{alg1-cuda} shows the GPU parallel
 implementation of the Ehrlich-Aberth method.
+\LZK{It would be better to explain the implementation by referring to the sequential algorithm rather than talking about the different steps.}
 
 \begin{algorithm}[htpb]
 \label{alg1-cuda}