From: Kahina
Date: Sun, 20 Dec 2015 06:58:21 +0000 (+0100)
Subject: the EA implementation on (CUDA, OpenMP)
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/kahina_paper2.git/commitdiff_plain/2aba789f5e740bc30d29b14c1761f917c8f83108?ds=inline

the EA implementation on (CUDA, OpenMP)
---

diff --git a/paper.tex b/paper.tex
index 962c7f9..88d8823 100644
--- a/paper.tex
+++ b/paper.tex
@@ -565,8 +565,6 @@ Algorithm~\ref{alg2-cuda} shows a sketch of the Ehrlich-Aberth method using CUDA
 \section{The EA algorithm on Multi-GPU}
 \subsection{MGPU (OpenMP-CUDA) approach}
-Before beginning the calculation, our implementation parallel with OpenMP and CUDA shares the input data between threads OpenMP, these input data sotn Z: the vector solution, P: the polynomial to solve,
-
 Before starting computations, our parallel implementation shares the input data of the root-finding problem between the OpenMP threads. From Algorithm 1, the input data are the solution vector $Z$ and the polynomial $P$ to solve. The number of OpenMP threads is set equal to the number of GPUs; each OpenMP thread (T-omp) drives one GPU and controls a part of the shared memory, namely a slice of the vector $Z$ of $(n/Nbr\_gpu)$ roots, where $n$ is the polynomial's degree and $Nbr\_gpu$ the number of GPUs. Every GPU then gets a computation grid organized according to its performance and the size of the data it handles. In principle, a grid is defined by two parameters: DimGrid, the number of blocks per grid, and DimBlock, the number of threads per block. The following figure shows the architecture of the (CUDA, OpenMP) approach.
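
For illustration, here is a minimal sketch of the data-sharing scheme described in the changed paragraph: one OpenMP thread per GPU, each thread bound to its device and responsible for $(n/Nbr\_gpu)$ roots of $Z$, with the grid set by DimGrid and DimBlock. The kernel name ea_iteration, the block size, and all variable names are illustrative assumptions, not the paper's actual code.

// Sketch of the (CUDA, OpenMP) multi-GPU scheme, assuming a hypothetical
// EA update kernel ea_iteration; compile with: nvcc -Xcompiler -fopenmp
#include <omp.h>
#include <cuda_runtime.h>
#include <cuComplex.h>

__global__ void ea_iteration(cuDoubleComplex *z, const cuDoubleComplex *p,
                             int n, int offset, int slice)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < slice) {
        // ... one Ehrlich-Aberth update of root z[offset + i] would go here ...
    }
}

void ea_multi_gpu(cuDoubleComplex *Z, cuDoubleComplex *P, int n)
{
    int nbr_gpu = 0;
    cudaGetDeviceCount(&nbr_gpu);

    // One OpenMP thread (T-omp) per GPU; each thread owns n/nbr_gpu roots of Z.
    omp_set_num_threads(nbr_gpu);
    #pragma omp parallel
    {
        int tid    = omp_get_thread_num();
        int slice  = n / nbr_gpu;        // roots handled by this GPU
        int offset = tid * slice;        // start of this thread's part of Z

        cudaSetDevice(tid);              // bind this OpenMP thread to one GPU

        cuDoubleComplex *d_Z, *d_P;
        cudaMalloc(&d_Z, n * sizeof(cuDoubleComplex));
        cudaMalloc(&d_P, (n + 1) * sizeof(cuDoubleComplex));
        cudaMemcpy(d_Z, Z, n * sizeof(cuDoubleComplex), cudaMemcpyHostToDevice);
        cudaMemcpy(d_P, P, (n + 1) * sizeof(cuDoubleComplex), cudaMemcpyHostToDevice);

        // Grid set by two parameters: DimBlock threads per block,
        // DimGrid blocks per grid, sized for this GPU's slice.
        int DimBlock = 256;
        int DimGrid  = (slice + DimBlock - 1) / DimBlock;
        ea_iteration<<<DimGrid, DimBlock>>>(d_Z, d_P, n, offset, slice);

        // Copy back only the part of Z this GPU is responsible for.
        cudaMemcpy(Z + offset, d_Z + offset, slice * sizeof(cuDoubleComplex),
                   cudaMemcpyDeviceToHost);
        cudaFree(d_Z);
        cudaFree(d_P);
    }
}

In this sketch each thread copies the whole vectors to its device for simplicity; the actual implementation may restrict copies or exchange updated roots between GPUs differently.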