X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/book_gpu.git/blobdiff_plain/17bff40b83bcdcc39769f9e59c70ffae1c525b72..b1fd489e34a8d46d286a0d271c38cbfb442f511f:/BookGPU/Chapters/chapter6/PartieSync.tex?ds=inline

diff --git a/BookGPU/Chapters/chapter6/PartieSync.tex b/BookGPU/Chapters/chapter6/PartieSync.tex
index d8d281c..ce21565 100755
--- a/BookGPU/Chapters/chapter6/PartieSync.tex
+++ b/BookGPU/Chapters/chapter6/PartieSync.tex
@@ -97,7 +97,7 @@ parallel programming schemes on a GPU cluster:

 Using CUDA\index{CUDA}, GPU kernel executions are nonblocking, and GPU/CPU data transfers\index{CUDA!data transfer}
 are blocking or nonblocking operations. All GPU kernel executions and CPU/GPU
-data transfers are associated to "streams,"\index{CUDA!stream} and all operations on a same stream
+data transfers are associated to ``streams'',\index{CUDA!stream} and all operations on a same stream
 are serialized. When transferring data from the CPU to the GPU, then running GPU
 computations, and finally transferring results from the GPU to the CPU, there is
 a natural synchronization and serialization if these operations are achieved on
@@ -489,7 +489,7 @@ working on independent subsets of data.
 \Lst{algo:ch6p1overlapstreamsequence} is not so generic as
 \Lst{algo:ch6p1overlapseqsequence}.

-\subsection{Interleaved communications-transfers-computations\\overlapping}
+\subsection{Interleaved communications-transfers-computations overlapping}

 Many algorithms do not support splitting data transfers and kernel calls, and
 cannot exploit CUDA streams, for example, when each GPU thread requires access to