diff --git a/BookGPU/Chapters/chapter1/ch1.tex b/BookGPU/Chapters/chapter1/ch1.tex

index a70f420c51a3cbee1a4e7935b2fb1295ffe73bda..9c3d8af900e764ebde20e2419476ffc5783aa4b2 100755 (executable)
--- a/BookGPU/Chapters/chapter1/ch1.tex
+++ b/BookGPU/Chapters/chapter1/ch1.tex
@@ -5,7 +5,7 @@
  \label{chapter1}
  
  \section{Introduction}\label{ch1:intro}
-
+``test"  "test" ``test''
  This chapter introduces the Graphics  Processing Unit (GPU) architecture and all
  the concepts needed to understand how GPUs  work and can be used to speed up the
  execution of some algorithms. First of all this chapter gives a brief history of
@@ -68,13 +68,13 @@ example  we can  cite, FireStream  by ATI  which is  not maintained  anymore and
  has been replaced by  OpenCL, BrookGPU by  Standford University~\cite{ch1:Buck:2004:BGS}.
  Another environment based  on pragma (insertion of pragma  directives inside the
  code to  help the compiler to generate  efficient code) is called  OpenACC.  For a
-comparison with OpenCL, interested readers may refer to~\cite{ch1:CMR:12}.
+comparison with OpenCL, interested readers may refer to~\cite{ch1:Dongarra}.
  
  
  
  \section{Architecture of current GPUs}
  
-The architecture  \index{architecture of  a GPU} of  current GPUs  is constantly
+The architecture  \index{GPU!architecture of a} of  current GPUs  is constantly
  evolving.  Nevertheless  some trends remain constant  throughout this evolution.
  Processing units composing a GPU are  far simpler than a traditional CPU and
  it is much easier to integrate many computing units inside a GPU card than to do
@@ -113,7 +113,7 @@ Threads are used to  benefit from the large number of cores  of a GPU. These
  threads    are   different    from    traditional   threads    for a   CPU.     In
  Chapter~\ref{chapter2},  some  examples of  GPU  programming  will explain  the
  details of  the GPU  threads. Threads are gathered  into blocks  of 32
-threads, called warps. These warps  are important when designing an algorithm
+threads, called ``warps''. These warps  are important when designing an algorithm
  for GPU.
  
  
@@ -232,11 +232,11 @@ will explain that.
  
  \section{Memory hierarchy}
  
-The memory hierarchy of  GPUs\index{memory~hierarchy} is different from that of CPUs.  In practice,  there are registers\index{memory~hierarchy!registers}, local
-memory\index{memory~hierarchy!local~memory},                               shared
-memory\index{memory~hierarchy!shared~memory},                               cache
-memory\index{memory~hierarchy!cache~memory},              and              global
-memory\index{memory~hierarchy!global~memory}.
+The memory hierarchy of  GPUs\index{memory hierarchy} is different from that of CPUs.  In practice,  there are registers\index{memory hierarchy!registers}, local
+memory\index{memory hierarchy!local memory},                               shared
+memory\index{memory hierarchy!shared memory},                               cache
+memory\index{memory hierarchy!cache memory},              and              global
+memory\index{memory hierarchy!global memory}.
  
  
  As  previously  mentioned each  thread  can access  its  own  registers.  It  is