This chapter introduces the Graphics Processing Unit (GPU) architecture and all
the concepts needed to understand how GPUs work and can be used to speed up the
execution of some algorithms. First of all this chapter gives a brief history of
This chapter introduces the Graphics Processing Unit (GPU) architecture and all
the concepts needed to understand how GPUs work and can be used to speed up the
execution of some algorithms. First of all this chapter gives a brief history of
has been replaced by OpenCL, BrookGPU by Standford University~\cite{ch1:Buck:2004:BGS}.
Another environment based on pragma (insertion of pragma directives inside the
code to help the compiler to generate efficient code) is called OpenACC. For a
has been replaced by OpenCL, BrookGPU by Standford University~\cite{ch1:Buck:2004:BGS}.
Another environment based on pragma (insertion of pragma directives inside the
code to help the compiler to generate efficient code) is called OpenACC. For a
evolving. Nevertheless some trends remain constant throughout this evolution.
Processing units composing a GPU are far simpler than a traditional CPU and
it is much easier to integrate many computing units inside a GPU card than to do
evolving. Nevertheless some trends remain constant throughout this evolution.
Processing units composing a GPU are far simpler than a traditional CPU and
it is much easier to integrate many computing units inside a GPU card than to do
threads are different from traditional threads for a CPU. In
Chapter~\ref{chapter2}, some examples of GPU programming will explain the
details of the GPU threads. Threads are gathered into blocks of 32
threads are different from traditional threads for a CPU. In
Chapter~\ref{chapter2}, some examples of GPU programming will explain the
details of the GPU threads. Threads are gathered into blocks of 32
-The memory hierarchy of GPUs\index{memory~hierarchy} is different from that of CPUs. In practice, there are registers\index{memory~hierarchy!registers}, local
-memory\index{memory~hierarchy!local~memory}, shared
-memory\index{memory~hierarchy!shared~memory}, cache
-memory\index{memory~hierarchy!cache~memory}, and global
-memory\index{memory~hierarchy!global~memory}.
+The memory hierarchy of GPUs\index{memory hierarchy} is different from that of CPUs. In practice, there are registers\index{memory hierarchy!registers}, local
+memory\index{memory hierarchy!local memory}, shared
+memory\index{memory hierarchy!shared memory}, cache
+memory\index{memory hierarchy!cache memory}, and global
+memory\index{memory hierarchy!global memory}.