\emph{global memory} shared by all its multiprocessors. The new architectures (Fermi, Kepler,
etc.) also have L1 and L2 caches that speed up accesses to the global memory.
NVIDIA has released the CUDA platform (Compute Unified Device Architecture)~\cite{ref19}
which provides a high-level programming language for GPGPU (General-Purpose computing
on GPUs), allowing GPUs to be programmed for general-purpose computations. In the CUDA
programming environment, all data-parallel and compute-intensive portions of an application running