The MPI (Message Passing Interface) library makes it possible to write programs that run on a distributed-memory architecture. Each process has its own execution environment and executes its code asynchronously, following the MIMD model (Multiple Instruction streams, Multiple Data streams); the processes communicate and synchronise by exchanging messages~\cite{Peter96}. MPI messages are sent explicitly, whereas exchanges remain implicit in a multi-threaded programming environment such as OpenMP or Pthreads.
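As a minimal illustration (not taken from the cited works), the following C sketch shows two MPI processes exchanging a single value by an explicit send and receive, which is the communication pattern described above:
\begin{verbatim}
/* Minimal sketch: two MPI processes exchange one value explicitly.
 * Compile with an MPI wrapper (e.g. mpicc) and run with mpirun -np 2. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);                /* start the MPI environment  */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process       */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes  */

    double value = 0.0;
    if (rank == 0 && size > 1) {
        value = 3.14;
        /* nothing is shared between processes: the data must be sent */
        MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("process 1 received %f from process 0\n", value);
    }

    MPI_Finalize();                        /* clean shutdown             */
    return 0;
}
\end{verbatim}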
\subsection{CUDA}%The English-language article: Multi-GPU and multi-CPU accelerated FDTD scheme for vibroacoustic applications
CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA~\cite{CUDA10}. The
unit of execution in CUDA is called a thread. Each thread executes a kernel and is run in parallel on the streaming processors. In CUDA,
a group of threads that are executed together is called a thread block, and the computational grid consists of a grid of thread
blocks. Additionally, a thread block can use the shared memory on a single multiprocessor while the grid executes a single