X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/book_gpu.git/blobdiff_plain/8fc4c8914177b38f8042870c31065ea619b900ec..HEAD:/BookGPU/Chapters/chapter9/ch9.tex?ds=inline

diff --git a/BookGPU/Chapters/chapter9/ch9.tex b/BookGPU/Chapters/chapter9/ch9.tex
index 2d44381..442c53a 100644
--- a/BookGPU/Chapters/chapter9/ch9.tex
+++ b/BookGPU/Chapters/chapter9/ch9.tex
@@ -1,10 +1,10 @@
-\chapterauthor{Malika Mehdi and Ahc\`{e}ne Bendjoudi}{CERIST Research Center, DTISI, 3 rue des frères Aissou, 16030 Ben-Aknoun, Algiers, Algeria}
+\chapterauthor{Malika Mehdi and Ahc\`{e}ne Bendjoudi}{CERIST Research Center, Algiers, Algeria}
 \chapterauthor{Lakhdar Loukil}{University of Oran, Algeria}
 %\chapterauthor{Ahc\`{e}ne Bendjoudi}{CERIST Research Center, DTISI, 3 rue des frères Aissou, 16030 Ben-Aknoun, Algiers, Algeria}
-\chapterauthor{Nouredine Melab}{Université Lille 1, LIFL/UMR CNRS 8022, 59655-Villeneuve d'Ascq cedex, France}
+\chapterauthor{Nouredine Melab}{University of Lille 1, CNRS/LIFL/INRIA, France}
 \chapter{Parallel GPU-accelerated metaheuristics}
 \label{chapter9}
@@ -99,7 +99,7 @@
 solutions. The process is repeated until a stopping criterion is satisfied. \emph{Evolutionary algorithms}, \emph{swarm optimization}, and \emph{ant colonies} fall into this class.
-
+%\clearpage
 \section{Parallel models for metaheuristics}\label{ch8:sec:paraMeta}
 Optimization problems, whether real-life or academic, are more often NP-hard and CPU time and/or memory consuming. Metaheuristics
@@ -187,8 +187,8 @@
 consuming.
 Unlike the two previous parallel models, the solution-level\index{metaheuristics!solution-level parallelism} parallel model is problem-dependent.}
 \end{itemize}
-\clearpage
-\section{Challenges for the design of GPU-based metaheuristics}
+%\clearpage
+\section[Challenges for the design of GPU-based metaheuristics]{Challenges for the design of GPU-based\hfill\break metaheuristics}
 \label{ch8:sec:challenges}
 Developing GPU-based parallel
@@ -251,7 +251,7 @@
 data (e.g., data for fitness evaluation that all threads concurrently access) on the constant memory, and the most accessed data structures (e.g., population of individuals for a CUDA thread block) on the shared memory.
-
+\clearpage
 \subsection{Threads synchronization}
 \index{GPU!threads synchronization}
 The thread synchronization issue is caused by both the GPU architecture and
@@ -501,7 +501,7 @@
 QAPLIB~\cite{burkard1991qaplib}. Speedups up to $10 \times$ are achieved by the GPU implementation compared to the same sequential implementation on CPU using SA-matrix.
-\subsection[Implementing population-based metaheuristics\hfill\break on GPUs]{Implementing population-based metaheuristics on GPUs}
+\subsection[Implementing population-based metaheuristics on GPUs]{Implementing population-based metaheuristics on GPUs}
 State-of-the-art works dealing with the implementation of p-metaheuristics on GPUs generally rely on parallel models and
@@ -1026,7 +1026,7 @@
 it is sent back to the CPU which selects the best solution (See Figure~\ref{ch1:fig:paradiseoGPU}).
 \subsection{libCUDAOptimize: an open source library of GPU-based metaheuristics}
-\index{GPU-based frameworks!libCudaOptimize}
+\index{GPU-based frameworks!libCUDAOptimize}
 LibCUDAOptimize~\cite{libcuda} is a C++/CUDA open source library for the design and implementation of metaheuristics on GPUs. Until now, the metaheuristics supported by LibCUDAOptimize are: scatter