Chapter 1: Presentation of the GPU architecture
Author: Raphaël Couturier: University of Franche-Comte, France
This chapter will introduce the GPU architecture and the classical model proposed by CUDA. All the background necessary for the remainder of the book will be presented here first.

Chapter 2: Simple examples with CUDA
Author: Raphaël Couturier: University of Franche-Comte, France

**** Part : Image processing

Chapter 3: Fast kernels for image and signal processing
Authors: Gilles Perrot, Raphaël Couturier and Stéphane Domas: University of Franche-Comte, France; Nicolas Bertaux: University of Aix-Marseille, France
In this chapter, we will present many kernels that can drastically speed up signal and image processing algorithms. Although these kernels seem to be very common, they have not yet been well described in the literature.

Chapter 4: Region Based Algorithm for Large Images Segmentation on GPU
Authors: Gilles Perrot, Raphaël Couturier and Stéphane Domas: University of Franche-Comte, France; Nicolas Bertaux: University of Aix-Marseille, France
In this chapter, we will present an algorithm for region-based active contour techniques (snakes), as they seem to achieve a high level of robustness and fit a large range of applications.

**** Part : Software Development

Chapter 5: On the development of high-performance software library for emerging architectures: design and analysis
Authors: Allan P. Engsig-Karup, Bernd Dammann, Jeppe R. Frisvad and Stefan Lemvig: Technical University of Denmark, Denmark
This chapter will present performance-portable tuning techniques based on modern parallel programming. It will then focus on efficient and scalable iterative methods for the solution of high-order numerical methods, and on strategies for efficient implementations on desktop architectures.

Chapter 6: Pertinence and development methodologies for GPU and cluster of GPU
Authors: Sylvain Contassot-Vivier: University of Nancy, France; Stéphane Vialle: Supélec, Metz, France
This chapter will outline the limits of the fields of applicability of GPU acceleration, as well as development methodologies for obtaining efficient codes in classical scientific applications.

Chapter 7: Fast GPU-accelerated desktop application
Authors: Allan P. Engsig-Karup, Bernd Dammann, Jeppe R. Frisvad and Stefan Lemvig: Technical University of Denmark, Denmark
This chapter will present discussions, analysis and highlights of a new massively parallel engineering tool for nonlinear free surface flows. The tool is intended for both engineering analysis and interactive real-time computing, e.g. for applications in coastal and offshore engineering and for a first-of-its-kind physics-based ship simulation.

**** Part : Optimization

Chapter 8: GPU-accelerated Tree-based Exact Optimization Methods
Authors: Imen Chakroun, Nouredine Melab and El-Ghazali Talbi: INRIA Lille, France
This chapter will present the latest techniques and algorithms for tree-based exact optimization methods on GPU.

Chapter 9: Parallel Meta-heuristics for Solving Challenging Problems on GPU Accelerators
Authors: Thé Van Luong, Nouredine Melab and El-Ghazali Talbi: INRIA Lille, France
This chapter will describe parallel metaheuristics for solving complex problems in science and industry. This work is based on local search metaheuristics.

Chapter 10: Linear programming on a GPU: a case study based on the simplex method and the branch-cut-and-bound algorithm
Authors: Paul Albuquerque: HES-SO, Geneva, Switzerland; Xavier Meyer and Bastien Chopard: University of Geneva, Switzerland
This chapter will address the main issues related to programming the simplex method on a GPU. It will then present how to integrate this GPU-based simplex method into a branch-cut-and-bound framework in which the work is shared between the CPU and the GPU.

Chapter 11: Performing large scale robust regression on GPUs
Authors: Gleb Beliakov and Gang Li: Deakin University, Melbourne, Australia
In this chapter, we will report on the use of GPUs for large scale robust data analysis. Identifying outliers in large multivariate data sets is difficult, because outliers shift regression models in their direction so much that they become undetectable from their residuals.

**** Part : Numerical applications

Chapter 12: Sparse linear system solvers with the GMRES method on GPU clusters
Authors: Lilia Ziane Khodja, Raphaël Couturier and Jacques Bahi: University of Franche-Comte, France
In this chapter, the adaptation of the GMRES method to GPU clusters will be presented, and several techniques (compression, partitioning …) that increase the scalability of this algorithm on GPU clusters will be described.

Chapter 13: Parallel solution of the Obstacle problem on GPU clusters
Authors: Lilia Ziane Khodja, Raphaël Couturier and Jacques Bahi: University of Franche-Comte, France; Ming Chau and Pierre Spiteri: University of Toulouse, France
This chapter is devoted to the implementation of the obstacle problem on GPU clusters. This problem is a nonlinear PDE occurring in financial mathematics (option pricing) and constrained structure mechanics. Synchronous and asynchronous implementations will be analyzed.

Chapter 14: Complex fluid lattice Boltzmann on GPU clusters
Authors: Kevin Stratford and Alan Gray: University of Edinburgh, United Kingdom
This chapter will present a complex fluid lattice Boltzmann application and show how it can scale and perform well on large-scale GPU clusters.

Chapter 15: Deployment on GPU of an atomic physics program
Authors: Pierre Fortin, Rachid Habel, Fabienne Jézéquel and Jean-Luc Lamotte: University of Paris 6, France; Stan Scott
This chapter will describe the deployment on GPUs of PROP, a program of the 2DRMP suite which models electron collisions with H-like atoms and ions.

Chapter 16: GPU-based envelope-following simulation techniques for power converters design
Authors: Sheldon Tan + students: University of California, Riverside, USA
This chapter will introduce a new envelope-following parallel transient analysis method for general switching power converters. This method exploits the parallelism in the envelope-following method and parallelizes the Newton update solving part, which is the most computationally expensive, on GPU platforms to boost simulation performance.

Chapter 17: Domain decomposition method on GPU architecture
Authors: Frédéric Magoules: Ecole centrale, Paris, France
This chapter will present how the GPU architecture can increase the performance of domain decomposition methods.

**** Part : Other

Chapter 18: Pseudo Random Number Generator on GPU
Authors: Raphaël Couturier and Christophe Guyeux: University of Franche-Comte, France
This chapter will present some pseudo random number generators, which are essential in many applications. We have proposed a generator with proven chaotic properties, which is not the case for other generators. Our generator passes all the statistical test batteries.

Chapter 19: Solving large sparse linear systems for integer factorization on GPUs
Authors: Bertil Schmidt and Hao Yu Dang: University of Mainz, Germany
This chapter will present the number field sieve (NFS), which is the current state-of-the-art integer factorization method. It will focus on how GPUs can be used to accelerate this highly time-consuming operation.