\section{Introduction}\label{ch6:intro}

This chapter draws on several development methodologies for obtaining
efficient codes in classical scientific applications. These methodologies are
based on feedback from several research works involving GPUs, either in a
single machine or in a cluster of machines. Indeed, our past collaborations
with industrial partners have shown us that, in their economic context,
they can adopt a parallel technology only if its implementation and maintenance
costs are small compared with the potential benefits (performance,
accuracy, etc.). In such contexts, GPU programming is still regarded with
some caution because of its specific field of applicability (the SIMD/SIMT
model: Single Instruction Multiple Data/Thread) and its still higher
programming complexity and maintenance cost. In the academic domain, things
are a bit different, but studies on efficiently integrating GPU computations
in multicore clusters, with maximal overlapping of computations with
communications and/or other computations, are still rare.

For these reasons, the major aim of this chapter is to propose general
programming patterns, as simple as possible, that can be followed or adapted in
practical implementations of parallel scientific applications.
% Also, according to our experience in industrial collaborations, we propose a
% small prospect analysis about the perenity of such accelerators in the

In addition, we propose a prospective analysis together with a particular
programming tool that is intended to ease multicore GPU cluster programming.
%%% ispell-dictionary: "american"
%%% TeX-master: "../../BookGPU.tex"