%% http://www.michaelshell.org/
%% for current contact information.
%% This is a skeleton file demonstrating the use of IEEEtran.cls
%% (requires IEEEtran.cls version 1.7 or later) with an IEEE conference paper.
%% http://www.michaelshell.org/tex/ieeetran/
%% http://www.ctan.org/tex-archive/macros/latex/contrib/IEEEtran/
%% http://www.ieee.org/
%%*************************************************************************
%% This code is offered as-is without any warranty either expressed or
%% implied; without even the implied warranty of MERCHANTABILITY or
%% FITNESS FOR A PARTICULAR PURPOSE!
%% User assumes all risk.
%% In no event shall IEEE or any contributor to this code be liable for
%% any damages or losses, including, but not limited to, incidental,
%% consequential, or any other damages, resulting from the use or misuse
%% of any information contained here.
%% All comments are the opinions of their respective authors and are not
%% necessarily endorsed by the IEEE.
%% This work is distributed under the LaTeX Project Public License (LPPL)
%% ( http://www.latex-project.org/ ) version 1.3, and may be freely used,
%% distributed and modified. A copy of the LPPL, version 1.3, is included
%% in the base LaTeX documentation of all distributions of LaTeX released
%% 2003/12/01 or later.
%% Retain all contribution notices and credits.
%% ** Modified files should be clearly indicated as such, including **
%% ** renaming them and changing author support contact information. **
%% File list of work: IEEEtran.cls, IEEEtran_HOWTO.pdf, bare_adv.tex,
%% bare_conf.tex, bare_jrnl.tex, bare_jrnl_compsoc.tex
%%*************************************************************************
% *** Authors should verify (and, if needed, correct) their LaTeX system ***
% *** with the testflow diagnostic prior to trusting their LaTeX platform ***
% *** with production work. IEEE's font choices can trigger bugs that do ***
% *** not appear when using other class files. ***
% The testflow support page is at:
% http://www.michaelshell.org/tex/testflow/
% Note that the a4paper option is mainly intended so that authors in
% countries using A4 can easily print to A4 and see how their papers will
% look in print - the typesetting of the document will not typically be
% affected with changes in paper size (but the bottom and side margins will).
% Use the testflow package mentioned above to verify correct handling of
% both paper sizes by the user's LaTeX system.
% Also note that the "draftcls" or "draftclsnofoot", not "draft", option
% should be used if it is desired that the figures are to be displayed in
% draft mode.
\documentclass[12pt, journal, onecolumn]{IEEEtran}
% Add the compsocconf option for Computer Society conferences.
% If IEEEtran.cls has not been installed into the LaTeX system files,
% manually specify the path to it like:
% \documentclass[conference]{../sty/IEEEtran}
\usepackage[latin1]{inputenc}
\usepackage[cyr]{aeguill}
\usepackage[francais]{babel}
% Some very useful LaTeX packages include:
% (uncomment the ones you want to load)
% *** MISC UTILITY PACKAGES ***
% Heiko Oberdiek's ifpdf.sty is very useful if you need conditional
% compilation based on whether the output is pdf or dvi.
% The latest version of ifpdf.sty can be obtained from:
% http://www.ctan.org/tex-archive/macros/latex/contrib/oberdiek/
% Also, note that IEEEtran.cls V1.7 and later provides a builtin
% \ifCLASSINFOpdf conditional that works the same way.
% When switching from latex to pdflatex and vice-versa, the compiler may
% have to be run twice to clear warning/error messages.
% *** CITATION PACKAGES ***
% cite.sty was written by Donald Arseneau
% V1.6 and later of IEEEtran pre-defines the format of the cite.sty package
% \cite{} output to follow that of IEEE. Loading the cite package will
% result in citation numbers being automatically sorted and properly
% "compressed/ranged". e.g., [1], [9], [2], [7], [5], [6] without using
% cite.sty will become [1], [2], [5]--[7], [9] using cite.sty. cite.sty's
% \cite will automatically add leading space, if needed. Use cite.sty's
% noadjust option (cite.sty V3.8 and later) if you want to turn this off.
% cite.sty is already installed on most LaTeX systems. Be sure to use
% version 4.0 (2003-05-27) or later if using hyperref.sty. cite.sty does
% not currently provide for hyperlinked citations.
% The latest version can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/cite/
% The documentation is contained in the cite.sty file itself.
% *** GRAPHICS RELATED PACKAGES ***
\usepackage{transparent}
\usepackage[pdftex]{graphicx,color}
% declare the path(s) where your graphic files are
\graphicspath{{img/}}
% and their extensions so you won't have to specify these with
% every instance of \includegraphics
\DeclareGraphicsExtensions{.pdf,.jpeg,.png}
% or other class option (dvipsone, dvipdf, if not using dvips). graphicx
% will default to the driver specified in the system graphics.cfg if no
% driver is specified.
% \usepackage[dvips]{graphicx}
% declare the path(s) where your graphic files are
% \graphicspath{{../eps/}}
% and their extensions so you won't have to specify these with
% every instance of \includegraphics
% \DeclareGraphicsExtensions{.eps}
% graphicx was written by David Carlisle and Sebastian Rahtz. It is
% required if you want graphics, photos, etc. graphicx.sty is already
% installed on most LaTeX systems. The latest version and documentation can
% be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/required/graphics/
% Another good source of documentation is "Using Imported Graphics in
% LaTeX2e" by Keith Reckdahl which can be found as epslatex.ps or
% epslatex.pdf at: http://www.ctan.org/tex-archive/info/
% latex, and pdflatex in dvi mode, support graphics in encapsulated
% postscript (.eps) format. pdflatex in pdf mode supports graphics
% in .pdf, .jpeg, .png and .mps (metapost) formats. Users should ensure
% that all non-photo figures use a vector format (.eps, .pdf, .mps) and
% not a bitmapped format (.jpeg, .png). IEEE frowns on bitmapped formats
% which can result in "jaggedy"/blurry rendering of lines and letters as
% well as large increases in file sizes.
% You can find documentation about the pdfTeX application at:
% http://www.tug.org/applications/pdftex
% *** MATH PACKAGES ***
\usepackage[cmex10]{amsmath}
% A popular package from the American Mathematical Society that provides
% many useful and powerful commands for dealing with mathematics. If using
% it, be sure to load this package with the cmex10 option to ensure that
% only type 1 fonts will be utilized at all point sizes. Without this option,
% it is possible that some math symbols, particularly those within
% footnotes, will be rendered in bitmap form which will result in a
% document that cannot be IEEE Xplore compliant!
% Also, note that the amsmath package sets \interdisplaylinepenalty to 10000
% thus preventing page breaks from occurring within multiline equations. Use:
%\interdisplaylinepenalty=2500
% after loading amsmath to restore such page breaks as IEEEtran.cls normally
% does. amsmath.sty is already installed on most LaTeX systems. The latest
% version and documentation can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/required/amslatex/math/
% *** SPECIALIZED LIST PACKAGES ***
\usepackage[ruled,lined,linesnumbered]{algorithm2e}
%\usepackage{algorithmic}
% algorithmic.sty was written by Peter Williams and Rogerio Brito.
% This package provides an algorithmic environment for describing algorithms.
% You can use the algorithmic environment in-text or within a figure
% environment to provide for a floating algorithm. Do NOT use the algorithm
% floating environment provided by algorithm.sty (by the same authors) or
% algorithm2e.sty (by Christophe Fiorio) as IEEE does not use dedicated
% algorithm float types and packages that provide these will not provide
% correct IEEE style captions. The latest version and documentation of
% algorithmic.sty can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/algorithms/
% There is also a support site at:
% http://algorithms.berlios.de/index.html
% Also of interest may be the (relatively newer and more customizable)
% algorithmicx.sty package by Szasz Janos:
% http://www.ctan.org/tex-archive/macros/latex/contrib/algorithmicx/
% *** ALIGNMENT PACKAGES ***
% Frank Mittelbach's and David Carlisle's array.sty patches and improves
% the standard LaTeX2e array and tabular environments to provide better
% appearance and additional user controls. As the default LaTeX2e table
% generation code is lacking to the point of almost being broken with
% respect to the quality of the end results, all users are strongly
% advised to use an enhanced (at the very least that provided by array.sty)
% set of table tools. array.sty is already installed on most systems. The
% latest version and documentation can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/required/tools/
% Also highly recommended is Mark Wooding's extremely powerful MDW tools,
% especially mdwmath.sty and mdwtab.sty which are used to format equations
% and tables, respectively. The MDWtools set is already installed on most
% LaTeX systems. The latest version and documentation are available at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/mdwtools/
% IEEEtran contains the IEEEeqnarray family of commands that can be used to
% generate multiline equations as well as matrices, tables, etc., of high
% quality.
%\usepackage{eqparbox}
% Also of notable interest is Scott Pakin's eqparbox package for creating
% (automatically sized) equal width boxes - aka "natural width parboxes".
% Available at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/eqparbox/
% *** SUBFIGURE PACKAGES ***
%\usepackage[tight,footnotesize]{subfigure}
% subfigure.sty was written by Steven Douglas Cochran. This package makes it
% easy to put subfigures in your figures. e.g., "Figure 1a and 1b". For IEEE
% work, it is a good idea to load it with the tight package option to reduce
% the amount of white space around the subfigures. subfigure.sty is already
% installed on most LaTeX systems. The latest version and documentation can
% be obtained at:
% http://www.ctan.org/tex-archive/obsolete/macros/latex/contrib/subfigure/
% subfigure.sty has been superseded by subfig.sty.
%\usepackage[caption=false]{caption}
%\usepackage[font=footnotesize]{subfig}
% subfig.sty, also written by Steven Douglas Cochran, is the modern
% replacement for subfigure.sty. However, subfig.sty requires and
% automatically loads Axel Sommerfeldt's caption.sty which will override
% IEEEtran.cls handling of captions and this will result in nonIEEE style
% figure/table captions. To prevent this problem, be sure to preload
% caption.sty with its "caption=false" package option. This will preserve
% IEEEtran.cls handling of captions. Version 1.3 (2005/06/28) and later
% (recommended due to many improvements over 1.2) of subfig.sty supports
% the caption=false option directly:
\usepackage[caption=false,font=footnotesize]{subfig}
% The latest version and documentation can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/subfig/
% The latest version and documentation of caption.sty can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/caption/
% *** FLOAT PACKAGES ***
\usepackage{fixltx2e}
% fixltx2e, the successor to the earlier fix2col.sty, was written by
% Frank Mittelbach and David Carlisle. This package corrects a few problems
% in the LaTeX2e kernel, the most notable of which is that in current
% LaTeX2e releases, the ordering of single and double column floats is not
% guaranteed to be preserved. Thus, an unpatched LaTeX2e can allow a
% single column figure to be placed prior to an earlier double column
% figure. The latest version and documentation can be found at:
% http://www.ctan.org/tex-archive/macros/latex/base/
%\usepackage{stfloats}
% stfloats.sty was written by Sigitas Tolusis. This package gives LaTeX2e
% the ability to do double column floats at the bottom of the page as well
% as the top. (e.g., "\begin{figure*}[!b]" is not normally possible in
% LaTeX2e). It also provides a command:
% \fnbelowfloat
% to enable the placement of footnotes below bottom floats (the standard
% LaTeX2e kernel puts them above bottom floats). This is an invasive package
% which rewrites many portions of the LaTeX2e float routines. It may not work
% with other packages that modify the LaTeX2e float routines. The latest
% version and documentation can be obtained at:
% http://www.ctan.org/tex-archive/macros/latex/contrib/sttools/
% Documentation is contained in the stfloats.sty comments as well as in the
% presfull.pdf file. Do not use the stfloats baselinefloat ability as IEEE
% does not allow \baselineskip to stretch. Authors submitting work to the
% IEEE should note that IEEE rarely uses double column equations and
% that authors should try to avoid such use. Do not be tempted to use the
% cuted.sty or midfloat.sty packages (also by Sigitas Tolusis) as IEEE does
% not format its papers in such ways.
% correct bad hyphenation here
% \hyphenation{op-tical net-works semi-conduc-tor}
% can use linebreaks \\ within to get better formatting as desired
\title{Porting a `Snake'-Type Image Processing Algorithm to the GPU}
% author names and affiliations
% use a multiple column layout for up to two different
% affiliations
\author{
\IEEEauthorblockN{Gilles Perrot$^1$, St\'{e}phane Domas$^1$, Rapha\"{e}l Couturier$^1$, Nicolas Bertaux$^2$}
\IEEEauthorblockA{$^1$Institut FEMTO-ST\\
Rue Engel Gros, 90000 Belfort, France.\\
prenom.nom@univ-fcomte.fr}
\IEEEauthorblockA{$^2$Institut Fresnel, CNRS, Aix-Marseille Universit\'e, Ecole Centrale Marseille,\\
Campus de Saint-J\'er\^ome, 13013 Marseille, France.\\
nicolas.bertaux@ec-marseille.fr}
}
% use for special paper notices
%\IEEEspecialpapernotice{(Invited Paper)}
% make the title area
\maketitle
\section*{\label{abstract}Abstract}
In image processing, the running time of existing implementations still limits the use of many advanced algorithms.
Meanwhile, today's gains in computing power come mainly from the use of parallel architectures of various kinds.
Among these solutions, GPGPUs (general-purpose graphics processing units) offer a cost-effective, high-performance answer for implementing certain classes of algorithms.
However, their very specific architecture and their interaction with the host processor do not guarantee a simple and efficient port in every case, to the point that it is not unusual to see naive GPU implementations run slower than the reference CPU implementation.
We propose to illustrate these aspects using the example of a region-based active contour (snake) segmentation algorithm.
After briefly recalling the principle of the \textit{snake} under study, we detail the method used to parallelize this algorithm on a GPU, focusing on certain generic and symptomatic aspects. In particular, giving code excerpts wherever possible, we discuss:
\begin{itemize}
\item data representation. It is one of the keys to porting algorithms and is closely tied to the level of parallelism applied.
\item integral image generation. It is a very widely used type of preprocessing.
\item reduction steps, for example sums, sorts, or extremum searches. They are a delicate point because GPUs are clearly not designed to execute these operations optimally. A number of works propose answers, which we will present.
\item optimization of the computation grid. Contrary to what the leading GPU manufacturer recommends, and as Volkov has shown, maximizing occupancy does not always make the best use of the GPU's computing power. From this point of view, recent architectures may represent a step backward.
\end{itemize}
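As an illustration of the integral-image preprocessing named in the list above, here is a minimal sequential sketch in Python. It is not the GPU code discussed in this paper: the function names \texttt{integral\_image} and \texttt{rect\_sum} are hypothetical, and the table is padded with one row and one column of zeros by assumption to simplify the lookup. On a GPU, the row and column accumulations would typically be carried out as parallel prefix sums.

```python
def integral_image(img):
    """Return the summed-area table of a 2D list of numbers.

    sat[y][x] holds the sum of all pixels img[j][i] with j < y and i < x.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    # One extra row and column of zeros simplify the rectangle-sum formula.
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running prefix sum along the current row
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_sum(sat, y0, x0, y1, x1):
    """Sum of the pixels in rows y0..y1-1 and columns x0..x1-1, in four lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    sat = integral_image(image)
    print(rect_sum(sat, 0, 0, 3, 3))  # sum over the whole image
    print(rect_sum(sat, 1, 1, 3, 3))  # sum over the bottom-right 2x2 block
```

Once the table is built, the sum over any axis-aligned rectangle costs four memory reads regardless of its size, which is why this preprocessing is attractive for region-based criteria.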
Each of these points will also be an opportunity to present the specifics and constraints of accessing the GPU's different memory areas (global, shared, texture, registers).
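To make the reduction pattern more concrete, the following Python sketch emulates on the CPU the pairwise halving scheme commonly used for a sum inside one GPU thread block. It is an assumption-laden illustration, not the implementation presented in this paper: the name \texttt{tree\_reduce\_sum} is hypothetical, the list \texttt{buf} merely stands in for a shared-memory buffer, and the inner loop plays the role of the threads that would run concurrently between two synchronization barriers.

```python
def tree_reduce_sum(values):
    """Sum `values` with the pairwise halving pattern of a block-level reduction."""
    buf = list(values)  # stands in for a block's shared-memory buffer
    n = 1
    while n < len(buf):
        n *= 2
    # Pad to a power of two with 0, the neutral element of the sum.
    buf += [0] * (n - len(buf))
    stride = n // 2
    while stride > 0:
        # On a GPU, each value of tid would be handled by one thread,
        # with a barrier before moving on to the next stride.
        for tid in range(stride):
            buf[tid] += buf[tid + stride]
        stride //= 2
    return buf[0]


if __name__ == "__main__":
    print(tree_reduce_sum([3, 1, 4, 1, 5]))
```

The number of active "threads" is halved at every step, so most of them are idle for most of the reduction. This is one way to see why such steps map poorly onto GPUs.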