-\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte}
+\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte, France}
\chapter{Presentation of the GPU architecture and of the Cuda environment}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\chapterauthor{}{}
-\chapterauthor{Lilia Ziane Khodja}{Femto-ST Institute, University of Franche-Comte, France}
-\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte, France}
-\chapterauthor{Jacques Bahi}{Femto-ST Institute, University of Franche-Comte, France}
+\chapterauthor{Lilia Ziane Khodja, Raphaël Couturier and Jacques Bahi}{Femto-ST Institute, University of Franche-Comte, France}
+%\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte, France}
+%\chapterauthor{Jacques Bahi}{Femto-ST Institute, University of Franche-Comte, France}
\chapter[Solving linear systems with GMRES and CG methods on GPU clusters]{Solving sparse linear systems with GMRES and CG methods on GPU clusters}
\label{ch12}
-\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte}
+\chapterauthor{Raphaël Couturier}{Femto-ST Institute, University of Franche-Comte, France}
\chapter{Introduction to Cuda}
\label{chapter2}
\@writefile{toc}{\contentsline {section}{\numberline {3.2}Performance measurements}{28}}
\newlabel{lst:chronos}{{3.4}{28}}
\@writefile{lol}{\contentsline {lstlisting}{\numberline {3.4}Time measurement technique using cutil functions}{28}}
+\@writefile{toc}{\author{Gilles Perrot}{}}
\@writefile{loa}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {chapter}{\numberline {4}Implementing a fast median filter}{31}}
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
-\@writefile{toc}{\author{Gilles Perrot}{}}
\@writefile{toc}{\contentsline {section}{\numberline {4.1}Introduction}{31}}
\@writefile{toc}{\contentsline {section}{\numberline {4.2}Median filtering}{32}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.2.1}Basic principles}{32}}
\setcounter{subparagraph}{0}
\setcounter{figure}{11}
\setcounter{table}{3}
-\setcounter{numauthors}{1}
+\setcounter{numauthors}{0}
\setcounter{parentequation}{0}
\setcounter{subfigure}{0}
\setcounter{lofdepth}{1}
-\chapterauthor{Gilles Perrot}{FEMTO-ST Institute}
+\chapterauthor{Gilles Perrot}{Femto-ST Institute, University of Franche-Comte, France}
%\graphicspath{{img/}}
In order to estimate the potential for improvement of each kernel, a reference throughput measurement, involving identity kernel of Listing \ref{lst:fkern1}, was performed. As this kernel only fetches input values from texture memory and outputs them to global memory without doing any computation, it represents the smallest, thus fastest, possible process and is taken as the reference throughput value (100\%). The same measurement was performed on CPU, with a maximum effective pixel throughput of 130~Mpixel per second. On GPU, depending on grid parameters it amounts to 800~MPixels/s on GTX280 and 1300~Mpixels/s on C2070.
+\chapterauthor{Gilles Perrot}{Femto-ST Institute, University of Franche-Comte, France}
+
\chapter{Implementing a fast median filter}
-\chapterauthor{Gilles Perrot}{FEMTO-ST Institute}
\section{Introduction}
Median filtering is a well-known method used in a wide range of application frameworks as well as a standalone filter especially for \textit{salt and pepper} denoising. It is able to highly reduce power of noise without blurring edges too much.
-\chapterauthor{Gilles Perrot}{FEMTO-ST Institute}
+\chapterauthor{Gilles Perrot}{Femto-ST Institute, University of Franche-Comte, France}
+
+\chapter{Implementing an efficient convolution operation on GPU}
+
%\newcommand{\kl}{\includegraphics[scale=0.6]{Chapters/chapter4/img/kernLeft.png}~}
%\newcommand{\kr}{\includegraphics[scale=0.6]{Chapters/chapter4/img/kernRight.png}}
%% }
-\chapter{Implementing an efficient convolution \index{Convolution} operation on GPU}
+
\section{Overview}
In this chapter, after dealing with GPU median filter implementations,
-we propose to explore how convolutions can be implemented on modern
+we propose to explore how convolutions\index{Convolution} can be implemented on modern
GPUs. Widely used in digital image processing filters, the \emph{convolution
operation} basically consists in taking the sum of products of elements
from two 2-D functions, letting one of the two functions move over
\@starttoc{lof}%
\if@restonecol\twocolumn\fi
}
-\newcommand*\l@figure{\@dottedtocline{1}{1.5em}{2.3em}}
+\newcommand*\l@figure{\@dottedtocline{1}{1.5em}{2.8em}}
\newcommand\listoftables{%
\if@twocolumn
\@restonecoltrue\onecolumn