X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/book_gpu.git/blobdiff_plain/16b4107f3a70f199b9c04d0346050f4f52b5114b..cf0ee588a0ef05623f955938e81be1aa96c33f46:/BookGPU/Chapters/chapter11/ch11.tex diff --git a/BookGPU/Chapters/chapter11/ch11.tex b/BookGPU/Chapters/chapter11/ch11.tex index 57b3328..736d68a 100644 --- a/BookGPU/Chapters/chapter11/ch11.tex +++ b/BookGPU/Chapters/chapter11/ch11.tex @@ -45,7 +45,7 @@ In this work we examine several monotone spline fitting algorithms, and select t The rest of the chapter is organized as follows. Section \ref{ch11:splines} discusses monotone spline interpolation methods and presents two parallel algorithms. Section \ref{ch11:smoothing} deals with the smoothing problem. It presents the isotonic regression problem and discusses the Pool Adjacent Violators (PAV) and MLS algorithms. Combined with monotone spline interpolation, the parallel MLS method makes it possible to build a monotone spline approximation to noisy data entirely on GPU. Section \ref{ch11:conc} concludes. - +\clearpage \section{Monotone splines} \label{ch11:splines} \index{constrained splines} \index{monotonicity} @@ -116,8 +116,8 @@ It is almost straightforward to parallelize this scheme for GPUs, by processing At the spline evaluation stage we need to compute $s(z_k)$ for a sequence of query values ${z_k}, k=1,\ldots,K$. For each $z_k$ we locate the interval $[t_i,t_{i+1}]$ containing $z_k$, using the bisection algorithm presented in Listing \ref{ch11:algeval}, and then apply the appropriate coefficients of the quadratic function. This is also done in parallel. The bisection algorithm could be implemented using texture memory (to cache the array \texttt{z}), but this is not shown in Listing \ref{ch11:algeval}. -%\pagebreak -\lstinputlisting[label=ch11:algcoef,caption=Implementation of the kernel for calculating spline knots and coefficients; function fmax is used to avoid division by zero for data with coinciding abscissae.]{Chapters/chapter11/code1.cu} +\pagebreak +\lstinputlisting[label=ch11:algcoef,caption=implementation of the kernel for calculating spline knots and coefficients; function fmax is used to avoid division by zero for data with coinciding abscissae.]{Chapters/chapter11/code1.cu} %% \begin{figure}[!hp] @@ -346,7 +346,7 @@ A popular PAV algorithm (PAVA) is one method that provides efficient numerical s Various serial implementations of the PAVA exist. It is noted \cite{Kearsley_2006} that in PAVA, which is based on the ideas from convex analysis, a decomposition theorem holds, namely, performing PAVA separately on two contiguous subsets of data and then performing PAVA on the result produces isotonic regression on the whole data set. Thus, isotonic regression is parallelizable, and the divide-and-conquer approach, decomposing the original problem into two smaller subproblems, can be implemented on multiple processors. However, to our knowledge, no parallel PAVA for many-core systems such as GPUs exist. -\index{MLS algorithm} \index{Minimum Lower Sets} +\index{MLS (minimum lower sets) algorithm} Another approach to isotonic regression is called the MLS algorithm \cite{Best1990, Robertson_book}. It provides the same solution as the PAVA, but works differently. For each datum (or block), MLS selects the largest contiguous block of subsequent data with the smallest average. If this average is smaller than that of the preceding block, the blocks are merged, and the data in the block are replaced with their average. MLS is also an active set method \cite{Best1990}, but its complexity is $O(n^2)$ as opposed to $O(n)$ of the PAVA, and of another active set algorithm proposed in \cite{Best1990} by the name of Algorithm A.