
@misc{Thrust,
author = {Hoberock, J. and Bell, N.},
- title = {Thrust, http://thrust.github.com/ Last accessed July 31},
+ title = {Thrust, 2012. http://thrust.github.com/ Last accessed {J}uly 31},
volume = {Last accessed July 31},
- year = {2012}
+
}

author = {Hock, J.C. and Stern, A.S.},\r
The rest of the chapter is organized as follows. Section \ref{ch11:splines} discusses monotone spline interpolation methods and presents two parallel algorithms. Section \ref{ch11:smoothing} deals with the smoothing problem. It presents the isotonic regression problem and discusses the Pool Adjacent Violators (PAV) and MLS algorithms. Combined with monotone spline interpolation, the parallel MLS method makes it possible to build a monotone spline approximation to noisy data entirely on the GPU. Section \ref{ch11:conc} concludes.
-
+\clearpage
\section{Monotone splines} \label{ch11:splines}
\index{constrained splines} \index{monotonicity}
the present chapter, the bandwidth of a sparse matrix is defined as the number of matrix columns separating
the first and the last nonzero value on a matrix row.
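As a hedged illustration of this definition, the bandwidth can be written as follows, where $a_{ij}$ denotes the entries of an $n \times n$ sparse matrix $A$. The symbols $b_i$ and $b(A)$, and the choice of the maximum over all rows as the single value reported per matrix, are assumptions introduced here and are not stated in the text:
\[
b_i = \max\{j : a_{ij} \neq 0\} - \min\{j : a_{ij} \neq 0\}, \qquad b(A) = \max_{1 \leq i \leq n} b_i .
\]
For example, a row whose first and last nonzero values lie in columns $3$ and $10$ has $b_i = 7$ under this convention.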
+
\begin{table}
\centering
+\begin{small}
\begin{tabular}{|c|c|c|c|c|}
\hline
{\bf Matrix Type} & {\bf Matrix Name} & {\bf \# Rows} & {\bf \# Nonzeros} & {\bf Bandwidth} \\ \hline \hline
& torso3 & $259,156$ & $4,429,042$ & $216,854$ \\ \hline
\end{tabular}
+\end{small}
\caption{Main characteristics of sparse matrices chosen from the University of Florida collection.}
\label{ch12:tab:01}
\end{table}
+
\begin{table}[!h]
\begin{center}
\begin{tabular}{|c|c|c|c|c|c|c|}
\begin{table}[!h]
\begin{center}
+\begin{small}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
{\bf Matrix} & $\mathbf{Time_{cpu}}$ & $\mathbf{Time_{gpu}}$ & $\mathbf{\tau}$ & $\mathbf{\#~Iter.}$ & $\mathbf{Prec.}$ & $\mathbf{\Delta}$ \\ \hline \hline
torso3 & $4.242s$ & $2.030s$ & $2.09$ & $175$ & $2.69e$-$10$ & $1.78e$-$14$ \\ \hline
\end{tabular}
+\end{small}
\caption{Performance of the parallel GMRES method on a cluster of 24 CPU cores vs. on a cluster of 12 GPUs.}
\label{ch12:tab:03}
\end{center}
CG method is characterized by a better convergence\index{convergence} rate and a shorter execution
time of an iteration than those of the GMRES method. Moreover, an iteration of the parallel GMRES
method requires more data exchanges between computing nodes compared to the parallel CG method.
-
+\clearpage
\begin{table}[!h]
\begin{center}
\begin{tabular}{|c|c|c|c|c|c|c|}
\begin{table}[!h]
\begin{center}
+\begin{small}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
{\bf Matrix} & $\mathbf{Time_{cpu}}$ & $\mathbf{Time_{gpu}}$ & $\mathbf{\tau}$ & $\mathbf{\#~Iter.}$ & $\mathbf{Prec.}$ & $\mathbf{\Delta}$ \\ \hline \hline
torso3 & $31.463s$ & $3.681s$ & $8.55$ & $175$ & $2.69e$-$10$ & $2.66e$-$14$ \\ \hline
\end{tabular}
+\end{small}
\caption{Performance of the parallel GMRES method for solving linear systems associated with sparse banded matrices on a cluster of 24 CPU cores vs.
on a cluster of 12 GPUs.}
\label{ch12:tab:06}
%
A harmonic analysis of the wave spectrum at the shoal center line is computed and plotted in Figure \ref{ch7:whalinresults} for comparison with the analogous results obtained from the experimental data. The three harmonic amplitudes are computed via a Fast Fourier Transform (FFT) method using the last three wave periods up to $t=50\,$s. There is satisfactory agreement between the computed and experimental results, and no noticeable loss in accuracy results from the use of single-precision math.
-%
+
+\pagebreak
\begin{figure}[!htb]
\setlength\figureheight{0.3\textwidth}
\setlength\figurewidth{0.32\textwidth}
\section{Acknowledgments}
This work was supported by grant no. 09-070032 from the Danish Research Council for Technology and Production Sciences. Special thanks go to Professor Jan S. Hesthaven for supporting parts of this work. Scalability and performance tests were done in the GPUlab at DTU Informatics, Technical University of Denmark, and using the GPU cluster at the Center for Computing and Visualization, Brown University, USA. NVIDIA Corporation is acknowledged for generous hardware donations to facilities of the GPUlab.
-
+\clearpage
\putbib[Chapters/chapter7/biblio7]