From: couturie
Date: Sat, 21 Sep 2013 18:48:43 +0000 (+0200)
Subject: new
X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/book_gpu.git/commitdiff_plain/0f26b548f3029a96eafd6667024aabe2b36e464b?ds=inline

new
---

diff --git a/BookGPU/Chapters/chapter11/biblio11.bib b/BookGPU/Chapters/chapter11/biblio11.bib
index 09fe659..784c28d 100644
--- a/BookGPU/Chapters/chapter11/biblio11.bib
+++ b/BookGPU/Chapters/chapter11/biblio11.bib
@@ -8278,9 +8278,9 @@ standard review},
 @misc{Thrust,
  author = {Hoberock, J. and Bell, N.},
- title = {Thrust, http://thrust.github.com/ Last accessed July 31},
+ title = {Thrust, 2012. http://thrust.github.com/ Last accessed {J}uly 31},
  volume = {Last accessed July 31},
- year = {2012}
+
 }
  author = {Hock, J.C. and Stern, A.S.},
diff --git a/BookGPU/Chapters/chapter11/ch11.tex b/BookGPU/Chapters/chapter11/ch11.tex
index 0aa6e8c..736d68a 100644
--- a/BookGPU/Chapters/chapter11/ch11.tex
+++ b/BookGPU/Chapters/chapter11/ch11.tex
@@ -45,7 +45,7 @@ In this work we examine several monotone spline fitting algorithms, and select t
 The rest of the chapter is organized as follows. Section \ref{ch11:splines} discusses monotone spline interpolation methods and presents two parallel algorithms. Section \ref{ch11:smoothing} deals with the smoothing problem. It presents the isotonic regression problem and discusses the Pool Adjacent Violators (PAV) and MLS algorithms. Combined with monotone spline interpolation, the parallel MLS method makes it possible to build a monotone spline approximation to noisy data entirely on GPU. Section \ref{ch11:conc} concludes.
-
+\clearpage
 \section{Monotone splines} \label{ch11:splines}
 \index{constrained splines}
 \index{monotonicity}
diff --git a/BookGPU/Chapters/chapter12/ch12.tex b/BookGPU/Chapters/chapter12/ch12.tex
index 4fe0eb9..5c0a5b2 100755
--- a/BookGPU/Chapters/chapter12/ch12.tex
+++ b/BookGPU/Chapters/chapter12/ch12.tex
@@ -548,8 +548,10 @@ which are the number of rows, the total number of nonzero values, and the maxima
 the present chapter, the bandwidth of a sparse matrix is defined as the number of matrix columns separating
 the first and the last nonzero value on a matrix row.
+
 \begin{table}
 \centering
+\begin{small}
 \begin{tabular}{|c|c|c|c|c|}
 \hline
 {\bf Matrix Type} & {\bf Matrix Name} & {\bf \# Rows} & {\bf \# Nonzeros} & {\bf Bandwidth} \\ \hline \hline
@@ -578,10 +580,12 @@ the first and the last nonzero value on a matrix row.
 & torso3 & $259,156$ & $4,429,042$ & $216,854$ \\ \hline
 \end{tabular}
+\end{small}
 \caption{Main characteristics of sparse matrices chosen from the University of Florida collection.}
 \label{ch12:tab:01}
 \end{table}
+
 \begin{table}[!h]
 \begin{center}
 \begin{tabular}{|c|c|c|c|c|c|c|}
@@ -607,6 +611,7 @@ thermal2 & $1.172s$ & $0.622s$ & $1.88$ & $
 \begin{table}[!h]
 \begin{center}
+\begin{small}
 \begin{tabular}{|c|c|c|c|c|c|c|}
 \hline
 {\bf Matrix} & $\mathbf{Time_{cpu}}$ & $\mathbf{Time_{gpu}}$ & $\mathbf{\tau}$ & $\mathbf{\#~Iter.}$ & $\mathbf{Prec.}$ & $\mathbf{\Delta}$ \\ \hline \hline
@@ -635,6 +640,7 @@ poli\_large & $0.097s$ & $0.095s$ & $1.02$ & $
 torso3 & $4.242s$ & $2.030s$ & $2.09$ & $175$ & $2.69e$-$10$ & $1.78e$-$14$ \\ \hline
 \end{tabular}
+\end{small}
 \caption{Performances of the parallel GMRES method on a cluster 24 CPU cores vs.
 on cluster of 12 GPUs.}
 \label{ch12:tab:03}
 \end{center}
@@ -742,7 +748,7 @@ are better than those of the GMRES method for solving large symmetric linear sys
 CG method is characterized by a better convergence\index{convergence} rate and a shorter execution time of an iteration than those of the GMRES method. Moreover, an iteration of the parallel GMRES method requires more data exchanges between computing nodes compared to the parallel CG method.
-
+\clearpage
 \begin{table}[!h]
 \begin{center}
 \begin{tabular}{|c|c|c|c|c|c|c|}
@@ -769,6 +775,7 @@ on a cluster of 12 GPUs.}
 \begin{table}[!h]
 \begin{center}
+\begin{small}
 \begin{tabular}{|c|c|c|c|c|c|c|}
 \hline
 {\bf Matrix} & $\mathbf{Time_{cpu}}$ & $\mathbf{Time_{gpu}}$ & $\mathbf{\tau}$ & $\mathbf{\#~Iter.}$ & $\mathbf{Prec.}$ & $\mathbf{\Delta}$ \\ \hline \hline
@@ -797,6 +804,7 @@ poli\_large & $8.515s$ & $1.053s$ & $8.09$
 torso3 & $31.463s$ & $3.681s$ & $8.55$ & $175$ & $2.69e$-$10$ & $2.66e$-$14$ \\ \hline
 \end{tabular}
+\end{small}
 \caption{Performances of the parallel GMRES method for solving linear systems associated to sparse banded matrices on a cluster of 24 CPU cores vs. on a cluster of 12 GPUs.}
 \label{ch12:tab:06}
diff --git a/BookGPU/Chapters/chapter7/ch7.tex b/BookGPU/Chapters/chapter7/ch7.tex
index 6fd9c62..e084f01 100644
--- a/BookGPU/Chapters/chapter7/ch7.tex
+++ b/BookGPU/Chapters/chapter7/ch7.tex
@@ -789,7 +789,8 @@ Last, we demonstrate using a classical benchmark for propagation of nonlinear wa
 %
 A harmonic analysis of the wave spectrum at the shoal center line is computed and plotted in Figure \ref{ch7:whalinresults} for comparison with the analogous results obtained from the experiments data. The three harmonic amplitudes are computed via a Fast Fourier Transform (FFT) method using the last three wave periods up to $t=50\,$s. There is a satisfactory agreement between the computed and experimental results and no noticeable loss in accuracy resulting from the use of single-precision math.
-%
+
+\pagebreak
 \begin{figure}[!htb]
 \setlength\figureheight{0.3\textwidth}
 \setlength\figurewidth{0.32\textwidth}
@@ -930,5 +931,5 @@ We anticipate that a tool based on the proposed parallel solution strategies wil
 \section{Acknowledgments}
 This work was supported by grant no. 09-070032 from the Danish Research Council for Technology and Production Sciences. A special thank goes to Professor Jan S. Hesthaven for supporting parts of this work. Scalability and performance tests was done in the GPUlab at DTU Informatics, Technical University of Denmark and using the GPU-cluster at Center for Computing and Visualization, Brown University, USA. NVIDIA Corporation is acknowledged for generous hardware donations to facilities of the GPUlab.
-
+\clearpage
 \putbib[Chapters/chapter7/biblio7]