X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/book_gpu.git/blobdiff_plain/17bff40b83bcdcc39769f9e59c70ffae1c525b72..2ce2baf7820f44ab044b4df98722576116551e57:/BookGPU/Chapters/chapter16/gpu.tex?ds=sidebyside
diff --git a/BookGPU/Chapters/chapter16/gpu.tex b/BookGPU/Chapters/chapter16/gpu.tex
index 26bd536..4d4d6ef 100644
--- a/BookGPU/Chapters/chapter16/gpu.tex
+++ b/BookGPU/Chapters/chapter16/gpu.tex
@@ -5,7 +5,7 @@
 In this section, we explain how to efficiently use
 matrix-free GMRES to solve the Newton update problems with implicit
 sensitivity calculation, i.e., the steps enclosed by the double dashed block
-in Fig.~\ref{fig:ef_flow}.
+in Figure~\ref{fig:ef_flow}.
 Then implementation issues of GPU acceleration will be discussed in detail.
 Finally, the Gear-2 integration is briefly introduced.
@@ -78,7 +78,7 @@ a preset tolerance~\cite{Golub:Book'96}.
 %% \end{algorithm}
 \begin{algorithm}
-\caption{Standard GMRES\index{iterative method!GMRES} algorithm.} \label{alg:GMRES}
+\caption{standard GMRES\index{iterative method!GMRES} algorithm} \label{alg:GMRES}
 \KwIn{ $ A \in \mathbb{R}^{N \times N}$, $b \in \mathbb{R}^N$, and initial guess $x_0 \in \mathbb{R}^N$}
 \KwOut{ $x \in \mathbb{R}^N$: $\| b - A x\|_2 < tol$}
@@ -160,7 +160,7 @@ period in order to solve a Newton update.
 At each time step, SPICE\index{SPICE} has to linearize device models,
 stamp matrix elements into MNA (short for modified nodal analysis\index{modified nodal analysis, or MNA}) matrices,
-and solve circuit equations in its inner Newton iteration\index{Newton iteration}.
+and solve circuit equations in its inner Newton iteration\index{iterative method!Newton iteration}.
 When convergence is attained, circuit states are saved and then next time step begins.
 This is also the time when we store the needed matrices
@@ -225,7 +225,7 @@ Hence, in consideration of the serial nature of the trianularization,
 the small size of Hessenberg matrix,
 and the frequent inspection of values by host,
 it is preferable to allocate $\tilde{H}$ in CPU (host) memory.
-As shown in Fig.~\ref{fig:gmres}, the memory copy from device to host
+As shown in Figure~\ref{fig:gmres}, the memory copy from device to host
 is called each time when Arnoldi iteration generates a new vector
 and the orthogonalization produces the vector $h$.