12-02-2014 2

author lilia <lilia@amazigh.bordeaux.inria.fr>

Wed, 12 Feb 2014 12:14:29 +0000 (13:14 +0100)

committer lilia <lilia@amazigh.bordeaux.inria.fr>

Wed, 12 Feb 2014 12:14:29 +0000 (13:14 +0100)
diff --git a/GMRES_Journal.tex b/GMRES_Journal.tex

index 37acbcc10e3c23b5c7870d76c8c297cfa45b2eeb..e7437781ef1dcc435b25885b28cd5c7eca0a85d9 100644 (file)
--- a/GMRES_Journal.tex
+++ b/GMRES_Journal.tex
@@ -1,5 +1,4 @@
  \documentclass[11pt]{article}
-%\documentclass{acmconf}
  \usepackage{multicol}
  
  \usepackage[paper=a4paper,dvips,top=1.5cm,left=1.5cm,right=1.5cm,foot=1cm,bottom=1.5cm]{geometry}
@@ -19,7 +18,6 @@
  \usepackage{url}
  \usepackage{mdwlist}
  \usepackage{multirow}
-%\usepackage{color}
  
  \date{}
  
@@ -855,7 +853,7 @@ torso3                  & 183 863 292      & 25 682 514       & 613 250
  
  
  
-Hereafter, we show the influence of the communications on a GPU cluster compared to a CPU cluster. In Tables~\ref{tab:10},~\ref{tab:11} and~\ref{tab:12}, we compute the ratios between the computation time over the communication time of three versions of the parallel GMRES algorithm to solve sparse linear systems associated to matrices of Table~\ref{tab:06}. These tables show that the hypergraph partitioning and the compressed format of the vectors increase the ratios either on the GPU cluster or on the CPU cluster. That means that the two optimization techniques allow the minimization of the total communication volume between the computing nodes. However, we can notice that the ratios obtained on the GPU cluster are lower than those obtained on the CPU cluster. Indeed, GPUs compute faster than CPUs but with GPUs there are more communications due to CPU/GPU communications, so communications are more time-consuming while the computation time remains unchanged.
+Hereafter, we show the influence of the communications on a GPU cluster compared to a CPU cluster. In Tables~\ref{tab:10},~\ref{tab:11} and~\ref{tab:12}, we compute the ratios between the computation time over the communication time of three versions of the parallel GMRES algorithm to solve sparse linear systems associated to matrices of Table~\ref{tab:06}. These tables show that the hypergraph partitioning and the compressed format of the vectors increase the ratios either on the GPU cluster or on the CPU cluster. That means that the two optimization techniques allow the minimization of the total communication volume between the computing nodes. However, we can notice that the ratios obtained on the GPU cluster are lower than those obtained on the CPU cluster. Indeed, GPUs compute faster than CPUs but with GPUs there are more communications due to CPU/GPU communications, so communications are more time-consuming while the computation time remains unchanged. Furthermore, we can notice that the GPU computation times on Tables~\ref{tab:11} and~\ref{tab:12} are about 10\% lower than those on Table~\ref{tab:10}. Indeed, the compression of the vectors and the reordering of matrix columns allow to perform coalesced accesses to the GPU memory and thus accelerate the sparse matrix-vector multiplication.  
  
  \begin{table}
  \begin{center}