The initialization values of the vector solution
of the methods are given in %Section~\ref{sec:vec_initialization}.
-\subsection{Test with (CUDA OpenMP) approach}
+\subsection{Test with Multi-GPU (CUDA OpenMP) approach}
-In this part we performed a set of experiments with (CUDA OpenMP) approach on full and sparse polynomials of different degrees.
+In this part we performed a set of experiments with the Multi-GPU (CUDA OpenMP) approach on full and sparse polynomials of different degrees, and compared it with the single-GPU (CUDA) approach.
\subsubsection{Execution times in seconds of the Ehrlich-Aberth method for solving sparse polynomials on GPUs using shared memory paradigm with OpenMP}
- In this test we report the execution time of the EA algorithm, on single GPU and Multi-GPU with (2,3,4) GPUs, for different sparse polynomial degrees ranging from 100,000 to 1,400,000
-\begin{figure}[htbp]
+ In this experiment we report the execution time of the EA algorithm on a single GPU and on multiple GPUs (2, 3, and 4 GPUs), for sparse polynomials of degrees ranging from 100,000 to 1,400,000.
+\begin{figure}[htbp]
\centering
\includegraphics[angle=-90,width=0.5\textwidth]{Sparse_omp}
\caption{Execution times in seconds of the Ehrlich-Aberth method for solving sparse polynomials on GPUs using shared memory paradigm with OpenMP}
\label{fig:01}
\end{figure}
-in this figure~\ref{fig:01} shows that (CUDA OpenMP) Multi-GPU approach reduce the execution time up to the scale 100 whereas single GPU is of scale 1000 for polynomial who exceed 1,000,000. It shows the advantage to use OpenMP parallel paradigm to connect the performances of several GPUs and solve a high polynomial of degrees.
+
+Figure~\ref{fig:01} shows that the Multi-GPU (CUDA OpenMP) approach reduces the execution time to the scale of 100 seconds for polynomials whose degree exceeds 1,000,000, whereas the single GPU is at the scale of 1000 seconds. This shows the advantage of using the OpenMP parallel paradigm to combine the performance of several GPUs and solve polynomials of high degree.
\subsubsection{Execution times in seconds of the Ehrlich-Aberth method for solving full polynomials on GPUs using shared memory paradigm with OpenMP}
+This experiment shows the execution time of the EA algorithm on a single GPU (CUDA) and with the Multi-GPU (CUDA OpenMP) approach, for full polynomials of degrees ranging from 100,000 to 1,400,000.
\begin{figure}[htbp]
\centering
\label{fig:03}
\end{figure}
+The second test, with full polynomials, shows a very significant time saving: for a polynomial of degree 1,400,000, the (CUDA OpenMP) approach with 4 GPUs solves it 4 times as fast as a single GPU. We notice that the curves lie one below the other: the more GPUs are used, the lower the execution time.
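+
+As a rough reading of this result, assuming the usual definitions of speedup and parallel efficiency (the symbols $T_1$, $T_p$, $S_p$ and $E_p$ are introduced here only for illustration and are not quantities reported in the measurements), a run that is 4 times as fast on $p = 4$ GPUs corresponds to
+\[
+S_p = \frac{T_1}{T_p} \approx 4, \qquad E_p = \frac{S_p}{p} \approx \frac{4}{4} = 1,
+\]
+that is, a speedup close to linear for the full polynomial of degree 1,400,000.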
+
+\subsection{Test with Multi-GPU (CUDA MPI) approach}
+In this part we perform a set of experiments to compare the Multi-GPU (CUDA MPI) approach with a single GPU, for solving full and sparse polynomials of degrees ranging from 100,000 to 1,400,000.
+
+\subsubsection{Execution times in seconds of the Ehrlich-Aberth method for solving sparse polynomials on GPUs using distributed memory paradigm with MPI}
\begin{figure}[htbp]
\centering
\caption{Execution times in seconds of the Ehrlich-Aberth method for solving sparse polynomials on GPUs using distributed memory paradigm with MPI}
\label{fig:02}
\end{figure}
-
+~\\
+Figure~\ref{fig:02} shows four execution-time curves for the EA algorithm: one for a single GPU and three for the Multi-GPU configurations with 2, 3, and 4 GPUs. The single-GPU curve clearly lies above the others, which shows that it takes more execution time than the Multi-GPU approach. The Multi-GPU (CUDA MPI) approach reduces the execution time to the scale of 100 seconds for polynomials of degree greater than 1,000,000, whereas the single GPU is at the scale of 1000 seconds.
+\\
+\subsubsection{Execution times in seconds of the Ehrlich-Aberth method for solving full polynomials on GPUs using distributed memory paradigm with MPI}
\begin{figure}[htbp]
\centering