X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/hpcc2014.git/blobdiff_plain/7314cfe257c8b75f34a34995a4a2075edc1d3888..d44333276c7703cc17f0052a7df0a4b25ba1ded4:/hpcc.tex?ds=inline

diff --git a/hpcc.tex b/hpcc.tex
index 865fb94..67609d7 100644
--- a/hpcc.tex
+++ b/hpcc.tex
@@ -448,13 +448,13 @@ and with the addition of the primitive MPI\_Test was needed to avoid a memory fa
 \CER{In fact, we wanted to show how simple it is to adapt the algorithm to SimGrid. The problems described in this paragraph mainly concern the async mode.}\LZK{OK. I would have preferred a bit more detail on the adaptation of the async version.}
 \CER{The major problem in adapting the asynchronous part of the algorithm from MPI to SMPI was that Waitall crashed in SMPI after an Isend and an Irecv. I had proposed a workaround using a separate MPI\_Wait for each exchange instead of a single Waitall for ALL the exchanges (a call that seems to work fine in MPI). This workaround also works well. But afterwards, you modified the program by adding an MPI\_Test in the convergence detection routine, and as a result the global exchange with Waitall worked too.}
 Note that using the SMPI functions that optimize memory footprint and CPU usage is not recommended when one wants to obtain realistic results by simulation.
-As mentioned, upon this adaptation, the algorithm is executed as in the real life in the simulated environment after the following minor changes. First, all declared 
-global variables have been moved to local variables for each subroutine. In fact, global variables generate side effects arising from the concurrent access of 
-shared memory used by threads simulating each computing unit in the SimGrid architecture. Second, the alignment of certain types of variables such as ``long int'' had
-also to be reviewed.
+As mentioned, after this adaptation, the algorithm runs in the simulated environment as it would in real life, subject to the following minor changes. First, the scope of all declared 
+global variables has been changed so that they are local to each subroutine. Indeed, global variables generate side effects arising from concurrent access to the 
+shared memory used by the threads simulating each computing unit in the SimGrid architecture. 
+%Second, the alignment of certain types of variables such as ``long int'' had also to be reviewed.
 \AG{Regarding these alignment problems, say more about them if it is of interest, or remove it.}
-\CER{This problem is one of the changes I had to make when adapting the program from MPI to SMPI. It stems from the difference in memory word size: on 32 bits, for variables declared as long int, the \%lu format is kept in output statements (printf, sprintf, ...), whereas on 64 bits it is replaced by \%llu.} 
- Finally, some compilation errors on MPI\_Waitall and MPI\_Finalize primitives have been fixed with the latest version of SimGrid.
+\CER{This problem is one of the changes I had to make when adapting the program from MPI to SMPI. It stems from the difference in memory word size: on 32 bits, for variables declared as long int, the \%lu format is kept in output statements (printf, sprintf, ...), whereas on 64 bits it is replaced by \%llu. The sentence has been removed.} 
+Second, some compilation errors on MPI\_Waitall and MPI\_Finalize primitives have been fixed with the latest version of SimGrid.
 In total, after a very simple adaptation, the initial MPI program running in the SMPI simulation environment gave the same results as those obtained in a real 
 environment. After a few modifications, we successfully executed the code in synchronous mode using the parallel GMRES algorithm, and compared it with our multisplitting algorithm in asynchronous mode. 
 
@@ -512,51 +512,51 @@ $\text{62}^\text{3} = \text{\np{238328}}$ to $\text{150}^\text{3} =
   \caption{2 clusters, each with 50 nodes}
   \label{tab.cluster.2x50}
 
-  \begin{mytable}{6}
+  \begin{mytable}{5}
     \hline
-    bandwidth (Mbits/s)
-    & 5         & 5         & 5         & 5         & 5         & 50 \\
+    bandwidth (Mbit/s)
+    & 5         & 5         & 5         & 5         & 5         \\
     \hline
     latency (ms)
-    & 0.02      & 0.02      & 0.02      & 0.02      & 0.02      & 0.02 \\
+    & 0.02      & 0.02      & 0.02      & 0.02      & 0.02      \\
     \hline
     power (GFlops)
-    & 1         & 1         & 1         & 1.5       & 1.5       & 1.5 \\
+    & 1         & 1         & 1         & 1.5       & 1.5       \\
     \hline
     size
-    & 62        & 62        & 62        & 100       & 100       & 110 \\
+    & 62        & 62        & 62        & 100       & 100       \\
     \hline
     Precision
-    & \np{E-5}   & \np{E-8}  & \np{E-9}  & \np{E-11} & \np{E-11} & \np{E-11} \\
+    & \np{E-5}  & \np{E-8}  & \np{E-9}  & \np{E-11} & \np{E-11} \\
     \hline
     \hline
     Relative gain
-    & 2.52     & 2.55     & 2.52     & 2.57     & 2.54     & 2.53 \\
+    & 2.52      & 2.55      & 2.52      & 2.57      & 2.54      \\
     \hline
   \end{mytable}
 
   \bigskip
 
-  \begin{mytable}{6}
+  \begin{mytable}{5}
     \hline
-    bandwidth (Mbits/s)
-    & 50        & 50        & 50        & 50 \\ %       & 10        & 10 \\
+    bandwidth (Mbit/s)
+    & 50        & 50        & 50        & 50        & 50 \\ %       & 10        & 10 \\
     \hline
     latency (ms)
-    & 0.02      & 0.02      & 0.02      & 0.02 \\ %      & 0.03      & 0.01 \\
+    & 0.02      & 0.02      & 0.02      & 0.02      & 0.02 \\ %      & 0.03      & 0.01 \\
     \hline
     Power (GFlops)
-    & 1.5       & 1.5       & 1.5       & 1.5 \\ %      & 1         & 1.5 \\
+    & 1.5       & 1.5       & 1.5       & 1.5       & 1.5 \\ %      & 1         & 1.5 \\
     \hline
     size
-    & 120       & 130       & 140       & 150  \\ %     & 171       & 171 \\
+    & 110       & 120       & 130       & 140       & 150  \\ %     & 171       & 171 \\
     \hline
     Precision
-    & \np{E-11} & \np{E-11} & \np{E-11} & \np{E-11} \\ % & \np{E-5}  & \np{E-5} \\
+    & \np{E-11} & \np{E-11} & \np{E-11} & \np{E-11} & \np{E-11} \\ % & \np{E-5}  & \np{E-5} \\
     \hline
     \hline
     Relative gain
-    & 2.51     & 2.58     & 2.55     & 2.54   \\ %  & 1.59      & 1.29 \\
+    & 2.53      & 2.51     & 2.58     & 2.55     & 2.54   \\ %  & 1.59      & 1.29 \\
     \hline
   \end{mytable}
 \end{table}
@@ -637,9 +637,9 @@ Note that the program was run with the following parameters:
 
 	- Processor unit power : 1.5 GFlops;
 
-	- Intracluster network : bandwidth = 1,25 Gbits/s and latency = 5E-05 ms;
+	- Intracluster network : bandwidth = 1,25 Gbits/s and latency = \np{E-5} ms;
 
-	- Intercluster network : bandwidth = 5 Mbits/s and latency = 5E-03 ms;
+	- Intercluster network : bandwidth = 5 Mbits/s and latency = 5.\np{E-3} ms;
 \end{itemize}