----------- REVIEW -----------
-The contribution of the paper could be better described.
-
-The authors state that:
-
-"we show that SimGrid is an efficient simulation
-tool that has enabled .."
-
-If this is one of the goals of the paper to present the
-capabilities/strength of SimGrid, then they should compare it with
-other tools for the comparison of the two methods.
-
-Regarding the comparison of the two methods, the possible scalability
-expected in the case of larger platforms might also be commended /
-discussed.
+,----
+| The contribution of the paper could be better described.
+|
+| The authors state that:
+|
+| "we show that SimGrid is an efficient simulation
+| tool that has enabled .."
+|
+| If this is one of the goals of the paper to present the
+| capabilities/strength of SimGrid, then they should compare it with
+| other tools for the comparison of the two methods.
+`----
+
+[RCE] The goal of the paper is not to compare simulation tools or to
+ reach a conclusion about the performance of SimGrid. SimGrid
+ was chosen among other tools to carry out the comparison
+ between the 2 algorithms in async mode on a distributed grid
+ environment.
+
+ We can rephrase the sentence as follows: "we show that SimGrid is
+ one of the efficient simulation tools that has enabled .."
+
+
+,----
+| Regarding the comparison of the two methods, the possible scalability
+| expected in the case of larger platforms might also be commended /
+| discussed.
+`----
+
+[RCE] I think this possible scaling to larger platforms has been
+ commented on / discussed throughout the paper. Even in the
+ conclusion, we state that the goal is to run the program
+ successfully on a larger platform (in terms of number of cores
+ and number of clusters), and also to be able to solve larger
+ problems.
----------------------- REVIEW 2 ---------------------
----------- REVIEW -----------
-This paper describes the simulation of an adapted (authors say
-slightly changed) GMRES solver on the SimGrid simulation framework;
-the GMRES solver is changed from synchronous iterative solution to a
-asynchronous iteration scheme in order to overcome latencies when
-interconnecting computers in a Grid environment.
-
-The prejudice of the paper is that the GMRES algorithm is not using
-non-blocking communication to begin with.
-
-
-You mention that for running with SimGrid using SMPI, "little" or no
-modification need to be done to the original code: what kind of
-modifications are necessary -- and did You have to apply any
-modification to run with SMPI? (in a later section of the paper,
-changing / deleting global variables were mentioned -- due to the
-threaded execution of simulated MPI processes...)
-
-
-SimGrid uses a "fluid model" -- what does that mean?
-
-The local convergence criterion (k<=MaxIter) seams wrong and should
-rather read: k == MaxIter?
-
-As far as the reviewer can tell, SMPI removes heavy computation by
-making assumptions on the CPU performance of the simulated code --
-which however is not true with most Grid environments where You do
-have mixed architectures and mixed performance characteristics. How
-is this handled?
-
-However, the main gripe about this paper is the rather unrealistic
-assumption on bandwidth (5 Mbps!) and latency (20ms): the internal
-network of a cluster may be Infiniband, with bw of Gigabytes/sec and
-micro-second latency, while a second cluster may be reachable over
-Gigabit-Ethernet with 100-200x the latency... This would be a setup,
-where a (even slight) gain would provide more convincing results.
-
-
-
-Some knitpicks include:
-- Abstract: "Behaviours", please no plural
-- Sec II (and others): "As exposed" --> "As described"
-- Sec II: "And important idle times" --> better "useless idle times
- used for synchronization"
-- Sec III: "by the mean of an XML file" --> "by means of an XML file".
-- SEC IV.B: did not encouter ... unless some code debugging" -->
- please rewrite the unless part...
-- SEC V: "Hosts processors power" --> "Host processor power"
+,----
+| This paper describes the simulation of an adapted (authors say
+| slightly changed) GMRES solver on the SimGrid simulation framework;
+| the GMRES solver is changed from synchronous iterative solution to a
+| asynchronous iteration scheme in order to overcome latencies when
+| interconnecting computers in a Grid environment.
+`----
+
+[RCE] No, that is not quite right: we want to compare the GMRES
+ algorithm, which is executed in SYNC mode, with the
+ multisplitting algorithm, which is executed in ASYNC mode.
+
+[LZK] Not only the comparison!
+ Through the simulation on SimGrid (and the comparison of the two
+ algorithms), we showed that our method is better suited to distributed
+ grids. In a way, we did modify the GMRES algorithm to adapt it to
+ distant clusters. We used asynchronous iterations to overlap
+ communication with computation, and multisplitting to reduce the total
+ communication volume. In any case, asynchronous iterations cannot be
+ applied to GMRES without multisplitting. These two techniques can of
+ course be used with other numerical solution methods.
+
+,----
+| The prejudice of the paper is that the GMRES algorithm is not using
+| non-blocking communication to begin with.
+`----
+
+[RCE] As said just above, GMRES did indeed remain SYNC, and therefore
+ uses blocking communication.
+
+
+,----
+| You mention that for running with SimGrid using SMPI, "little" or no
+| modification need to be done to the original code: what kind of
+| modifications are necessary -- and did You have to apply any
+| modification to run with SMPI? (in a later section of the paper,
+| changing / deleting global variables were mentioned -- due to the
+| threaded execution of simulated MPI processes...)
+`----
+
+[RCE] The "minor" changes made to the code to run it in SimGrid/SMPI,
+ compared with a run on a real (MPI) environment, come down to
+ the following two points:
+ - All global variables were moved into function-local scope.
+ This change required modifying the function signatures so
+ that these variables are passed as arguments.
+ - The MPI_Isend, MPI_Irecv and MPI_Waitall sequence also caused
+ a problem in Async mode. It was replaced by a sequence of
+ 6 Isend/Irecv/Wait calls.
+
+ We can therefore add a cross-reference in Section III to clarify:
+ "The SMPI interface implements about 80% of the MPI 2.0
+ standard [?] and supports applications written in C or
+ Fortran, with little or no modifications."
+ We will write:
+ "The SMPI interface implements about 80% of the MPI 2.0
+ standard [?] and supports applications written in C or
+ Fortran, with little or no modifications. (cf. Section IV,
+ paragraph B)"
+
+
+,----
+| SimGrid uses a "fluid model" -- what does that mean?
+`----
+
+[RCE] Can Arnaud help here?
+ [AG] I will take care of it.
+
+,----
+| The local convergence criterion (k<=MaxIter) seams wrong and should
+| rather read: k == MaxIter?
+`----
+
+[RCE] I think the reviewer is right. Lilia?
+
+,----
+| As far as the reviewer can tell, SMPI removes heavy computation by
+| making assumptions on the CPU performance of the simulated code --
+| which however is not true with most Grid environments where You do
+| have mixed architectures and mixed performance characteristics. How
+| is this handled?
+`----
+
+[RCE] SimGrid/SMPI accounts for this heterogeneity of the cluster
+ components in a grid through, on the one hand, a more or less
+ fine-grained definition of the characteristics of the nodes
+ making up the clusters (CPU power, RAM, ...) and, on the other
+ hand, a more or less detailed description of the communication
+ network between the clusters of the grid.
+
+,----
+| However, the main gripe about this paper is the rather unrealistic
+| assumption on bandwidth (5 Mbps!) and latency (20ms): the internal
+| network of a cluster may be Infiniband, with bw of Gigabytes/sec and
+| micro-second latency, while a second cluster may be reachable over
+| Gigabit-Ethernet with 100-200x the latency... This would be a setup,
+| where a (even slight) gain would provide more convincing results.
+`----
+
+[RCE] We need to state that these "unrealistic" network
+ characteristics concern the INTER-cluster network. The
+ INTRA-cluster network is indeed in the order of magnitude
+ given (Gbps of bandwidth and ms of latency). However, the
+ reviewer rightly noticed that we pushed the inter-cluster
+ network too hard ☺ But only at that price did we start to
+ obtain an appreciable gain.
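
To make the inter- versus intra-cluster distinction concrete, here is a rough sketch assuming the simplest possible cost model, latency plus size over bandwidth. This is not SimGrid's actual network model. The inter-cluster figures (20 ms, 5 Mbps) are those quoted by the reviewer; the message size and the intra-cluster figures (1 ms, 1 Gbps) are hypothetical values picked only for illustration.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Time to send one message: latency + size / bandwidth (simplified model)."""
    return latency_s + (8 * size_bytes) / bandwidth_bps


MESSAGE_BYTES = 100_000  # hypothetical message size, not taken from the paper

# Inter-cluster link: values quoted in the review (20 ms, 5 Mbps).
inter = transfer_time(MESSAGE_BYTES, latency_s=20e-3, bandwidth_bps=5e6)

# Intra-cluster link: assumed order of magnitude (1 ms, 1 Gbps).
intra = transfer_time(MESSAGE_BYTES, latency_s=1e-3, bandwidth_bps=1e9)

print(f"inter-cluster: {inter * 1e3:.1f} ms")
print(f"intra-cluster: {intra * 1e3:.2f} ms")
print(f"ratio: {inter / intra:.0f}x")
```

Under these assumptions a single inter-cluster message costs about two orders of magnitude more than an intra-cluster one, which is the regime where overlapping communication with computation is expected to pay off.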
+
+
+
+,----
+| Some knitpicks include:
+| - Abstract: "Behaviours", please no plural
+| - Sec II (and others): "As exposed" --> "As described"
+| - Sec II: "And important idle times" --> better "useless idle times
+| used for synchronization"
+| - Sec III: "by the mean of an XML file" --> "by means of an XML file".
+| - SEC IV.B: did not encouter ... unless some code debugging" -->
+| please rewrite the unless part...
+| - SEC V: "Hosts processors power" --> "Host processor power"
+`----
+
+[RCE] We will take these remarks into account.
+ [AG] I have started with the easiest ones.
----------------------- REVIEW 3 ---------------------
----------- REVIEW -----------
-The submitted paper purports to be the first simulation of
-asynchronous iterative algorithms and predicts that, for a particular
-cluster configurations with very high latency (20ms) and very low
-bandwidths (5/50 Mbit/s), an unpreconditioned asynchronous
-multisplitting algorithm will be faster than an unpreconditioned GMRES
-algorithm for solving a 3D Poisson equation.
-
-Several issues with respect to the relevance of these results deserve
-discussion:
-
-1) There is no substantial discussion of the fundamental additions to
-SimGrid that were required in order to support the simulation of
-asynchronous iterative algorithms. If no extensions were required,
-then I am unsure as to how this aspect of the work is a contribution.
-
-2) The model problem of a 3D Poisson equation with no preconditioner
-is regrettable due to the large number of fast solvers available that
-have been available for many decades. For this reason, as is, the
-results are not relevant to the solution of PDEs. However, a similar
-computational structure appears within the context of gradient descent
-methods for the solution of convex optimization problems, and
-asynchronous algorithms are quite common. I would humbly suggest such
-a model problem in the future unless either a more challenging PDE is
-tackled or a non-trivial preconditioner is incorporated.
-
-3) This is somewhat of a minor point, but I did not see an explicit
-discussion of the link between a global relative residual norm,
-|| A x - b|| / || b ||, and the local convergence criterion used in
-the asynchronous algorithm, which tested for the infinity norm of the
-local computation. When "precision" is reported in Table I, is it
-referring to a consistent global convergence criterion? And, if so,
-what is it precisely referring to?
-
-4) Typical latencies within clusters are on the order of a
-microsecond, and the latency used to produce Table I is more than
-three orders of magnitude higher (20ms). It would be helpful if more
-justification was given for why such a high latency is
-relevant. Furthermore, the chosen bandwidths (5 Mbit/s and 50 Mbit/s)
-are closer to a non-commercial home internet connection than a
-commercial ethernet connection.
-
-Overall, I feel that a significant number of issues should be
-addressed before publication would be warranted.
+,----
+| The submitted paper purports to be the first simulation of
+| asynchronous iterative algorithms and predicts that, for a particular
+| cluster configurations with very high latency (20ms) and very low
+| bandwidths (5/50 Mbit/s), an unpreconditioned asynchronous
+| multisplitting algorithm will be faster than an unpreconditioned GMRES
+| algorithm for solving a 3D Poisson equation.
+|
+| Several issues with respect to the relevance of these results deserve
+| discussion:
+|
+| 1) There is no substantial discussion of the fundamental additions to
+| SimGrid that were required in order to support the simulation of
+| asynchronous iterative algorithms. If no extensions were required,
+| then I am unsure as to how this aspect of the work is a contribution.
+`----
+
+[RCE] No extensions were made to SimGrid to handle the chosen type
+ of algorithm.
+
+,----
+| 2) The model problem of a 3D Poisson equation with no preconditioner
+| is regrettable due to the large number of fast solvers available that
+| have been available for many decades. For this reason, as is, the
+| results are not relevant to the solution of PDEs. However, a similar
+| computational structure appears within the context of gradient descent
+| methods for the solution of convex optimization problems, and
+| asynchronous algorithms are quite common. I would humbly suggest such
+| a model problem in the future unless either a more challenging PDE is
+| tackled or a non-trivial preconditioner is incorporated.
+`----
+
+[RCE] ??
+
+,----
+| 3) This is somewhat of a minor point, but I did not see an explicit
+| discussion of the link between a global relative residual norm,
+| || A x - b|| / || b ||, and the local convergence criterion used in
+| the asynchronous algorithm, which tested for the infinity norm of the
+| local computation. When "precision" is reported in Table I, is it
+| referring to a consistent global convergence criterion? And, if so,
+| what is it precisely referring to?
+`----
+
+[RCE] As I understand it, the "precision" in Table I is the
+ "tolerance threshold" (epsilon) mentioned in Section IV. It
+ does indeed determine the global convergence criterion or
+ condition. Can Lilia confirm?
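
The reviewer's question can be illustrated with a small sketch that computes both quantities side by side. It uses a plain Jacobi iteration on a tiny 1D Poisson system as a stand-in, not the paper's multisplitting or GMRES code. The block layout, the tolerance value, and the choice of "infinity norm of the difference between two successive local iterates" as the local criterion are all assumptions made for illustration.

```python
import math


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def relative_residual(A, x, b):
    """Global criterion: ||A x - b||_2 / ||b||_2."""
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return math.sqrt(sum(v * v for v in r)) / math.sqrt(sum(v * v for v in b))


def local_inf_norm(x_new, x_old, block):
    """Assumed local criterion: max |x_new[i] - x_old[i]| over one block."""
    return max(abs(x_new[i] - x_old[i]) for i in block)


def jacobi_step(A, x, b):
    n = len(b)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]


n = 8
# 1D Poisson matrix: 2 on the diagonal, -1 on the off-diagonals.
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
blocks = [range(0, 4), range(4, 8)]  # hypothetical split over two "clusters"
eps = 1e-6                           # hypothetical tolerance threshold

x = [0.0] * n
for k in range(1, 10_000):
    x_new = jacobi_step(A, x, b)
    local = [local_inf_norm(x_new, x, blk) for blk in blocks]
    x = x_new
    if all(d <= eps for d in local):  # every block is locally converged
        break

global_res = relative_residual(A, x, b)
print(f"stopped at k={k}, local norms={local}, global residual={global_res:.2e}")
```

The loop stops on the local test only; the global relative residual is computed afterwards and need not equal the local threshold, which is why the paper should state which of the two Table I reports.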
+
+,----
+| 4) Typical latencies within clusters are on the order of a
+| microsecond, and the latency used to produce Table I is more than
+| three orders of magnitude higher (20ms). It would be helpful if more
+| justification was given for why such a high latency is
+| relevant. Furthermore, the chosen bandwidths (5 Mbit/s and 50 Mbit/s)
+| are closer to a non-commercial home internet connection than a
+| commercial ethernet connection.
+`----
+
+[RCE] See the remarks above.
+
+,----
+| Overall, I feel that a significant number of issues should be
+| addressed before publication would be warranted.
+`----
----------------------- REVIEW 4 ---------------------
----------- REVIEW -----------
-This is a very interesting paper devoted to the implementation in a
-grid environment of some asynchronous algorithm. These algorithms are
-indeed very powerfull, and the more latency, the more efficient are
-these algorithms. A comparison of a synchronous GMRES and an
-asynchronous multi-splitting is presented. The obtained results are
-interesting and confirm the efficiency of these methods.
+,----
+| This is a very interesting paper devoted to the implementation in a
+| grid environment of some asynchronous algorithm. These algorithms are
+| indeed very powerfull, and the more latency, the more efficient are
+| these algorithms. A comparison of a synchronous GMRES and an
+| asynchronous multi-splitting is presented. The obtained results are
+| interesting and confirm the efficiency of these methods.
+`----
+
+[RCE] Understood.
----------------------- REVIEW 5 ---------------------
----------- REVIEW -----------
-This paper is a mix between a short and a long paper, it presents
-preliminary works on simulation of asynchronous iterative algorithms
-using SimGrid. I recommend to accept it as a short paper.
-
-
+,----
+| This paper is a mix between a short and a long paper, it presents
+| preliminary works on simulation of asynchronous iterative algorithms
+| using SimGrid. I recommend to accept it as a short paper.
+`----