From ce28aee44b381b6b1c65db0e3076ecc8532b4f2b Mon Sep 17 00:00:00 2001
From: Arnaud Giersch
Date: Thu, 19 Feb 2015 09:15:57 +0100
Subject: [PATCH] Add review from PDSEC 2015.

---
 pdsec15_review.txt | 331 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 331 insertions(+)
 create mode 100644 pdsec15_review.txt

diff --git a/pdsec15_review.txt b/pdsec15_review.txt
new file mode 100644
index 0000000..7f5b887
--- /dev/null
+++ b/pdsec15_review.txt
@@ -0,0 +1,331 @@
+============================== Standard 1 ==============================
+
+> *** Key Contributions: Please describe the key contributions of the
+ paper or lack thereof. Your comments should be specific and
+ justify your overall recommendation.
+
+This paper presents a new online frequency selection algorithm for
+distributed iterative applications running on heterogeneous CPU nodes.
+Contrary to previous work (for homogeneous CPUs), this heterogeneous
+context implies a vector of scaling factors and "slack times" before
+synchronizing the processes at each iteration. The models and the
+algorithm are clearly presented and detailed, and are validated on
+several benchmarks using a simulator. Comparison with another
+scaling factor selection algorithm (which does not take into account
+communication times and heterogeneity) shows the relevance of this new
+algorithm, which manages to significantly reduce the energy consumption
+with acceptable performance overhead.
+
+Overall, this is very solid work, and the paper is well-written and
+very clear.
+
+The main flaw of this paper is that the evaluation is only done via a
+simulator. As mentioned in future work, evaluations on real
+heterogeneous CPU platforms (with real power measurements) will be
+necessary to definitively validate this algorithm and
+the models.
+
+> *** Suggestions for Improvement: Additional comments and suggestions
+ for improvement in the technical content or the presentation.
+ Please be as detailed and constructive as you can be.
+
+The energy and performance models rely on compute-bound programs,
+where the computation time is inversely proportional to the processor
+frequency. Does this apply to all NAS benchmarks? The authors should
+specify which NAS benchmarks are memory-bound (if any), and how their
+model applies to these memory-bound benchmarks.
+
+Moreover, in section III it seems that the authors assume that the
+communication time (without slack time) is the same for all processors
+provided they have the same communication volume. This could be
+pointed out more clearly in the paper. Also, does this apply to all
+NAS benchmarks? Does it also depend on the placement of the MPI
+processes? I assume that for the same communication volume, the
+communication time will differ depending on whether the processes are on
+neighbouring nodes or on distant nodes (especially with 128 or 144
+nodes).
+Could the authors discuss this in the text?
+
+The authors consider that the communication time only applies to static
+power, which means that no CPU cycle is used for the MPI
+communications. Does this imply specific networks (like InfiniBand)
+with RDMA?
+This could be clarified in the paper.
+
+Finally, the algorithm applies to synchronous iterative applications:
+is this the case for all NAS benchmarks evaluated in this paper? This
+could also be specified in the paper.
+
+Figures 2a and 2b: I do not understand why the energy curve in Fig.2b
+does not have the same shape as the one in Fig.2a.
+Could the authors explain this in the text?
+
+Minor comments:
+- The authors could specify in the abstract that "heterogeneous
+ platforms" refers to heterogeneous CPUs (not to CPU-GPU nodes).
+- The phrase "in the same direction" (used twice in section IV) is
+ unclear and should be rewritten.
+- Section V.A: replace "because selecting frequency scaling factors
+ higher than the higher bound" with "because selecting frequencies
+ higher than the higher bound"?
+
+> *** Significance: Assess the significance of the topic addressed in
+ the paper.
+
+Excellent (5)
+
+> *** Originality/Novelty (of contribution): How novel are the
+ concepts presented in the paper?
+
+Above average (4)
+
+> *** Technical Soundness: How strong are the techniques and
+ methodologies used in the paper?
+
+Excellent (5)
+
+> *** Overall Recommendation: Your final rating should be consistent
+ with your ratings on previous questions.
+
+Accept (5)
+
+============================== Standard 2 ==============================
+
+> *** Key Contributions: Please describe the key contributions of the
+ paper or lack thereof. Your comments should be specific and
+ justify your overall recommendation.
+
+The paper proposes a frequency selection algorithm for heterogeneous
+platforms. The algorithm uses the maximum distance between the
+energy consumption and the performance to get the trade-off scale
+factor. This is an interesting paper with a good attempt to cover many
+factors.
+
+The paper ran NPB benchmarks to verify the algorithm but there is no
+comparison between the results at the trade-off scale factor and
+those from all other possible scale factors without applying the
+algorithm. Without this, the validation of the algorithm is not reliable.
+
+> *** Suggestions for Improvement: Additional comments and suggestions
+ for improvement in the technical content or the presentation.
+ Please be as detailed and constructive as you can be.
+
+There are too many tables, i.e. II-VII in section VI. It would be better to
+summarize them in a couple of figures.
+
+It is necessary to describe the overhead of the algorithm, which is
+missing from the paper.
+
+> *** Significance: Assess the significance of the topic addressed in
+ the paper.
+
+Average (3)
+
+> *** Originality/Novelty (of contribution): How novel are the
+ concepts presented in the paper?
+
+Average (3)
+
+> *** Technical Soundness: How strong are the techniques and
+ methodologies used in the paper?
+
+Acceptable (3)
+
+> *** Overall Recommendation: Your final rating should be consistent
+ with your ratings on previous questions.
+
+Weak Accept (4)
+
+============================== Standard 3 ==============================
+
+> *** Key Contributions: Please describe the key contributions of the
+ paper or lack thereof. Your comments should be specific and
+ justify your overall recommendation.
+
+The paper develops DVFS performance models and an online algorithm to
+optimize time and energy for iterative message passing applications on
+a heterogeneous CPU cluster. An objective function is developed to
+express the time-energy tradeoff. Results using a simulated framework
+show worthwhile energy gains for acceptable loss of execution time. A
+comparison with a more general pre-existing algorithm shows modest
+improvements in energy and energy-time tradeoff.
+
+The paper is well-written and is technically sound. Its significance
+is slightly diminished due to the fact that previous work has largely
+dealt with this issue on scenarios that are of stronger interest
+and/or are less specialized.
+
+> *** Suggestions for Improvement: Additional comments and suggestions
+ for improvement in the technical content or the presentation.
+ Please be as detailed and constructive as you can be.
+
+The abstract would be sharpened if it contained numbers relating to
+the performance degradation and comparison.
+
+III.A. The modelling of the communication time being independent of
+the frequency is questionable, even if it is backed up by a 10-year-old
+reference. While slack time is not affected, my own research has shown
+that communication bandwidth does clearly increase with frequency,
+albeit in a sub-linear fashion. Taking the minimum for the
+communication time (3) needs a better explanation, as it is
+counter-intuitive.
+
+I would like to see some explanation as to why it takes so many iterations
+for the algorithm to select the best vector, and whether this can be
+improved. While the NAS benchmarks have a standard number of
+iterations, it would be helpful to the reader to indicate what these
+are in VI.
+
+The results on a real heterogeneous platform in the future work will
+be interesting.
+
+There are a number of small grammatical errors:
+
+p2. ``to satisfy some objectives while taking into account all the
+constraints,'': a comma is needed before `while' to match the 2nd
+
+Fig2(b) normalize -> normalized
+
+p4 ``following the same direction'': use `follow'
+
+Alg1: F_diff_i: difference -> differences
+
+p6: on all left frequencies -> on all remaining frequencies
+
+while it lowers the frequency of all other nodes ->
+while it lowers the frequencies of all other nodes
+
+``the proposed algorithm is not an exact method it does'':
+put a : before it
+
+p8: on different number of nodes -> on different numbers of nodes
+the GC benchmark significantly decrease ->
+the CG benchmark significantly decreases
+
+> *** Significance: Assess the significance of the topic addressed in
+ the paper.
+
+Above average (4)
+
+> *** Originality/Novelty (of contribution): How novel are the
+ concepts presented in the paper?
+
+Above average (4)
+
+> *** Technical Soundness: How strong are the techniques and
+ methodologies used in the paper?
+
+Excellent (5)
+
+> *** Overall Recommendation: Your final rating should be consistent
+ with your ratings on previous questions.
+
+Strong Accept (6)
+
+============================== Standard 4 ==============================
+
+> *** Key Contributions: Please describe the key contributions of the
+ paper or lack thereof. Your comments should be specific and
+ justify your overall recommendation.
+
+In this paper, a new online frequency selection algorithm for
+heterogeneous platforms is presented. It selects the frequencies and
+tries to give the best trade-off between energy saving and performance
+degradation, for each node computing the message passing iterative
+application. The algorithm has a small overhead and works without
+training or profiling. It uses a new energy model for message passing
+iterative applications running on a heterogeneous platform. The
+proposed algorithm is evaluated on the SimGrid simulator while running
+the NAS parallel benchmarks. The experiments show that it reduces the
+energy consumption by up to 35% while limiting the performance
+degradation as much as possible. Finally, the algorithm is compared to
+an existing method, the comparison results showing that it outperforms
+the latter.
+
+> *** Suggestions for Improvement: Additional comments and suggestions
+ for improvement in the technical content or the presentation.
+ Please be as detailed and constructive as you can be.
+
+I did not see very clearly whether the proposed online algorithm can
+achieve the optimal selection. If it is only a heuristic, then how close
+is it to the optimal? I would like to see more theoretical or experimental
+results if possible, since the authors claim "the best trade-off
+between energy saving and performance degradation".
+
+> *** Significance: Assess the significance of the topic addressed in
+ the paper.
+
+Excellent (5)
+
+> *** Originality/Novelty (of contribution): How novel are the
+ concepts presented in the paper?
+
+Excellent (5)
+
+> *** Technical Soundness: How strong are the techniques and
+ methodologies used in the paper?
+
+Strong (4)
+
+> *** Overall Recommendation: Your final rating should be consistent
+ with your ratings on previous questions.
+
+Strong Accept (6)
+
+============================== Standard 5 ==============================
+
+> *** Key Contributions: Please describe the key contributions of the
+ paper or lack thereof. Your comments should be specific and
+ justify your overall recommendation.
+
+The paper considers the DVFS technique and presents an energy model
+for DVFS systems that also takes the communication time into
+consideration. A new algorithm for selecting the scaling factors is
+presented. The algorithm uses a vector of scaling factors, one for
+each node, and determines the scaling factors such that the best trade-off
+between minimizing the energy consumption and maximizing the
+performance for a synchronous iterative algorithm is reached. The
+algorithm works during execution time and uses the first iteration
+step for collecting the information required for the scaling factor
+selection. An experimental evaluation is given using the SimGrid
+environment.
+
+The paper is well written and structured and should be accepted. It
+is solid work and provides new contributions by extending earlier
+energy models with communication time concerns and by proposing a new
+algorithm for DVFS control.
+
+> *** Suggestions for Improvement: Additional comments and suggestions
+ for improvement in the technical content or the presentation.
+ Please be as detailed and constructive as you can be.
+
+Algorithm 1 in Section V could be explained in more detail. As far as
+I can see, it tests all possible frequencies or scaling factors for
+the different nodes and selects the best as indicated by the model. I
+was wondering whether all combinations of scaling factors are tested
+or whether this is not necessary because of the behavior of the
+communication.
+The accuracy of the frequency selection depends on the accuracy of the
+model used for the computation of the scaling factors. It would be
+interesting to see how accurate the model is for real systems.
+However, I see that this might be difficult to capture in practice.
+
+> *** Significance: Assess the significance of the topic addressed in
+ the paper.
+
+Excellent (5)
+
+> *** Originality/Novelty (of contribution): How novel are the
+ concepts presented in the paper?
+
+Above average (4)
+
+> *** Technical Soundness: How strong are the techniques and
+ methodologies used in the paper?
+
+Excellent (5)
+
+> *** Overall Recommendation: Your final rating should be consistent
+ with your ratings on previous questions.
+
+Accept (5)
--
2.39.5