1 /*! \page publis Publications
3 \section pub_reference Reference publication about SimGrid
5 When citing SimGrid, the preferred reference paper is <i>Scheduling
6 Distributed Applications: the SimGrid Simulation Framework</i>, even if it's
7 a bit old now. We are actively working on improving this.
9 \li <b>Scheduling Distributed Applications: the
10 SimGrid Simulation Framework</b>\n
11 by <em>Henri Casanova and Arnaud Legrand and Loris Marchal</em>\n
12 Proceedings of the third IEEE International Symposium
13 on Cluster Computing and the Grid (CCGrid'03)\n
14 Since the advent of distributed computer systems an active field
15 of research has been the investigation of scheduling strategies
16 for parallel applications. The common approach is to employ
17 scheduling heuristics that approximate an optimal
18 schedule. Unfortunately, it is often impossible to obtain
19 analytical results to compare the efficacy of these heuristics.
20 One possibility is to conduct large numbers of back-to-back
21 experiments on real platforms. While this is possible on
22 tightly-coupled platforms, it is infeasible on modern distributed
23 platforms (i.e. Grids) as it is labor-intensive and does not
24 enable repeatable results. The solution is to resort to
25 simulations. Simulations not only enable repeatable results but
26 also make it possible to explore wide ranges of platform and
27 application scenarios.\n
28 In this paper we present the SimGrid framework which enables the
29 simulation of distributed applications in distributed computing
30 environments for the specific purpose of developing and evaluating
31 scheduling algorithms. This paper focuses on SimGrid v2, which
32 greatly improves on the first version of the software with more
33 realistic network models and topologies. SimGrid v2 also enables
34 the simulation of distributed scheduling agents, which has become
35 critical for current scheduling research in large-scale platforms.
36 After describing and validating these features, we present a case
37 study by which we demonstrate the usefulness of SimGrid for
38 conducting scheduling research.\n
39 http://www-id.imag.fr/Laboratoire/Membres/Legrand_Arnaud/articles/simgrid2_CCgrid03.pdf
41 The previous publication does not cover the GRAS part of the framework. So, if you
42 want to cite GRAS, please use this publication instead:
44 \li <b>Gras: A Research & Development Framework for Grid and P2P
45 Infrastructures</b>\n
46 by <em>Martin Quinson</em>\n
47 <b>Best paper</b> of the 18th IASTED International Conference on
48 Parallel and Distributed Computing and Systems (PDCS 2006)\n
49 http://www.loria.fr/~quinson/articles/gras-iasted06.pdf
51 \section pub_simulation Other publications about the SimGrid framework
53 \li <b>Speed and Accuracy of Network Simulation in the SimGrid Framework</b>\n
54 by <em>K. Fujiwara, H. Casanova</em>\n
55 in Proceedings of the First International Workshop on Network Simulation Tools (NSTools), Nantes, France, October 2007.\n
56 http://navet.ics.hawaii.edu/~casanova/homepage/papers/fujiwara_nstool2007.pdf
58 \li <b>Cost and Accuracy of Packet-Level vs. Analytical Network Simulations: An Empirical Study</b>\n
59 by <em>K. Fujiwara</em>\n
60 <b>M.S. Thesis</b>, Dept. of Information and Computer Sciences, University of Hawai`i at Manoa, April 2007.\n
61 http://navet.ics.hawaii.edu/~casanova/homepage/theses/kayo_fujiwara_MS.pdf
63 \li <b>The SimGrid Project - Simulation and Deployment of Distributed Applications</b>\n
64 by <em>A. Legrand, M. Quinson, K. Fujiwara, H. Casanova</em>\n
65 <b>POSTER</b> in Proceedings of the IEEE International Symposium on High Performance Distributed Computing (HPDC-15), Paris, France, May 2006.\n
67 <a href="http://navet.ics.hawaii.edu/~casanova/homepage/papers/simgrid_hpdc06.pdf"><img src="poster_thumbnail.png" /></a>
69 http://navet.ics.hawaii.edu/~casanova/homepage/papers/simgrid_hpdc06.pdf
71 \li <b>A Network Model for Simulation of Grid Application</b>\n
72 by <em>Henri Casanova and Loris Marchal</em>\n
74 In this work we investigate network models that can be
75 potentially employed in the simulation of scheduling algorithms for
76 distributed computing applications. We seek to develop a model of TCP
77 communication which is both high-level and realistic. Previous research
78 works show that accurate and global modeling of wide-area networks, such
79 as the Internet, faces a number of challenging issues. However, some
80 global models of fairness and bandwidth-sharing exist, and can be linked
81 with the behavior of TCP. Using both previous results and simulation (with
82 NS), we attempt to understand the macroscopic behavior of
83 TCP communications. We then propose a global model of the network for the
84 Grid platform. We perform partial validation of this model in
85 simulation. The model leads to an algorithm for computing
86 bandwidth-sharing. This algorithm can then be implemented as part of Grid
87 application simulations. We provide such an implementation for the
88 SimGrid simulation toolkit.\n
89 ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-40.ps.gz
92 \li <b>MetaSimGrid : Towards realistic scheduling simulation of
93 distributed applications</b>\n
94 by <em>Arnaud Legrand and Julien Lerouge</em>\n
95 Most scheduling problems are already hard on homogeneous
96 platforms, and they become quite intractable in a heterogeneous
97 framework such as a metacomputing grid. In the best cases, a
98 guaranteed heuristic can be found, but most of the time, it is
99 not possible. Real experiments or simulations are often
100 involved to test or to compare heuristics. However, on a
101 distributed heterogeneous platform, such experiments are
102 technically difficult to drive, because of the genuine
103 instability of the platform. It is almost impossible to
104 guarantee that a platform which is not dedicated to the
105 experiment, will remain exactly the same between two tests,
106 thereby forbidding any meaningful comparison. Simulations are
107 then used to replace real experiments, so as to ensure the
108 reproducibility of measured data. A key issue is the
109 possibility to run the simulations against a realistic
110 environment. The main idea of trace-based simulation is to
111 record the platform parameters today, and to simulate the
112 algorithms tomorrow, against the recorded data: even though it
113 is not the current load of the platform, it is realistic,
114 because it represents a fair summary of what happened
115 previously. A good example of a trace-based simulation tool is
116 SimGrid, a toolkit providing a set of core abstractions and
117 functionalities that can be used to easily build simulators for
118 specific application domains and/or computing environment
119 topologies. Nevertheless, SimGrid lacks a number of convenient
120 features to craft simulations of a distributed application
121 where scheduling decisions are not taken by a single
122 process. Furthermore, modeling a complex platform by hand is
123 fastidious for a few hosts and is almost impossible for a real
124 grid. This report is a survey on simulation for scheduling
125 evaluation purposes and presents MetaSimGrid, a simulator built
126 on top of SimGrid.\n
127 ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-28.ps.gz
129 \li <b>SimGrid: A Toolkit for the Simulation of Application
130 Scheduling</b>\n
131 by <em>Henri Casanova</em>\n
132 Advances in hardware and software technologies have made it
133 possible to deploy parallel applications over increasingly large
134 sets of distributed resources. Consequently, the study of
135 scheduling algorithms for such applications has been an active area
136 of research. Given the nature of most scheduling problems one must
137 resort to simulation to effectively evaluate and compare their
138 efficacy over a wide range of scenarios. It has thus become
139 necessary to simulate those algorithms for increasingly complex
140 distributed, dynamic, heterogeneous environments. In this paper we
141 present SimGrid, a simulation toolkit for the study of scheduling
142 algorithms for distributed applications. This paper gives the main
143 concepts and models behind SimGrid, describes its API and
144 highlights current implementation issues. We also give some
145 experimental results and describe work that builds on SimGrid's
146 functionalities.\n
147 http://grail.sdsc.edu/papers/simgrid_ccgrid01.ps.gz
149 \section pub_ext Papers that use SimGrid-generated results (not counting our own)
151 This list is a selection of articles. We list only papers written by people
152 external to the development group, but we also use our tool ourselves (see
153 the next section).
156 - <b>Scheduling Δ-Critical Tasks in Mixed-Parallel Applications on a National Grid</b>\n
157 by <em>Frédéric Suter</em>.\n
158 In 8th IEEE/ACM International Conference on Grid Computing (Grid 2007), Austin, TX, September 2007.
161 - <b>Simbatch: an API for simulating and predicting the performance of parallel resources and batch systems.</b>\n
162 INRIA Research Report 6040, November 2006.\n
163 https://hal.inria.fr/inria-00115880
164 - <b>Simbatch : une API pour la simulation et la prédiction de performances de systèmes batch</b>\n
165 by <em>Jean-Sébastien Gay and Yves Caniou</em>.\n
166 In 17ème Rencontres Francophones du Parallélisme, des Architectures et des Systèmes, RenPar'17.\n
167 October 4-6, Perpignan, France
168 - <b>Metascheduling Multiple Resource Types using the MMKP</b>\n
169 by <em>D. Vanderster, N. Dimopoulos, R. Sobie</em>\n
170 7th IEEE/ACM International Conference on Grid Computing\n
171 Barcelona, September 28th-29th 2006
172 - <b>Master-Slave Tasking on Asymmetric Networks</b>\n
173 by <em>Cyril Banino-Rokkones, Olivier Beaumont and Lasse Natvig</em>.\n
174 In Proceedings of 12th International Euro-Par Conference, Euro-Par 2006.\n
175 August 29 - September 1, Pages 167--176, Dresden, Germany.
176 - <b>Critical Path and Area Based Scheduling of Parallel Task Graphs on Heterogeneous Platforms</b>\n
177 by <em>Tchimou N'Takpé and Frédéric Suter</em>\n
178 Proceedings of the Twelfth International Conference on Parallel and Distributed Systems (ICPADS)\n
179 Minneapolis, MN, July 12-15, 2006.
180 - <b>Sensitivity Analysis of Knapsack-based Task Scheduling on the Grid</b>\n
181 by <em>D.C. Vanderster and N.J. Dimopoulos</em>.\n
182 In Proceedings of The 20th ACM International Conference on Supercomputing\n
183 Cairns, Australia, June 28-July 1, 2006.\n
184 http://portal.acm.org/citation.cfm?id=1183401.1183446&coll=GUIDE&dl=%23url.coll
185 - <b>Hierarchical Scheduling of Independent Tasks with Shared Files</b>\n
186 by <em>H. Senger, F. Silva, W. Nascimento</em>.\n
187 Proceedings of the Sixth IEEE International Symposium on Cluster
188 Computing and the Grid Workshop (CCGRIDW'06)\n
189 Singapore, 16-19 May 2006.\n
190 http://www.unisantos.br/mestrado/informatica/hermes/File/senger-HierarchicalScheduling-Workshop-TB120.pdf
191 - <b>Evaluation of Knapsack-based Scheduling using the NPACI JOBLOG</b>\n
192 by <em>D. Vanderster, N. Dimopoulos, R. Parra-Hernandez and R. Sobie</em>.\n
193 20th International Symposium on High-Performance Computing in an
194 Advanced Collaborative Environment (HPCS'06)\n
195 St. John's, Newfoundland, Canada, 14-17 May 2006\n
196 http://doi.ieeecomputersociety.org/10.1109/HPCS.2006.23
199 - <b>On Dynamic Resource Management Mechanism using Control
200 Theoretic Approach for Wide-Area Grid Computing</b>\n
201 by <em>Hiroyuki Ohsaki, Soushi Watanabe, and Makoto Imase</em>\n
202 in Proceedings of IEEE Conference on Control Applications (CCA 2005), Aug. 2005.\n
203 http://www.ispl.jp/~oosaki/papers/Ohsaki05_CCA.pdf
204 - <b>Evaluation of Meta-scheduler Architectures and Task Assignment Policies for
205 high Throughput Computing</b>\n
206 by <em>Eddy Caron, Vincent Garonne and Andrei Tsaregorodtsev</em>\n
207 Proceedings of 4th International Symposium on Parallel and
208 Distributed Computing Job Scheduling Strategies for Parallel
209 Processing (ISPDC'05), July 2005.\n
210 http://www.ens-lyon.fr/LIP/Pub/Rapports/RR/RR2005/RR2005-27.pdf
212 - <b>Deadline Scheduling with Priority for Client-Server Systems on the Grid</b>\n
213 by <em>E Caron, PK Chouhan, F Desprez</em>\n
214 in IEEE International Conference On Grid Computing. Super Computing 2004, Oct. 2004.
215 - <b>Efficient Scheduling Heuristics for GridRPC Systems</b>\n
216 by <em>Y. Caniou and E. Jeannot.</em>\n
217 in IEEE QoS and Dynamic System workshop (QDS) of International Conference
218 on Parallel and Distributed Systems (ICPADS), New-Port Beach California, USA,
219 pages 621-630, July 2004\n
220 http://graal.ens-lyon.fr/~ycaniou/QDS04.ps
221 - <b>Exploiting Replication and Data Reuse to Efficiently Schedule
222 Data-intensive Applications on Grids</b>\n
223 by <em>E. Santos-Neto, W. Cirne, F. Brasileiro, A. Lima.</em>\n
224 Proceedings of 10th Job Scheduling Strategies for Parallel Processing, June 2004.\n
225 http://www.lsd.ufcg.edu.br/~elizeu/articles/jsspp.v6.pdf
226 - <b>Resource Management and Knapsack Formulations on the Grid</b>\n
227 by <em>R. Parra-Hernandez, D. Vanderster and N. J. Dimopoulos</em>\n
228 Fifth IEEE/ACM International Workshop on Grid Computing (GRID'04)\n
229 http://doi.ieeecomputersociety.org/10.1109/GRID.2004.54
230 - <b>Scheduling BoT Applications in Grids using a Slave Oriented Adaptive
231 Algorithm</b>\n
232 by <em>T. Ferreto, C. A. F. De Rose and C. Northfleet.</em>\n
233 Second International Symposium on Parallel and Distributed Processing
234 and Applications (ISPA), 2004, Hong Kong. Published in Lecture Notes in
235 Computer Science (LNCS), Volume 3358, by Springer-Verlag. p. 392-398.
237 - <b>Link-Contention-Aware Genetic Scheduling Using Task Duplication in Grid Environments</b>\n
238 by <em>Wensheng Yao, Xiao Xie and Jinyuan You</em>\n
239 in Grid and Cooperative Computing: Second International Workshop, GCC 2003, Shanghai, China, December 7-10, 2003 (LNCS)\n
240 http://www.chinagrid.edu.cn/chinagrid/download/GCC2003/pdf/266.pdf
241 - <b>New Dynamic Heuristics in the Client-Agent-Server Model</b>\n
242 by <em>Y. Caniou and E. Jeannot</em>\n
243 in IEEE 13th Heterogeneous Computing Workshop - HCW'03, Nice, France, April 2003.\n
244 http://graal.ens-lyon.fr/~ycaniou/HCW03.ps
245 - <b>A Hierarchical Resource Reservation Algorithm for Network Enabled Servers</b>\n
246 by <em>E. Caron, F. Desprez, F. Petit, V. Villain</em>\n
247 in the 17th International Parallel and Distributed Processing Symposium -- IPDPS'03, Nice - France, April 2003.
249 \section pub_self Our own papers that use SimGrid-generated results
251 This list is a selection of the articles we have written that used results
252 generated by SimGrid.
255 - <b>A Comparison of Scheduling Approaches for Mixed-Parallel Applications on Heterogeneous Platforms</b>\n
256 by <em>T. N'takpé, F. Suter, and Henri Casanova</em>\n
257 In 6th International Symposium on Parallel and Distributed Computing, Hagenberg, Austria, July 2007.
259 - <b>On the Harmfulness of Redundant Batch Requests</b>\n
260 by <em>H. Casanova</em>\n
261 Proceedings of the IEEE International Symposium on High Performance Distributed Computing (HPDC-15), Paris, France, May 2006.\n
262 http://navet.ics.hawaii.edu/~casanova/homepage/papers/hpdc_2006.pdf
263 - <b>An evaluation of Job Scheduling Strategies for Divisible Loads on Grid Platforms</b>\n
264 by <em>Y. Cardinale, H. Casanova</em>\n
265 in Proceedings of the High Performance Computing & Simulation Conference (HPC&S'06), Bonn, Germany, May 2006.\n
266 http://navet.ics.hawaii.edu/~casanova/homepage/papers/cardinale_2006.pdf
267 - <b>Interference-Aware Scheduling</b>\n
268 by <em>B. Kreaseck, L. Carter, H. Casanova, J. Ferrante, S. Nandy</em>\n
269 International Journal of High Performance Computing Applications (IJHPCA), to appear.\n
270 http://navet.ics.hawaii.edu/~casanova/homepage/papers/kreaseck_ijhpca_2005.pdf
272 - <b>From Heterogeneous Task Scheduling to Heterogeneous Mixed Data and Task Parallel Scheduling</b>\n
273 by <em>F. Suter, V. Boudet, F. Desprez, H. Casanova</em>\n
274 Proceedings of Europar, 230--237, (LNCS volume 3149), Pisa, Italy, August 2004.
275 - <b>On the Interference of Communication on Computation</b>\n
276 by <em>B. Kreaseck, L. Carter, H. Casanova, J. Ferrante</em>\n
277 Proceedings of the workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems, Santa Fe, April 2004.\n
278 http://navet.ics.hawaii.edu/~casanova/homepage/papers/k_pmeo2004.pdf
280 - <b>RUMR: Robust Scheduling for Divisible Workloads</b>\n
281 by <em>Y. Yang, H. Casanova</em>\n
282 Proceedings of the 12th IEEE Symposium on High Performance and Distributed Computing (HPDC-12), Seattle, June 2003.\n
283 http://navet.ics.hawaii.edu/~casanova/homepage/papers/yang_hpdc2003.pdf
284 - <b>Resource Allocation Strategies for Guided Parameter Space Searches</b>\n
285 by <em>M. Faerman, A. Birnbaum, F. Berman, H. Casanova</em>\n
286 International Journal of High Performance Computing Applications (IJHPCA), 17(4), 383--402, 2003.\n
287 http://grail.sdsc.edu/papers/faerman_ijhpca04.pdf
289 - <b>Resource Allocation for Steerable Parallel Parameter Searches</b>\n
290 by <em>M. Faerman, A. Birnbaum, H. Casanova, F. Berman</em>\n
291 Proceedings of the Grid Computing Workshop, Baltimore, 157--169, November 2002.\n
292 http://grail.sdsc.edu/projects/vi_itr/grid02.pdf
294 - <b>Applying Scheduling and Tuning to On-line Parallel Tomography</b>\n
295 by <em>Shava Smallen, Henri Casanova, Francine Berman</em>\n
296 in Proceedings of Supercomputing 2001\n
297 http://grail.sdsc.edu/papers/tomo_journal.ps.gz
299 - <b>Heuristics for Scheduling Parameter Sweep applications in Grid environments</b>\n
300 by <em>Henri Casanova, Arnaud Legrand, Dmitrii Zagorodnov and Francine Berman</em>\n
301 in Proceedings of the 9th Heterogeneous Computing workshop (HCW'2000), pp. 349-363.\n
302 http://navet.ics.hawaii.edu/~casanova/homepage/papers/hcw00_pst.pdf
307 \li <b>Optimal algorithms for scheduling divisible workloads on
308 heterogeneous systems</b>\n
309 by <em>Olivier Beaumont and Arnaud Legrand and Yves Robert</em>\n
310 in Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS'03).\n
311 Preliminary version on ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/RR/RR2002/RR2002-36.ps.gz
314 \li <b>On-line Parallel Tomography</b>\n
315 by <em>Shava Smallen</em>\n
316 Masters Thesis, UCSD, May 2001