@defgroup TRACE_API TRACING
@brief Gather data about your simulation for later analysis

SimGrid can trace the utilization of resources (hosts and links) through
any of its programming interfaces (S4U, SimDAG and SMPI). This means
that the tracing registers how much computing power is used on each host
and how much bandwidth is used on each link of the platform.

The idea of the tracing facilities is to give SimGrid users the
possibility to classify S4U and SimDAG tasks by category, and to trace the
platform utilization (hosts and links) for each category.
The API lets you declare categories and provides functions to
associate them with tasks (S4U and SD). Tasks that are not
classified into a category are not traced. If no categories
are specified, simulations can still be traced using a special
parameter on the command line (see @ref outcomes_vizu for details).

/*! @page outcomes_vizu Visualization and Statistical Analysis

SimGrid comes with extensive support for tracing and registering what
happens during the simulation, so that it can be either visualized or
statistically analyzed afterwards.

This tracing is widely used to observe and understand the behavior of
parallel applications and distributed algorithms. Usually, this is
done in two steps: the user instruments the application, and
the traces are analyzed after the end of the execution. The analysis
can highlight unexpected behaviors and bottlenecks, and can sometimes be
used to correct distributed algorithms. The SimGrid team has
instrumented the library so that users can trace their simulations
and analyze them. This part of the user manual explains how the
tracing-related features can be enabled and used during the
development of simulators using the SimGrid library.

@section instr_category_functions Tracing categories functions

The SimGrid library is instrumented so users can trace the platform
utilization using the MSG, SimDAG and SMPI interfaces. It registers how
much power is used for each host and how much bandwidth is used for
each link of the platform. The idea with this type of tracing is to
first get an overall view of resource utilization, in particular to
identify bottlenecks and load imbalance.

Another possibility is to trace resource utilization by
category. Categorized resource utilization tracing gives SimGrid
users the possibility to classify MSG and SimDAG tasks by category,
and to trace resource utilization for each category. The functions
below let the user declare a category and apply it to tasks. <em>Tasks
that are not classified into a category are not
traced</em>. Even if the user does not specify any category,
simulations can still be traced in terms of resource utilization by
using a special parameter that is detailed below (see section @ref
tracing_tracing_options).

@li @c TRACE_category(const char *category)
@li @c TRACE_category_with_color(const char *category, const char *color)
@li @c MSG_task_set_category(msg_task_t task, const char *category)
@li @c MSG_task_get_category(msg_task_t task)
@li @c SD_task_set_category(SD_task_t task, const char *category)
@li @c SD_task_get_category(SD_task_t task)

@section instr_mark_functions Tracing marks functions

@li @c TRACE_declare_mark(const char *mark_type)
@li @c TRACE_mark(const char *mark_type, const char *mark_value)

@section instr_uservariables_functions Tracing user variables functions

For hosts:

@li @c TRACE_host_variable_declare(const char *variable)
@li @c TRACE_host_variable_declare_with_color(const char *variable, const char *color)
@li @c TRACE_host_variable_set(const char *host, const char *variable, double value)
@li @c TRACE_host_variable_add(const char *host, const char *variable, double value)
@li @c TRACE_host_variable_sub(const char *host, const char *variable, double value)
@li @c TRACE_host_variable_set_with_time(double time, const char *host, const char *variable, double value)
@li @c TRACE_host_variable_add_with_time(double time, const char *host, const char *variable, double value)
@li @c TRACE_host_variable_sub_with_time(double time, const char *host, const char *variable, double value)

For links:

@li @c TRACE_link_variable_declare(const char *variable)
@li @c TRACE_link_variable_declare_with_color(const char *variable, const char *color)
@li @c TRACE_link_variable_set(const char *link, const char *variable, double value)
@li @c TRACE_link_variable_add(const char *link, const char *variable, double value)
@li @c TRACE_link_variable_sub(const char *link, const char *variable, double value)
@li @c TRACE_link_variable_set_with_time(double time, const char *link, const char *variable, double value)
@li @c TRACE_link_variable_add_with_time(double time, const char *link, const char *variable, double value)
@li @c TRACE_link_variable_sub_with_time(double time, const char *link, const char *variable, double value)

For links, identified through the route between a source and a destination:

@li @c TRACE_link_srcdst_variable_set(const char *src, const char *dst, const char *variable, double value)
@li @c TRACE_link_srcdst_variable_add(const char *src, const char *dst, const char *variable, double value)
@li @c TRACE_link_srcdst_variable_sub(const char *src, const char *dst, const char *variable, double value)
@li @c TRACE_link_srcdst_variable_set_with_time(double time, const char *src, const char *dst, const char *variable, double value)
@li @c TRACE_link_srcdst_variable_add_with_time(double time, const char *src, const char *dst, const char *variable, double value)
@li @c TRACE_link_srcdst_variable_sub_with_time(double time, const char *src, const char *dst, const char *variable, double value)

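
The naming of the functions above suggests three kinds of update: `set` replaces the current value, while `add` and `sub` change it relative to the current value, and the `_with_time` variants take an explicit timestamp. The following is only a toy model of that behavior under those assumptions. The type `user_variable` and the `variable_*` helpers are hypothetical names, not part of SimGrid, and the real implementation may differ.

```c
#include <assert.h>

// Toy stand-in for one traced variable on one host or link (not a SimGrid type).
struct user_variable {
  double time;  // timestamp of the last update
  double value; // value held from that timestamp on
};

// Absolute update: the variable takes `value` at `time`.
void variable_set(struct user_variable *v, double time, double value)
{
  assert(v);
  v->time  = time;
  v->value = value;
}

// Relative update: the variable grows by `delta` at `time`.
void variable_add(struct user_variable *v, double time, double delta)
{
  assert(v);
  variable_set(v, time, v->value + delta);
}

// Relative update: the variable shrinks by `delta` at `time`.
void variable_sub(struct user_variable *v, double time, double delta)
{
  assert(v);
  variable_set(v, time, v->value - delta);
}
```

In this model, setting a variable to 10 at time 1, adding 5 at time 2 and subtracting 3 at time 3 leaves it at 12 from time 3 on.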
@section tracing_tracing_options Tracing configuration Options

To check which tracing options are available for your simulator,
run it with the option @verbatim --help-tracing @endverbatim
to get a detailed and up-to-date explanation of each tracing
parameter. These are some of the options accepted by the tracing
system of SimGrid. You can use them by running your simulator with the
<b>--cfg=</b> switch:

- <b>tracing</b>:
  Master switch. It activates (or deactivates) the tracing system.
  No other tracing option takes effect if this one is not activated.
- <b>tracing/categorized</b>:
  It activates the categorized resource utilization tracing. It should
  be enabled if tracing categories are used by this simulator.
@verbatim
--cfg=tracing/categorized:yes
@endverbatim
- <b>tracing/uncategorized</b>:
  It activates the uncategorized resource utilization tracing. Use it if
  this simulator does not use tracing categories and resource use has to be
  traced.
@verbatim
--cfg=tracing/uncategorized:yes
@endverbatim
- <b>tracing/filename</b>:
  A file with this name will be created to register the simulation. The file
  is in the Paje format and can be analyzed using Paje visualization
  tools. More information can be found on this webpage:
  <a href="http://github.com/schnorr/pajeng/">http://github.com/schnorr/pajeng/</a>
@verbatim
--cfg=tracing/filename:mytracefile.trace
@endverbatim
  If you do not provide this parameter, the trace file will be named simgrid.trace.
- <b>tracing/smpi</b>:
  This option only has effect if this simulator is SMPI-based. It traces the MPI
  interface and generates a trace that can be analyzed using Gantt-like
  visualizations. Every MPI function (implemented by SMPI) is transformed into a
  state, and point-to-point communications can be analyzed with arrows.
@verbatim
--cfg=tracing/smpi:yes
@endverbatim
- <b>tracing/smpi/group</b>:
  This option only has effect if this simulator is SMPI-based. The processes
  are grouped by the hosts where they were executed.
@verbatim
--cfg=tracing/smpi/group:yes
@endverbatim
- <b>tracing/smpi/computing</b>:
  This option only has effect if this simulator is SMPI-based. The parts external
  to SMPI are also output to the trace. This provides a better way to analyze the
  data automatically.
@verbatim
--cfg=tracing/smpi/computing:yes
@endverbatim
- <b>tracing/smpi/internals</b>:
  This option only has effect if this simulator is SMPI-based. It displays the
  internal communications happening during a collective MPI call.
@verbatim
--cfg=tracing/smpi/internals:yes
@endverbatim
- <b>tracing/smpi/display-sizes</b>:
  This option only has effect if this simulator is SMPI-based. It displays the
  sizes of the exchanged messages in the trace, both on the links and on the
  states. For collectives, the size is the global size of the data sent by the
  process.
@verbatim
--cfg=tracing/smpi/display-sizes:yes
@endverbatim
- <b>tracing/smpi/sleeping</b>
- <b>tracing/smpi/format/ti-one-file</b>
- <b>tracing/actor</b>:
  This option traces the behavior of all categorized actors, grouping them by hosts. This option
  can be used to track actor location if this simulator has actor migration.
@verbatim
--cfg=tracing/actor:yes
@endverbatim
- <b>tracing/buffer</b>:
  This option puts some events in a time-ordered buffer using the
  insertion sort algorithm. Acquiring and releasing the
  locks that protect this buffer, together with the cost of the sorting
  algorithm, makes this process slow. The simulator performance can be severely
  impacted if this option is activated, but you are sure to get a trace
  file with sorted events.
@verbatim
--cfg=tracing/buffer:yes
@endverbatim
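
To illustrate what keeping events in a time-ordered buffer with insertion sort means, here is a minimal, self-contained sketch. It is not SimGrid's code: `trace_event` and `buffer_insert` are hypothetical names, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical event record: only the timestamp matters for the ordering.
struct trace_event {
  double time;
  int id;
};

// Insert `ev` into buf[0..*count) so that the buffer stays sorted by time.
// The scan starts from the back, so an event that arrives almost in order
// is placed after very few shifts; events with equal timestamps keep
// their arrival order.
void buffer_insert(struct trace_event *buf, size_t *count, size_t capacity, struct trace_event ev)
{
  assert(*count < capacity);
  size_t i = *count;
  while (i > 0 && buf[i - 1].time > ev.time) {
    buf[i] = buf[i - 1]; // shift later events one slot to the right
    i--;
  }
  buf[i] = ev;
  (*count)++;
}
```

Each insertion may shift every buffered event in the worst case, which is one reason why such a buffer can slow down a simulation that emits many out-of-order events.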
- <b>tracing/onelink-only</b>:
  This option changes the way SimGrid registers its platform in the trace
  file. Normally, the tracing considers all routes (no matter their
  size) of the platform file to re-create the resource topology. If this
  option is activated, only the routes with one link are used to
  register the topology within a netzone. Routes among netzones continue to be
  traced as usual.
@verbatim
--cfg=tracing/onelink-only:yes
@endverbatim
- <b>tracing/disable-power</b>
- <b>tracing/disable-destroy</b>:
  Disable the destruction of containers at the end of the simulation. This
  can be used with simulators that have a different notion of time
  (different from the simulated time).
@verbatim
--cfg=tracing/disable-destroy:yes
@endverbatim
- <b>tracing/basic</b>:
  Some visualization tools are not able to correctly parse the Paje file format.
  Use this option if you are using one of these tools to visualize the simulation
  trace. Keep in mind that the trace might be incomplete, lacking some of the
  information that would be registered otherwise.
@verbatim
--cfg=tracing/basic:yes
@endverbatim
- <b>tracing/comment</b>:
  Use this to add a comment line to the top of the trace file.
@verbatim
--cfg=tracing/comment:my_string
@endverbatim
- <b>tracing/comment-file</b>:
  Use this to add the contents of a file to the top of the trace file as a comment.
@verbatim
--cfg=tracing/comment-file:textual_file.txt
@endverbatim
- <b>tracing/precision</b>:
  This option determines the precision of the timings stored in the trace file. Make sure
  you set @ref options_model_precision to at least the same value as this option. (Traces
  cannot be more accurate than the simulation; they can be less accurate, though.)
  The following example gives you a precision of 1e-10 in the trace file:
@verbatim
--cfg=tracing/precision:10
@endverbatim
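
As a rough illustration of what such a precision means for the timestamps written to a file, the sketch below prints a time with a configurable number of digits after the decimal point. It assumes fixed-point output and is not taken from SimGrid's code; `format_timestamp` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>

// Write `t` into `out` with `precision` digits after the decimal point.
// Returns the number of characters written, or -1 if `out` is too small.
int format_timestamp(char *out, size_t size, double t, int precision)
{
  assert(precision >= 0);
  int n = snprintf(out, size, "%.*f", precision, t);
  return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

With a precision of 10, two events that are 1e-10 simulated seconds apart still get distinct timestamps, whereas with a precision of 3 they would be written with the same value.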
- <b>tracing/platform/topology</b>

Please pass @verbatim --help-tracing @endverbatim to your simulator
for the up-to-date list of tracing options.

@section tracing_tracing_example_parameters Case studies

Some scenarios that might help you decide which tracing options
you should use to analyze your simulator.

@li I want to trace the resource utilization of all hosts
and links of the platform, and my simulator <b>does not</b> use
the tracing API. For that, you can run an uncategorized trace
with the following parameters (it will work with <b>any</b> SimGrid
simulator):
@verbatim
--cfg=tracing/uncategorized:yes \
--cfg=tracing/filename:mytracefile.trace
@endverbatim

@li I want to trace only a subset of my MSG (or SimDAG) tasks.
For that, you will need to create tracing categories using the
<b>TRACE_category (...)</b> function (as explained above),
and then assign your tasks to a previously declared category
using the <b>MSG_task_set_category (...)</b> function
(or <b>SD_task_set_category (...)</b> for SimDAG tasks). After
recompiling, run your simulator with the following parameters:
@verbatim
--cfg=tracing/categorized:yes \
--cfg=tracing/filename:mytracefile.trace
@endverbatim

@section tracing_tracing_example Example of Instrumentation

A simplified example using the mandatory tracing functions.

@verbatim
#include <simgrid/msg.h>
#include <simgrid/instr.h>

int main (int argc, char **argv)
{
  MSG_init (&argc, &argv);

  //(... after deployment ...)

  // note that category declaration must be called after MSG_create_environment
  TRACE_category_with_color ("request", "1 0 0");
  TRACE_category_with_color ("computation", "0.3 1 0.4");
  TRACE_category ("finalize");

  // four tasks that all belong to the "request" category
  msg_task_t req1 = MSG_task_create("1st_request_task", 10, 10, NULL);
  msg_task_t req2 = MSG_task_create("2nd_request_task", 10, 10, NULL);
  msg_task_t req3 = MSG_task_create("3rd_request_task", 10, 10, NULL);
  msg_task_t req4 = MSG_task_create("4th_request_task", 10, 10, NULL);
  MSG_task_set_category (req1, "request");
  MSG_task_set_category (req2, "request");
  MSG_task_set_category (req3, "request");
  MSG_task_set_category (req4, "request");

  msg_task_t comp = MSG_task_create ("comp_task", 100, 100, NULL);
  MSG_task_set_category (comp, "computation");

  msg_task_t finalize = MSG_task_create ("finalize", 0, 0, NULL);
  MSG_task_set_category (finalize, "finalize");

  //(... rest of the simulation ...)
  return 0;
}
@endverbatim

@section tracing_tracing_analyzing Analyzing SimGrid Simulation Traces

A SimGrid-based simulator, when executed with the correct parameters
(see above), creates a trace file in the Paje file format holding the
simulated behavior of the application or the platform. You have
several options to analyze this trace file:

- Dump its contents to a CSV-like format using `pj_dump` (see <a
  href="https://github.com/schnorr/pajeng/wiki/pj_dump">PajeNG's wiki
  on pj_dump</a> and more generally the <a
  href="https://github.com/schnorr/pajeng/">PajeNG suite</a>) and use
  gnuplot to plot resource usage, time spent in blocking/executing
  functions, and so on. You can filter the dump with `grep` and a
  suitable regular expression to
  get only parts of the trace (for instance, only a subset of
  resources or processes).

- Derive statistics from trace metrics (the ones built into any
  SimGrid simulation, but also the metrics you injected into the trace
  using the TRACE module) using the <a
  href="http://www.r-project.org/">R project</a> and all its
  modules. You can also combine R with <a
  href="http://ggplot2.org/">ggplot2</a> to get a number of
  high-quality plots from your simulation metrics. You need to `pj_dump`
  the contents of the SimGrid trace file to use R.

- Visualize the behavior of your simulation using the classic space/time
  views (Gantt charts) provided by the <a
  href="https://github.com/schnorr/pajeng/">PajeNG suite</a> and any
  other tool that supports the <a
  href="http://paje.sourceforge.net/download/publication/lang-paje.pdf">Paje
  file format</a>. Consider this option if you need to understand the
  causality of your distributed simulation.

- You can also check our online <a
  href="https://simgrid.org/tutorials.html">tutorial
  section</a>, which contains a dedicated tutorial with several
  suggestions on how to use the tracing infrastructure. Look for the
  SimGrid User::Visualization 101 tutorial.

- Ask for help on the <a
  href="mailto:simgrid-community@inria.fr">simgrid-community@inria.fr</a>
  mailing list, giving us a detailed explanation of what your
  simulator does and what kind of information you want to trace. You
  can also check the <a
  href="http://lists.gforge.inria.fr/pipermail/simgrid-user/">mailing
  list archive</a> for old messages regarding tracing and analysis.
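
As a starting point for processing a `pj_dump` output yourself, the sketch below parses one record of a resource variable. It assumes that such records have the layout `Variable, container, type, start, end, duration, value`; please check this against the output of your own `pj_dump` version. The names `pj_variable`, `parse_pj_variable` and `pj_variable_integral` are hypothetical helpers, not part of PajeNG or SimGrid.

```c
#include <assert.h>
#include <stdio.h>

// One "Variable" record of a pj_dump output (the field layout is an assumption).
struct pj_variable {
  char container[64]; // resource name, for instance a host or a link
  char type[64];      // name of the traced variable
  double start;       // beginning of the interval
  double end;         // end of the interval
  double duration;    // duration as reported in the record
  double value;       // value of the variable during the interval
};

// Parse one line. Returns 1 on success, 0 if the line is not a Variable record.
int parse_pj_variable(const char *line, struct pj_variable *out)
{
  assert(line && out);
  int fields = sscanf(line, "Variable, %63[^,], %63[^,], %lf, %lf, %lf, %lf",
                      out->container, out->type, &out->start, &out->end,
                      &out->duration, &out->value);
  return fields == 6;
}

// Value integrated over the interval, useful to compute time-weighted averages.
double pj_variable_integral(const struct pj_variable *v)
{
  return v->value * (v->end - v->start);
}
```

Summing `pj_variable_integral` over all the records of one resource and dividing by the covered time span gives the average value of the variable on that resource. Records of other kinds (states, links, events) are rejected by `parse_pj_variable` and can be skipped.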