/*! \page tracing Tracing Simulations for Visualization

Trace visualization is widely used to observe and understand the behavior
of parallel applications and distributed algorithms. Usually, this is done in
two steps: the user instruments the application, and the traces are
analyzed after the execution ends. The visualization itself can highlight
unexpected behaviors and bottlenecks, and can sometimes be used to correct
distributed algorithms. The SimGrid team has instrumented the library
to let users trace their simulations and analyze them. This part of the
user manual explains how the tracing-related features can be enabled and used
during the development of simulators using the SimGrid library.

\section tracing_tracing_howitworks How it works

For now, the SimGrid library is instrumented so users can trace the <b>platform
utilization</b> using the MSG, SimDAG and SMPI interfaces. This means that the tracing
registers how much power is used by each host and how much bandwidth is used by
each link of the platform. The idea of this type of tracing is to first observe the
overall utilization of resources, in particular to identify
bottlenecks, load-balancing problems among hosts, and so on.

The tracing facilities give SimGrid users the possibility to
classify MSG and SimDAG tasks by category, tracing the platform utilization
(hosts and links) for each of the categories. For that,
the tracing interface provides functions to declare categories and a function to
mark a task with a previously declared category. <em>The tasks that are not
classified according to a category are not traced</em>. Even if the user
does not specify any category, the simulations can still be traced in terms
of resource utilization by using a special parameter that is detailed below.

\section tracing_tracing_enabling Enabling using CMake

With the sources of SimGrid, it is possible to enable the tracing
using the parameter <b>-Denable_tracing=ON</b> when cmake is
executed. The sections \ref instr_category_functions, \ref
instr_mark_functions, and \ref instr_uservariables_functions describe
all the functions available when this CMake option is
activated. These functions will have no effect if SimGrid is
configured without this option (they are wiped out by the
C preprocessor).

\verbatim
$ cmake -Denable_tracing=ON .
\endverbatim
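
One common way for tracing calls to vanish from a build is conditional compilation. The sketch below illustrates that general technique only: the flag `DEMO_TRACING_ENABLED` and the names `DEMO_TRACE_MARK`, `demo_trace_mark` and `run_demo` are hypothetical and are not SimGrid's actual internals.

```c
#include <stdio.h>

/* Hypothetical flag and macro names, for illustration only;
 * they do not reflect how SimGrid itself is implemented. */
static int trace_calls = 0;

#ifdef DEMO_TRACING_ENABLED
/* Tracing compiled in: each call is counted and printed. */
static void demo_trace_mark(const char *type, const char *value)
{
  trace_calls++;
  printf("mark %s=%s\n", type, value);
}
#define DEMO_TRACE_MARK(type, value) demo_trace_mark((type), (value))
#else
/* Tracing compiled out: the call expands to an empty expression. */
#define DEMO_TRACE_MARK(type, value) ((void)0)
#endif

/* Returns the number of marks that were actually recorded. */
static int run_demo(void)
{
  DEMO_TRACE_MARK("phase", "start");
  DEMO_TRACE_MARK("phase", "end");
  return trace_calls;
}
```

Built without `-DDEMO_TRACING_ENABLED`, `run_demo()` records nothing; built with it, both marks are counted. The calling code stays the same in both cases, which is the property the paragraph above relies on.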

\section instr_category_functions Tracing categories functions
\li \c TRACE_category(const char *category)
\li \c TRACE_category_with_color(const char *category, const char *color)
\li \c MSG_task_set_category(msg_task_t task, const char *category)
\li \c MSG_task_get_category(msg_task_t task)
\li \c SD_task_set_category(SD_task_t task, const char *category)
\li \c SD_task_get_category(SD_task_t task)

\section instr_mark_functions Tracing marks functions
\li \c TRACE_declare_mark(const char *mark_type)
\li \c TRACE_mark(const char *mark_type, const char *mark_value)

\section instr_uservariables_functions Tracing user variables functions

For hosts:

\li \c TRACE_host_variable_declare(const char *variable)
\li \c TRACE_host_variable_declare_with_color(const char *variable, const char *color)
\li \c TRACE_host_variable_set(const char *host, const char *variable, double value)
\li \c TRACE_host_variable_add(const char *host, const char *variable, double value)
\li \c TRACE_host_variable_sub(const char *host, const char *variable, double value)
\li \c TRACE_host_variable_set_with_time(double time, const char *host, const char *variable, double value)
\li \c TRACE_host_variable_add_with_time(double time, const char *host, const char *variable, double value)
\li \c TRACE_host_variable_sub_with_time(double time, const char *host, const char *variable, double value)

For links:

\li \c TRACE_link_variable_declare(const char *variable)
\li \c TRACE_link_variable_declare_with_color(const char *variable, const char *color)
\li \c TRACE_link_variable_set(const char *link, const char *variable, double value)
\li \c TRACE_link_variable_add(const char *link, const char *variable, double value)
\li \c TRACE_link_variable_sub(const char *link, const char *variable, double value)
\li \c TRACE_link_variable_set_with_time(double time, const char *link, const char *variable, double value)
\li \c TRACE_link_variable_add_with_time(double time, const char *link, const char *variable, double value)
\li \c TRACE_link_variable_sub_with_time(double time, const char *link, const char *variable, double value)

For links, using a source and a destination to get the route:

\li \c TRACE_link_srcdst_variable_set(const char *src, const char *dst, const char *variable, double value)
\li \c TRACE_link_srcdst_variable_add(const char *src, const char *dst, const char *variable, double value)
\li \c TRACE_link_srcdst_variable_sub(const char *src, const char *dst, const char *variable, double value)
\li \c TRACE_link_srcdst_variable_set_with_time(double time, const char *src, const char *dst, const char *variable, double value)
\li \c TRACE_link_srcdst_variable_add_with_time(double time, const char *src, const char *dst, const char *variable, double value)
\li \c TRACE_link_srcdst_variable_sub_with_time(double time, const char *src, const char *dst, const char *variable, double value)
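
The set/add/sub naming in the lists above suggests the following semantics: `set` overwrites the current value of a variable, while `add` and `sub` change it relative to the current value, each change being stamped with a time. The self-contained model below only illustrates that reading; it does not call SimGrid, and every name in it (`demo_variable_t`, `demo_set`, `demo_add`, `demo_sub`, `demo_scenario`) is hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_EVENTS 16

/* One timestamped change of a user variable (illustrative model only). */
typedef struct {
  double time;
  double value;
} demo_event_t;

/* A user variable attached to one resource, with its history of changes. */
typedef struct {
  double current;
  size_t count;
  demo_event_t events[DEMO_MAX_EVENTS];
} demo_variable_t;

/* Stores the new value and logs it, as long as the history has room. */
static void demo_record(demo_variable_t *var, double time, double value)
{
  var->current = value;
  if (var->count < DEMO_MAX_EVENTS) {
    var->events[var->count].time = time;
    var->events[var->count].value = value;
    var->count++;
  }
}

/* "set" overwrites the value; "add" and "sub" are relative to it. */
static void demo_set(demo_variable_t *var, double time, double value)
{
  demo_record(var, time, value);
}

static void demo_add(demo_variable_t *var, double time, double value)
{
  demo_record(var, time, var->current + value);
}

static void demo_sub(demo_variable_t *var, double time, double value)
{
  demo_record(var, time, var->current - value);
}

/* Sets 10 at t=0, adds 5 at t=1.5, subtracts 2 at t=3; returns the final value. */
static double demo_scenario(void)
{
  demo_variable_t load = { 0 };
  demo_set(&load, 0.0, 10.0);
  demo_add(&load, 1.5, 5.0);
  demo_sub(&load, 3.0, 2.0);
  return load.current;
}
```

In a real simulator, the equivalent calls would presumably be a declaration with `TRACE_host_variable_declare`, followed by the `_set`, `_add` and `_sub` variants listed above.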

\section tracing_tracing_options Tracing configuration Options

To check which tracing options are available for your simulator, you
can run it with the option <b>--help-tracing</b>. These are the
options currently accepted by the tracing system of SimGrid. You
can use them by running your simulator with the <b>--cfg=</b> switch:

\li <b>\c tracing</b>:
  Safe switch. It activates (or deactivates) the tracing system.
  No other tracing options take effect if this one is not activated.

\li <b>\c tracing/categorized</b>:
  It activates the categorized resource utilization tracing. It should
  be enabled if tracing categories are used by this simulator.
\verbatim
--cfg=tracing/categorized:1
\endverbatim

\li <b>\c tracing/uncategorized</b>:
  It activates the uncategorized resource utilization tracing. Use it if
  this simulator does not use tracing categories and resource use has to be
  traced.
\verbatim
--cfg=tracing/uncategorized:1
\endverbatim

\li <b>\c tracing/filename</b>:
  A file with this name will be created to register the simulation. The file
  is in the Paje format and can be analyzed using the Triva or Paje visualization
  tools. More information can be found on these web pages:
  <a href="http://triva.gforge.inria.fr/">http://triva.gforge.inria.fr/</a>
  <a href="http://paje.sourceforge.net/">http://paje.sourceforge.net/</a>
\verbatim
--cfg=tracing/filename:mytracefile.trace
\endverbatim
  If you do not provide this parameter, the trace file will be named simgrid.trace.

\li <b>\c tracing/onelink_only</b>:
  By default, the tracing system uses all routes in the platform file
  to re-create a "graph" of the platform and register it in the trace file.
  This option lets the user tell the tracing system to use only the routes
  that are composed of just one link.
\verbatim
--cfg=tracing/onelink_only:1
\endverbatim

\li <b>\c tracing/smpi</b>:
  This option only has effect if this simulator is SMPI-based. It traces the MPI
  interface and generates a trace that can be analyzed using Gantt-like
  visualizations. Every MPI function (implemented by SMPI) is transformed into a
  state, and point-to-point communications can be analyzed with arrows.

\li <b>\c tracing/smpi/group</b>:
  This option only has effect if this simulator is SMPI-based. The processes
  are grouped by the hosts where they were executed.
\verbatim
--cfg=tracing/smpi/group:1
\endverbatim

\li <b>\c tracing/msg/process</b>:
  This option only has effect if this simulator is MSG-based. It traces the
  behavior of all categorized MSG processes, grouping them by hosts. This option
  can be used to track process location if this simulator has process migration.
\verbatim
--cfg=tracing/msg/process:1
\endverbatim

\li <b>\c tracing/buffer</b>:
  This option puts some events in a time-ordered buffer using the
  insertion sort algorithm. Acquiring and releasing the
  locks that protect this buffer, together with the cost of the sorting algorithm,
  makes this process slow. The simulator performance can be severely
  impacted if this option is activated, but you are sure to get a trace
  file with sorted events.
\verbatim
--cfg=tracing/buffer:1
\endverbatim
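
To make the cost of this option more concrete, here is a minimal sketch of inserting events into a time-ordered buffer with insertion sort. It is not SimGrid's code: the types and functions (`buf_event_t`, `event_buffer_t`, `buffer_insert`, `buffer_is_sorted`, `demo_buffer`) are hypothetical, the capacity is fixed, and the locking mentioned above is left out.

```c
#include <stddef.h>

#define DEMO_CAPACITY 8

/* A trace event reduced to what matters for ordering (illustrative). */
typedef struct {
  double timestamp;
  int id;
} buf_event_t;

typedef struct {
  buf_event_t items[DEMO_CAPACITY];
  size_t size;
} event_buffer_t;

/* Inserts ev so that items stay sorted by timestamp.
 * Later events are shifted one slot to the right, so one insertion
 * costs O(n) in the worst case. Returns 0 on success, -1 if full. */
static int buffer_insert(event_buffer_t *buf, buf_event_t ev)
{
  if (buf->size == DEMO_CAPACITY)
    return -1;
  size_t pos = buf->size;
  while (pos > 0 && buf->items[pos - 1].timestamp > ev.timestamp) {
    buf->items[pos] = buf->items[pos - 1];
    pos--;
  }
  buf->items[pos] = ev;
  buf->size++;
  return 0;
}

/* Returns 1 if timestamps never decrease from one item to the next. */
static int buffer_is_sorted(const event_buffer_t *buf)
{
  for (size_t i = 1; i < buf->size; i++)
    if (buf->items[i - 1].timestamp > buf->items[i].timestamp)
      return 0;
  return 1;
}

/* Feeds out-of-order timestamps and checks the resulting order. */
static int demo_buffer(void)
{
  event_buffer_t buf = { .size = 0 };
  const double stamps[] = { 2.0, 0.5, 1.0, 3.0, 0.5 };
  for (size_t i = 0; i < sizeof stamps / sizeof stamps[0]; i++) {
    buf_event_t ev = { stamps[i], (int)i };
    if (buffer_insert(&buf, ev) != 0)
      return 0;
  }
  /* Events with equal timestamps keep their arrival order (ids 1 then 4). */
  return buffer_is_sorted(&buf) && buf.size == 5
         && buf.items[0].id == 1 && buf.items[1].id == 4;
}
```

Each insertion may shift every event already stored, which is why a long buffer combined with frequently out-of-order events can slow the simulation noticeably.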

\li <b>\c tracing/onelink_only</b>:
  This option changes the way SimGrid registers its platform in the trace
  file. Normally, the tracing considers all routes (no matter their
  size) in the platform file to re-create the resource topology. If this
  option is activated, only the routes with one link are used to
  register the topology within an AS. Routes among AS continue to be
  traced as usual.
\verbatim
--cfg=tracing/onelink_only:1
\endverbatim

\li <b>\c tracing/disable_destroy</b>:
  Disable the destruction of containers at the end of the simulation. This
  can be used with simulators that have a different notion of time
  (different from the simulated time).
\verbatim
--cfg=tracing/disable_destroy:1
\endverbatim

\li <b>\c tracing/basic</b>:
  Some visualization tools are not able to parse the Paje file format correctly.
  Use this option if you are using one of these tools to visualize the simulation
  trace. Keep in mind that the trace might be incomplete, without all the
  information that would be registered otherwise.
\verbatim
--cfg=tracing/basic:1
\endverbatim

\li <b>\c tracing/comment</b>:
  Use this to add a comment line to the top of the trace file.
\verbatim
--cfg=tracing/comment:my_string
\endverbatim

\li <b>\c tracing/comment_file</b>:
  Use this to add the contents of a file to the top of the trace file as a comment.
\verbatim
--cfg=tracing/comment_file:textual_file.txt
\endverbatim

\li <b>\c triva/categorized</b>:
  This option generates a graph configuration file for Triva considering
  categorized resource utilization.
\verbatim
--cfg=triva/categorized:graph_categorized.plist
\endverbatim

\li <b>\c triva/uncategorized</b>:
  This option generates a graph configuration file for Triva considering
  uncategorized resource utilization.
\verbatim
--cfg=triva/uncategorized:graph_uncategorized.plist
\endverbatim

\section tracing_tracing_example_parameters Case studies

Here are some scenarios that might help you decide which tracing options
you should use to analyze your simulator.

\li I want to trace the resource utilization of all hosts
and links of the platform, and my simulator <b>does not</b> use
the tracing API. For that, you can run an uncategorized trace
with the following parameters (it will work with <b>any</b> SimGrid
simulator):
\verbatim
./your_simulator \
          --cfg=tracing:1 \
          --cfg=tracing/uncategorized:1 \
          --cfg=tracing/filename:mytracefile.trace \
          --cfg=triva/uncategorized:uncat.plist
\endverbatim

\li I want to trace only a subset of my MSG (or SimDAG) tasks.
For that, you will need to create tracing categories using the
<b>TRACE_category (...)</b> function (as explained above),
and then assign your tasks to a previously declared category
using the <b>MSG_task_set_category (...)</b> function
(or <b>SD_task_set_category (...)</b> for SimDAG tasks). After
recompiling, run your simulator with the following parameters:
\verbatim
./your_simulator \
          --cfg=tracing:1 \
          --cfg=tracing/categorized:1 \
          --cfg=tracing/filename:mytracefile.trace \
          --cfg=triva/categorized:cat.plist
\endverbatim

\section tracing_tracing_example Example of Instrumentation

A simplified example using the mandatory tracing functions.

\verbatim
#include "msg/msg.h" /* MSG header of SimGrid 3.x; the path may differ in other releases */

int main (int argc, char **argv)
{
  MSG_init (&argc, &argv);

  //(... after deployment ...)

  //note that category declaration must be called after MSG_create_environment
  TRACE_category_with_color ("request", "1 0 0");       // color given as "R G B", here red
  TRACE_category_with_color ("computation", "0.3 1 0.4");
  TRACE_category ("finalize");

  //tasks are only traced once they are assigned to a declared category
  msg_task_t req1 = MSG_task_create("1st_request_task", 10, 10, NULL);
  msg_task_t req2 = MSG_task_create("2nd_request_task", 10, 10, NULL);
  msg_task_t req3 = MSG_task_create("3rd_request_task", 10, 10, NULL);
  msg_task_t req4 = MSG_task_create("4th_request_task", 10, 10, NULL);
  MSG_task_set_category (req1, "request");
  MSG_task_set_category (req2, "request");
  MSG_task_set_category (req3, "request");
  MSG_task_set_category (req4, "request");

  msg_task_t comp = MSG_task_create ("comp_task", 100, 100, NULL);
  MSG_task_set_category (comp, "computation");

  msg_task_t finalize = MSG_task_create ("finalize", 0, 0, NULL);
  MSG_task_set_category (finalize, "finalize");

  //(...)

  return 0;
}
\endverbatim

\section tracing_tracing_analyzing Analyzing the SimGrid Traces

During an instrumented simulation, the SimGrid library creates a trace file in
the Paje file format that contains the platform utilization for the simulation
that was executed. The visual analysis of this file is performed with the
visualization tool <a href="http://triva.gforge.inria.fr">Triva</a>, with
special configurations tuned to SimGrid needs. This part of the documentation
explains how to configure and use Triva to analyze a SimGrid trace file.

- <b>Installing Triva</b>: the tool is available on the Inria Forge,
  at <a href="http://triva.gforge.inria.fr">http://triva.gforge.inria.fr</a>.
  Use the following command to get the sources, and then check the file
  <i>INSTALL</i>. This file contains instructions to install
  the tool's dependencies on Ubuntu/Debian Linux. The tool can also
  be compiled natively on Mac OS X; see the <i>INSTALL.mac</i> file.
\verbatim
$ git clone git://scm.gforge.inria.fr/triva/triva.git
\endverbatim

- <b>Executing Triva</b>: a binary called <i>Triva</i> is available after the
  installation (you can execute it with <em>--help</em> to check its
  options). If the Triva binary is not available after following the
  installation instructions, you may need to initialize the GNUstep
  environment variables. We strongly recommend that you
  use the latest GNUstep packages, and not the packages available through apt-get
  in Ubuntu/Debian packaging systems. If you install GNUstep using the latest
  available packages, you can execute this command:
\verbatim
$ source /usr/GNUstep/System/Library/Makefiles/GNUstep.sh
\endverbatim
  You should be able to see this output after the installation of Triva:
\verbatim
$ ./Triva.app/Triva --help
Usage: Triva [OPTIONS...] TRACE0 [TRACE1]
Trace Analysis through Visualization

      --ti_frequency {double}    Animation: frequency of updates
      --ti_hide                  Hide the TimeInterval window
      --ti_forward {double}      Animation: value to move time-slice
      --ti_apply                 Apply the configuration
      --ti_update                Update on slider change
      --ti_animate               Start animation
      --ti_start {double}        Start of time slice
      --ti_size {double}         Size of time slice

      --comparison               Compare Trace Files (Experimental)
      --graph                    Configurable Graph
      --list                     Print Trace Type Hierarchy
      --hierarchy                Export Trace Type Hierarchy (dot)
      --stat                     Trace Statistics and Memory Utilization
      --instances                List All Trace Entities
      --linkview                 Link View (Experimental)
      --treemap                  Squarified Treemap
      --merge                    Merge Trace Files (Experimental)
      --check                    Check Trace File Integrity

      --gc_conf {file}           Graph Configuration in Property List Format
      --gc_apply                 Apply the configuration
      --gc_hide                  Hide the GraphConfiguration window
\endverbatim
  Triva expects the user to choose one of the available options
  (currently <em>--graph</em> or <em>--treemap</em> for a visualization analysis)
  and to provide the trace file from the simulation.

- <b>Understanding Triva - time-slice</b>: the analysis of a trace file using
  the tool always takes into account the concept of the <em>time-slice</em>.
  This means that what is visualized on the screen is always
  calculated considering a specific time frame, with its beginning and end
  timestamps. The time-slice is configured by the user and can be changed
  dynamically through the window called <em>Time Interval</em> that is opened
  whenever a trace file is being analyzed. The next figure depicts the time-slice
  configuration window.
  At the top of the window, in the space named <i>Trace Time</i>,
  the two fields show the beginning of the trace (which usually starts at 0) and
  the end (which depends on the time simulated by SimGrid). The middle of the
  window, in the square named <i>Time Slice Configuration</i>, contains the
  aspects related to the time-slice, including its <i>start</i> and its
  <i>size</i>. The gray rectangle at the bottom of this part indicates the
  <i>current time-slice</i> that is considered for the drawings. If the checkbox
  <i>Update Drawings on Sliders Change</i> is not selected, the button
  <i>Apply</i> must be clicked in order to inform Triva that the
  new time-slice must be considered. The bottom part of the window, in the space
  indicated by the square <i>Time Slice Animation</i>, can be used to advance
  the time-frame automatically. The user configures the amount of time by which the
  time-frame moves forward and how frequently this update happens. Once this is
  configured, the user clicks the <i>Play</i> button in order to see the dynamic
  changes in the drawings.

<a href="triva-time_interval.png" border=0><img src="triva-time_interval.png" width="50%" border=0></a>

  <b>Remarks:</b> when the trace has too many hosts or links, the computation
  needed to take a new time-slice into account can be expensive. When this happens,
  Triva may not keep up with the <i>Frequency</i> parameter, nor with the updates
  triggered by configuration changes when the checkbox <i>Update Drawings on Sliders
  Change</i> is selected.
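
To give an intuition of what "calculated considering a specific time frame" can mean, the sketch below computes the time-weighted mean of a traced variable over one time-slice. It rests on assumptions: the variable is taken to be piecewise constant between samples, and the time-weighted mean is only one plausible way to summarize a slice, not necessarily what Triva computes. The names `sample_t`, `slice_average` and `demo_slice` are hypothetical.

```c
#include <stddef.h>

/* One change of a traced variable: it holds `value` from `time` on. */
typedef struct {
  double time;
  double value;
} sample_t;

/* Time-weighted mean of a piecewise-constant variable over the
 * time-slice [start, start + size). Samples must be sorted by time;
 * the last sample holds until trace_end. Returns 0 for an empty slice. */
static double slice_average(const sample_t *samples, size_t n,
                            double trace_end, double start, double size)
{
  double end = start + size;
  double integral = 0.0;
  if (size <= 0.0)
    return 0.0;
  for (size_t i = 0; i < n; i++) {
    double seg_begin = samples[i].time;
    double seg_end = (i + 1 < n) ? samples[i + 1].time : trace_end;
    /* Clip the segment to the slice before accumulating. */
    double lo = seg_begin > start ? seg_begin : start;
    double hi = seg_end < end ? seg_end : end;
    if (hi > lo)
      integral += samples[i].value * (hi - lo);
  }
  return integral / size;
}

/* power_used is 100 during [0,2), 50 during [2,6) and 0 during [6,10). */
static double demo_slice(void)
{
  const sample_t power_used[] = { { 0.0, 100.0 }, { 2.0, 50.0 }, { 6.0, 0.0 } };
  /* Slice [1,5): (100 * 1 + 50 * 3) / 4 */
  return slice_average(power_used, 3, 10.0, 1.0, 4.0);
}
```

Moving the slice start or changing its size only changes the clipping bounds, but every variable of every host and link has to be recomputed, which is consistent with the remark above about large traces.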

- <b>Understanding Triva - graph</b>: one possibility to analyze
  SimGrid traces is to use Triva's graph view, using the
  <em>--graph</em> parameter to activate this view, and
  <em>--gc_conf</em> with a graph configuration to customize the graph
  according to the traces. A valid graph configuration (we are using the
  <a href="http://en.wikipedia.org/wiki/Property_list">Property List
  Format</a> to describe the configuration) can be created for any
  SimGrid-based simulator using the
  <em>--cfg=triva/uncategorized:graph_uncategorized.plist</em> or
  <em>--cfg=triva/categorized:graph_categorized.plist</em> option (if the
  simulator defines resource utilization categories) when executing
  the simulator.

<b>Basic SimGrid Configuration</b>: The basic description of the configuration
is as follows:
\verbatim
  node = (LINK, HOST, );
  edge = (HOST-LINK, LINK-HOST, LINK-LINK, );
\endverbatim

The nodes of the graph will be created based on the <i>node</i>
parameter, which in this case lists the different <em>"HOST"</em>s and
<em>"LINK"</em>s of the simulated platform. The <i>edge</i>
parameter indicates that the edges of the graph will be created based
on the <em>"HOST-LINK"</em>s, <em>"LINK-HOST"</em>s, and
<em>"LINK-LINK"</em>s of the platform. After the definition of these
two parameters, the configuration must detail how the nodes
(<em>HOST</em>s and <em>LINK</em>s) should be drawn.

For that, the configuration must have an entry for each of
the types used. For <em>HOST</em>, as a basic configuration, we have:
\verbatim
    values = (power_used);
\endverbatim

The parameter <em>size</em> indicates which variable from the trace
file will be used to define the size of the node HOST in the
visualization. If the simulation was executed with availability
traces, the size of the nodes will change according to these
traces. The parameter <em>type</em> indicates which geometrical shape
will be used to represent HOST, and the <em>values</em> parameter
indicates which values from the trace will be used to fill the shape.

For <em>LINK</em> we have:
\verbatim
    values = (bandwidth_used);
\endverbatim

The same configuration parameters are used here: <em>type</em> (with a
rhombus), <em>size</em> (whose value comes from the trace's bandwidth
variable) and <em>values</em>.

<b>Customizing the Graph Representation</b>: Triva can handle
a customized graph representation based on the variables present in the trace
file. In the case of SimGrid, every time a category is created for tasks, two
variables are defined in the trace file: one to indicate node utilization (how
much power was used by that task category), and another to indicate link
utilization (how much bandwidth was used by that category). For instance, if the
user declares a category named <i>request</i>, there will be variables named
<b>p</b><i>request</i> and <b>b</b><i>request</i> (<b>p</b> for power and
<b>b</b> for bandwidth). It is important to notice that the variable
<i>prequest</i> in this case is only available for HOST, and
<i>brequest</i> is only available for LINK. <b>Example</b>: suppose there are
two categories for tasks: request and computation. To create a customized graph
representation with a proportional separation of host and link utilization, use
this as the configuration for HOST and LINK:
\verbatim
    values = (prequest, pcomputation);
    values = (brequest, bcomputation);
\endverbatim

This configuration enables the analysis of resource utilization by MSG tasks,
and the identification of load-balancing issues and network bottlenecks, for
instance.
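
The naming rule for category variables (a <b>p</b> prefix for HOST, a <b>b</b> prefix for LINK) can be sketched as a small helper that derives the names to put in the <em>values</em> lists. This is an illustration of the rule as stated in this section, not a SimGrid or Triva function; `category_variable` and `demo_names` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Builds the trace variable name for a category: prefix 'p' for the
 * power used on hosts, 'b' for the bandwidth used on links.
 * Returns 0 on success, -1 if the name does not fit in out. */
static int category_variable(char prefix, const char *category,
                             char *out, size_t out_size)
{
  int written = snprintf(out, out_size, "%c%s", prefix, category);
  return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}

/* Derives both variable names for the "request" category. */
static int demo_names(void)
{
  char host_var[64];
  char link_var[64];
  if (category_variable('p', "request", host_var, sizeof host_var) != 0)
    return 0;
  if (category_variable('b', "request", link_var, sizeof link_var) != 0)
    return 0;
  return strcmp(host_var, "prequest") == 0
         && strcmp(link_var, "brequest") == 0;
}
```

A simulator that declares its categories in one place could use such a helper to check that a hand-written graph configuration lists a variable for each declared category.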

<b>The Graph Visualization</b>: The next figure shows a graph visualization of a
given time-slice of the masterslave_forwarder example (present in the SimGrid
sources). The red color indicates tasks from the <i>compute</i> category. This
visualization was generated with the following configuration:
\verbatim
  node = (LINK, HOST, );
  edge = (HOST-LINK, LINK-HOST, LINK-LINK, );
    values = (pcompute, pfinalize);
    values = (bcompute, bfinalize);
\endverbatim

<a href="triva-graph_visualization.png" border=0><img src="triva-graph_visualization.png" width="50%" border=0></a>

- <b>Understanding Triva - colors</b>: Colors are now registered in
  trace files. See the tracing API for how to define them for your