-.. _platform:
-
.. raw:: html
- <object id="TOC" data="graphical-toc.svg" width="100%" type="image/svg+xml"></object>
+ <object id="TOC" data="graphical-toc.svg" type="image/svg+xml"></object>
<script>
window.onload=function() { // Wait for the SVG to be loaded before changing it
var elem=document.querySelector("#TOC").contentDocument.getElementById("PlatformBox")
<br/>
.. _howto:
-
+
Modeling Hints
##############
.. _howto_churn:
Modeling Churn (e.g., in P2P)
-****************************
+*****************************
One of the biggest challenges in P2P settings is to cope with the
churn, meaning that resources keep appearing and disappearing. In
Of course, this is only one possible way to model these things. YMMV ;)
+.. _howto_parallel_links:
+
+Modeling parallel links
+***********************
+
+Most HPC topologies, such as fat-trees, allow parallel links (a
+router A and a router B can be connected by more than one link).
+You might be tempted to model this configuration as follows:
+
+.. code-block:: xml
+
+ <router id="routerA"/>
+ <router id="routerB"/>
+
+ <link id="link1" bandwidth="10GBps" latency="2us"/>
+ <link id="link2" bandwidth="10GBps" latency="2us"/>
+
+ <route src="routerA" dst="routerB">
+ <link_ctn id="link1"/>
+ </route>
+ <route src="routerA" dst="routerB">
+ <link_ctn id="link2"/>
+ </route>
+
+But that will not work, since SimGrid doesn't allow several routes for
+a single `{src ; dst}` pair. Instead, what you should do is:
+
+ - Use a single route with both links (so both will be traversed
+ each time a message is exchanged between router A and B)
+
+ - Double the bandwidth of one link, to model the total bandwidth of
+ both links used in parallel. This will make sure no combined
+ communications between router A and B use more than the bandwidth
+ of two links
+
+ - Assign the other link a `FATPIPE` sharing policy, which will allow
+ several communications to use the full bandwidth of this link without
+ having to share it. This will model the fact that individual
+ communications can use at most this link's bandwidth
+
+ - Set the latency of one of the links to 0, so that latency is only
+ accounted for once (since both links are traversed by each message)
+
+So the final platform for our example becomes:
+
+.. code-block:: xml
+
+ <router id="routerA"/>
+ <router id="routerB"/>
+
+ <!-- This link limits the total bandwidth of all parallel communications -->
+ <link id="link1" bandwidth="20GBps" latency="2us"/>
+
+ <!-- This link only limits the bandwidth of individual communications -->
+ <link id="link2" bandwidth="10GBps" latency="0us" sharing_policy="FATPIPE"/>
+
+ <!-- Each message traverses both links -->
+ <route src="routerA" dst="routerB">
+ <link_ctn id="link1"/>
+ <link_ctn id="link2"/>
+ </route>
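+
+The same recipe should carry over to more than two parallel links. As a
+minimal sketch, assuming three identical 10GBps links between router A
+and B (a hypothetical variation of the example above, not a tested
+platform), only the bandwidth of the shared link changes:
+
+.. code-block:: xml
+
+   <router id="routerA"/>
+   <router id="routerB"/>
+
+   <!-- Assumption: three parallel 10GBps links, so the total is 30GBps -->
+   <link id="link1" bandwidth="30GBps" latency="2us"/>
+
+   <!-- Each individual communication is still capped at 10GBps -->
+   <link id="link2" bandwidth="10GBps" latency="0us" sharing_policy="FATPIPE"/>
+
+   <!-- Each message traverses both links -->
+   <route src="routerA" dst="routerB">
+     <link_ctn id="link1"/>
+     <link_ctn id="link2"/>
+   </route>
+
+With this sketch, up to three simultaneous communications would each get
+10GBps, and any further communication would share the 30GBps of `link1`.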
+
+.. _understanding_lv08:
+
+Understanding the default TCP model
+***********************************
+When simulating a data transfer between two hosts, you may be surprised
+by the obtained simulation time. Let's consider the following platform:
+
+.. code-block:: xml
+
+ <host id="A" speed="1Gf" />
+ <host id="B" speed="1Gf" />
+
+ <link id="link1" latency="10ms" bandwidth="1Mbps" />
+
+ <route src="A" dst="B">
+ <link_ctn id="link1" />
+ </route>
+
+If host `A` sends `100kB` (a hundred kilobytes) to host `B`, one could expect
+that this communication would take `0.81` seconds to complete according to a
+simple latency-plus-size-divided-by-bandwidth model (0.01 + 8e5/1e6 = 0.81).
+However, the default TCP model of SimGrid is a bit more complex than that. It
+accounts for three phenomena that directly impact the simulation time even
+on such a simple example:
+
+ - The size of a message at the application level (i.e., 100kB in this
+ example) is not the size that will actually be transferred over the
+ network. To mimic the fact that TCP and IP headers are added to each packet of
+ the original payload, the TCP model of SimGrid empirically considers that
+ `only 97% of the nominal bandwidth` is available. In other words, the
+ size of your message is increased by a few percent, whatever its size.
+
+ - In the real world, the TCP protocol is not able to fully exploit the
+ bandwidth of a link from the emission of the first packet. To reflect this
+ `slow start` phenomenon, the latency declared in the platform file is
+ multiplied by `a factor of 13.01`. Here again, this is an empirically
+ determined value that may not correspond to every TCP implementation on
+ every network. It can be tuned, though, when more realistic simulated times
+ for short messages are needed.
+
+ - When data is transferred from A to B, some TCP ACK messages travel in the
+ opposite direction. To reflect the impact of this `cross-traffic`, SimGrid
+ simulates a flow from B to A that represents an additional bandwidth
+ consumption of `0.05`. The route from B to A is implicitly declared in the
+ platform file and uses the same link `link1` as if the two hosts were
+ connected through a communication bus. The bandwidth share allocated to the
+ flow from A to B is then the available bandwidth of `link1` (i.e., 97% of
+ the nominal bandwidth of 1Mb/s) divided by 1.05 (i.e., the total consumption).
+ This feature, activated by default, can be disabled by adding the
+ `--cfg=network/crosstraffic:0` flag to the command line.
+
+As a consequence, the time to transfer 100kB from A to B as simulated by the
+default TCP model of SimGrid is not 0.81 seconds but
+
+.. code-block:: python
+
+ 0.01 * 13.01 + 800000 / ((0.97 * 1e6) / 1.05)  # = 0.996079 seconds
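+
+The computation above can be written in a general form. The symbols
+below are notation introduced here for illustration only, not names used
+by SimGrid: :math:`S` is the message size in bits, :math:`L` the declared
+latency in seconds, :math:`B` the nominal bandwidth in bits per second,
+:math:`\alpha = 13.01` the latency factor, :math:`\beta = 0.97` the
+bandwidth factor, and :math:`\gamma = 0.05` the cross-traffic consumption.
+
+.. math::
+
+   T = \alpha L + \frac{S}{\beta B / (1 + \gamma)}
+     = 13.01 \times 0.01 + \frac{8 \times 10^{5}}{0.97 \times 10^{6} / 1.05}
+     \approx 0.1301 + 0.8660 \approx 0.9961 \text{ seconds}
+
+Assuming cross-traffic is disabled (:math:`\gamma = 0`), the same
+expression would give roughly 0.1301 + 0.8247 = 0.9548 seconds.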