Merge branch 'master' into klement

[simgrid.git] / docs / source / platform_howtos.rst
diff --git a/docs/source/platform_howtos.rst b/docs/source/platform_howtos.rst

index 1c4eaca2dd453f241a5b366ea041d72a9da52635..e7dc6206e98af5c8b2d64a8599baff0821347679 100644
--- a/docs/source/platform_howtos.rst
+++ b/docs/source/platform_howtos.rst
@@ -1,8 +1,6 @@
-.. _platform:
-
  .. raw:: html
  
-   <object id="TOC" data="graphical-toc.svg" width="100%" type="image/svg+xml"></object>
+   <object id="TOC" data="graphical-toc.svg" type="image/svg+xml"></object>
     <script>
     window.onload=function() { // Wait for the SVG to be loaded before changing it
       var elem=document.querySelector("#TOC").contentDocument.getElementById("PlatformBox")
@@ -13,7 +11,7 @@
     <br/>
  
  .. _howto:
-   
+
  Modeling Hints
  ##############
  
@@ -61,7 +59,7 @@ freely available, though.
  .. _howto_churn:
  
  Modeling Churn (e.g., in P2P)
-****************************
+*****************************
  
  One of the biggest challenges in P2P settings is to cope with the
  churn, meaning that resources keep appearing and disappearing. In
@@ -185,3 +183,122 @@ period and another one for the shutdown period.
  
  Of course, this is only one possible way to model these things. YMMV ;)
  
+.. _howto_parallel_links:
+
+Modeling parallel links
+***********************
+
+Most HPC topologies, such as fat-trees, allow parallel links (a
+router A and a router B can be connected by more than one link).
+You might be tempted to model this configuration as follows:
+
+.. code-block:: xml
+
+    <router id="routerA"/>
+    <router id="routerB"/>
+
+    <link id="link1" bandwidth="10GBps" latency="2us"/>
+    <link id="link2" bandwidth="10GBps" latency="2us"/>
+
+    <route src="routerA" dst="routerB">
+        <link_ctn id="link1"/>
+    </route>
+    <route src="routerA" dst="routerB">
+        <link_ctn id="link2"/>
+    </route>
+
+But that will not work, since SimGrid doesn't allow several routes for
+a single `{src ; dst}` pair. Instead, you should:
+
+  - Use a single route with both links (so both will be traversed
+    each time a message is exchanged between routers A and B).
+
+  - Double the bandwidth of one link, to model the total bandwidth of
+    both links used in parallel. This ensures that the combined
+    communications between routers A and B never use more than the
+    bandwidth of two links.
+
+  - Assign the other link a `FATPIPE` sharing policy, which allows
+    several communications to use the full bandwidth of this link without
+    having to share it. This models the fact that an individual
+    communication can use at most this link's bandwidth.
+
+  - Set the latency of one of the links to 0, so that latency is only
+    accounted for once (since both links are traversed by each message).
+
+So the final platform for our example becomes:
+
+.. code-block:: xml
+
+    <router id="routerA"/>
+    <router id="routerB"/>
+
+    <!-- This link limits the total bandwidth of all parallel communications -->
+    <link id="link1" bandwidth="20GBps" latency="2us"/>
+
+    <!-- This link only limits the bandwidth of individual communications -->
+    <link id="link2" bandwidth="10GBps" latency="0us" sharing_policy="FATPIPE"/>
+
+    <!-- Each message traverses both links -->
+    <route src="routerA" dst="routerB">
+        <link_ctn id="link1"/>
+        <link_ctn id="link2"/>
+    </route>
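
As a rough sanity check of the two-link trick above, the following Python sketch estimates the bandwidth that each of several concurrent flows would obtain between router A and router B. It assumes an idealized fair sharing of the shared link and ignores any correction applied by SimGrid's network models; the function name `per_flow_bandwidth` and its parameters are hypothetical and not part of SimGrid.

```python
def per_flow_bandwidth(n_flows, link_bw=10e9, n_links=2):
    """Idealized per-flow bandwidth (bytes/s) on the combined route.

    Assumption: the shared link is split evenly among the flows,
    and the FATPIPE link caps each flow individually.
    """
    if n_flows < 1:
        raise ValueError("n_flows must be at least 1")
    shared_cap = n_links * link_bw   # link1: total capacity of all parallel links
    fatpipe_cap = link_bw            # link2: limit for any single flow
    return min(fatpipe_cap, shared_cap / n_flows)


for n in (1, 2, 4):
    print(n, per_flow_bandwidth(n))
```

Under these assumptions, one or two flows each obtain the full 10 GBps of a single physical link, while four flows are limited to 5 GBps each by the 20 GBps aggregate.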
+
+.. _understanding_lv08:
+
+Understanding the default TCP model
+***********************************
+When simulating a data transfer between two hosts, you may be surprised
+by the resulting simulation time. Let's consider the following platform:
+
+.. code-block:: xml
+
+   <host id="A" speed="1Gf" />
+   <host id="B" speed="1Gf" />
+
+   <link id="link1" latency="10ms" bandwidth="1Mbps" />
+
+   <route src="A" dst="B">
+     <link_ctn id="link1" />
+   </route>
+
+If host `A` sends `100kB` (a hundred kilobytes) to host `B`, one could expect
+that this communication would take `0.81` seconds to complete according to a
+simple latency-plus-size-divided-by-bandwidth model (0.01 + 8e5/1e6 = 0.81).
+However, the default TCP model of SimGrid is a bit more complex than that. It
+accounts for three phenomena that directly impact the simulation time even
+on such a simple example:
+
+  - The size of a message at the application level (i.e., 100kB in this
+    example) is not the size that will actually be transferred over the
+    network. To mimic the fact that TCP and IP headers are added to each packet of
+    the original payload, the TCP model of SimGrid empirically considers that
+    `only 97% of the nominal bandwidth` is available. In other words, the
+    size of your message is increased by a few percent, whatever its size.
+
+  - In the real world, the TCP protocol is not able to fully exploit the
+    bandwidth of a link from the emission of the first packet. To reflect this
+    `slow start` phenomenon, the latency declared in the platform file is
+    multiplied by `a factor of 13.01`. Here again, this is an empirically
+    determined value that may not correspond to every TCP implementation on
+    every network. It can be tuned, though, when more realistic simulated
+    times for short messages are needed.
+
+  - When data is transferred from A to B, some TCP ACK messages travel in the
+    opposite direction. To reflect the impact of this `cross-traffic`, SimGrid
+    simulates a flow from B to A that represents an additional bandwidth
+    consumption of `0.05`. The route from B to A is implicitly declared in the
+    platform file and uses the same link `link1` as if the two hosts were
+    connected through a communication bus. The bandwidth share allocated to the
+    flow from A to B is then the available bandwidth of `link1` (i.e., 97% of
+    the nominal bandwidth of 1Mb/s) divided by 1.05 (i.e., the total consumption).
+    This feature, activated by default, can be disabled by adding the
+    `--cfg=network/crosstraffic:0` flag to the command line.
+
+As a consequence, the time to transfer 100kB from A to B as simulated by the
+default TCP model of SimGrid is not 0.81 seconds but
+
+.. code-block:: python
+
+    0.01 * 13.01 + 800000 / ((0.97 * 1e6) / 1.05)  # = 0.996079 seconds
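
The computation above can be wrapped in a small helper to experiment with other message sizes or link characteristics. This is only a sketch of the three corrections described in this section, for a single flow on a single link; the function name `lv08_transfer_time` and its parameter names are hypothetical, and the default factors are simply the values quoted in the text.

```python
def lv08_transfer_time(size_bytes, latency_s, bandwidth_bps,
                       bandwidth_factor=0.97, latency_factor=13.01,
                       crosstraffic=0.05):
    """Estimate the simulated time (s) of one transfer over one link.

    Assumption: a single flow, with the three corrections described
    above (usable bandwidth share, latency factor, cross-traffic).
    """
    size_bits = 8 * size_bytes
    # Usable bandwidth, further reduced by the reverse ACK flow
    effective_bw = bandwidth_factor * bandwidth_bps / (1 + crosstraffic)
    return latency_factor * latency_s + size_bits / effective_bw


# 100kB over a 1Mbps link with a 10ms latency, as in the example
print(round(lv08_transfer_time(100e3, 10e-3, 1e6), 6))
```

Passing `crosstraffic=0` to this helper mimics the effect of disabling cross-traffic, in which case only the bandwidth and latency factors remain.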