From: Martin Quinson
Date: Wed, 5 May 2021 15:43:52 +0000 (+0200)
Subject: Extend the description of the IB configuration
X-Git-Tag: v3.28~325
X-Git-Url: http://bilbo.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/commitdiff_plain/37dc62cfad50da4014f99f18f8a8dc24c994fe00

Extend the description of the IB configuration
---

diff --git a/docs/source/Configuring_SimGrid.rst b/docs/source/Configuring_SimGrid.rst
index 08e9d63473..a5497a827d 100644
--- a/docs/source/Configuring_SimGrid.rst
+++ b/docs/source/Configuring_SimGrid.rst
@@ -347,7 +347,6 @@ and you should use the last one, which is the maximal size.
    cat /proc/sys/net/ipv4/tcp_rmem # gives the sender window
    cat /proc/sys/net/ipv4/tcp_wmem # gives the receiver window
 
-.. _cfg=smpi/IB-penalty-factors:
 .. _cfg=network/bandwidth-factor:
 .. _cfg=network/latency-factor:
 .. _cfg=network/weight-S:
@@ -370,15 +369,63 @@ exchange. By default SMPI uses factors computed on the Stampede
 Supercomputer at TACC, with optimal deployment of processes on
 nodes. Again, only hardcore experts should bother about this fact.
 
-InfiniBand network behavior can be modeled through 3 parameters
-``smpi/IB-penalty-factors:"βe;βs;γs"``, as explained in `this PhD
-thesis
-`_.
 
 .. todo:: This section should be rewritten, and actually explain the
    options network/bandwidth-factor, network/latency-factor,
    network/weight-S.
 
+.. _cfg=smpi/IB-penalty-factors:
+
+Infiniband model
+^^^^^^^^^^^^^^^^
+
+InfiniBand network behavior can be modeled through 3 parameters
+``smpi/IB-penalty-factors:"βe;βs;γs"``, as explained in `this PhD
+thesis
+`_ (in French)
+or more concisely in `this paper `_,
+even if that paper does only describe models for myrinet and ethernet.
+You can see in Fig 2 some results for Infiniband, for example. This model
+may be outdated by now for modern infiniband, anyway, so a new
+validation would be good.
+
+The three parameters are defined as follows:
+
+- βs: penalty factor for outgoing messages, computed by running a simple send to
+  two nodes and checking slowdown compared to a single send to one node,
+  dividing by 2
+- βe: penalty factor for incoming messages, same computation method but with one
+  node receiving several messages
+- γr: slowdown factor when communication buffer memory is saturated. It needs a
+  more complicated pattern to run in order to be computed (5.3 in the thesis,
+  page 107), and formula in the end is γr = time(c)/(3×βe×time(ref)), where
+  time(ref) is the time of a single comm with no contention.
+
+Once these values are computed, a penalty is assessed for each message (this is
+the part implemented in the simulator) as shown page 106 of the thesis. Here is
+a simple translation of this text. First, some notations:
+
+- ∆e(e) which corresponds to the incoming degree of node e, that is to say the number of communications having as destination node e.
+- ∆s(s) which corresponds to the outgoing degree of node s, that is to say the number of communications sent by node s.
+- Φ(e) which corresponds to the number of communications destined for the node e but coming from a different node.
+- Ω(s, e) which corresponds to the number of messages coming from node s to node e. If node e only receives communications from different nodes then Φ(e) = ∆e(e). On the other hand if, for example, there are three messages coming from node s and going to node e then Φ(e) ≠ ∆e(e) and Ω(s, e) = 3
+
+To determine the penalty for a communication, two values need to be calculated. First, the penalty caused by the conflict in transmission, noted ps.
+
+
+- if ∆s (i) = 1 then ps = 1.
+- if ∆s (i) ≥ 2 and ∆e (i) ≥ 3 then ps = ∆s (i) × βs × γr
+- else, ps = ∆s (i) × βs
+
+
+Then, the penalty caused by the conflict in reception (noted pe) should be computed as follows:
+
+- if ∆e (i) = 1 then pe = 1
+- else, pe = Φ (e) × βe × Ω (s, e)
+
+Finally, the penalty associated with the communication is:
+p = max (ps ∈ s, pe)
+
 .. _cfg=network/crosstraffic:
 
 Simulating Cross-Traffic
@@ -1082,6 +1129,7 @@ https://framagit.org/simgrid/platform-calibration/
 https://simgrid.org/contrib/smpi-saturation-doc.html
 
 .. _cfg=smpi/display-timing:
+
 Reporting Simulation Time
 .........................
 
@@ -1099,8 +1147,9 @@ in application code and in SMPI internals, to provide hints about the
 need to use sampling to reduce simulation time.
 
 .. _cfg=smpi/display-allocs:
+
 Reporting memory allocations
-.........................
+............................
 
 **Option** ``smpi/display-allocs`` **Default:** 0 (false)
 
@@ -1457,8 +1506,9 @@ Then, you can pass the option
 actually activate the huge page support in shared mallocs.
 
 .. _cfg=smpi/auto-shared-malloc-thresh:
+
 Automatically share allocations
-.........................
+...............................
 
 **Option** ``smpi/auto-shared-malloc-thresh:`` **Default:** 0 (false)
 This value in bytes represents the size above which all allocations
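
Reviewer note, not part of the patch: the penalty rules documented in the new "Infiniband model" section can be read as the small Python sketch below. It is only an illustration under stated assumptions, not the simulator's implementation. The function name ``ib_penalties`` and its argument names are hypothetical. The sketch assumes that ∆s is evaluated at the sender and ∆e at the receiver of each communication, that Φ(e) counts the distinct nodes sending to e, and that "ps ∈ s" means the largest ps among all communications issued by the same sender.

```python
from collections import Counter


def ib_penalties(comms, beta_e, beta_s, gamma_r):
    """Return one penalty per communication in `comms`.

    `comms` is a list of (sender, receiver) pairs, one entry per message.
    Hypothetical helper: it follows the rules quoted in the patch under the
    assumptions listed in the note above.
    """
    out_deg = Counter(s for s, _ in comms)        # ∆s(s): messages sent by s
    in_deg = Counter(e for _, e in comms)         # ∆e(e): messages received by e
    omega = Counter(comms)                        # Ω(s, e): messages from s to e
    # Φ(e): number of distinct senders targeting e (assumed interpretation)
    phi = Counter(e for _, e in set(comms))

    def ps(s, e):
        # Penalty caused by the conflict in transmission
        if out_deg[s] == 1:
            return 1.0
        if in_deg[e] >= 3:                        # here ∆s(s) >= 2 already holds
            return out_deg[s] * beta_s * gamma_r
        return out_deg[s] * beta_s

    def pe(s, e):
        # Penalty caused by the conflict in reception
        if in_deg[e] == 1:
            return 1.0
        return phi[e] * beta_e * omega[(s, e)]

    # Largest transmission penalty among the messages of each sender
    worst_ps = {}
    for s, e in comms:
        worst_ps[s] = max(worst_ps.get(s, 0.0), ps(s, e))

    # p = max(ps over the sender's messages, pe)
    return [max(worst_ps[s], pe(s, e)) for s, e in comms]
```

For instance, a single uncontended message gets a penalty of 1, while two messages leaving the same node for two different receivers each get 2 × βs.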