docs/source/Design_goals.rst

   1 .. _design:
   2
   3 SimGrid's Design Goals
   4 ######################
   5
   6 Any SimGrid simulation comes down to a set of **actors** using some
   7 **resources** through **activities**. SimGrid provides several kinds of
   8 resources (link, CPU, disk, and synchronization objects such as mutex
   9 or semaphores) along with the corresponding activity kinds
  10 (communication, execution, I/O, lock). SimGrid users provide a
  11 platform instantiation (list of interconnected resources) and an
  12 application (the code executed by actors) along with the actors'
  13 placement on the platform.
  14
  15 The actors (ie, the user code) can only interact with the platform
  16 through activities, that are somewhat similar to synchronizations.
  17 Some are very natural (locking a mutex is a synchronization with the
  18 other actors using the same mutex) while others activities constitute
  19 more original synchronization: execution, communication, and I/O have a
  20 quantitative component that depends on the resources. But still,
  21 writing some data to disk is seen as a synchronization with the other
  22 actors using the same disk. When you lock a mutex, you can proceed
  23 only when that mutex gets unlocked by someone else. Similarly, when you
  24 do an I/O, you can proceed once the disk delivered enough performance
  25 to fulfill your demand (along with the concurrent demands of the other
  26 actors occurring at the same time). Communication activities have both a
  27 qualitative component (the actual communication starts only when both
  28 the sender and receiver are ready to proceed) and a quantitative
  29 component (consuming the communication power of the link resources).
  30
  31 The design of SimGrid is shaped by several design goals:
  32
  33  - **reproducibility**: re-executing the same simulation must lead to
  34    the exact same outcome, even if it runs on another computer or
  35    operating system. When possible, this should also be true when you
  36    use another version of SimGrid.
  37  - **sweet spot between accuracy and simulation speed**: running a given simulation should be as fast as possible but predict
  38    correct performance trends (or even provide accurate predictions when correctly calibrated).
  39  - **versatility**: ability to simulate many kinds of distributed systems
  40    and resource models. But the simulation should be parsimonious too,
  41    to not hinder the tool's usability. SimGrid tries to provide sane
  42    default settings along with the possibility to augment and modify
  43    the provided models and their default settings.
  44  - **scalability**: ability to deal with very large simulations. In the
  45    number of actors, in the size of the platform, in the number of
  46    events, or all together.
  47
  48 Actors and activities
  49 *********************
  50
  51 The first crux of the SimGrid design lays in the interaction between
  52 each actor and the activities. For the sake of reproducibility, the
  53 actors cannot interact directly with their environment: every
  54 modification request is serialized through a central component that
  55 processes them in a reproducible order. For the sake of speed, SimGrid
  56 is designed as an operating system: the actors issue **simcalls** to a
  57 simulation kernel that is called **maestro** (because it decides which
  58 actors can proceed and which ones must wait).
  59
  60 In practice, a SimGrid simulation is a suite of so-called **scheduling
  61 rounds**, during which all actors that are not currently blocked on a
  62 simcall get executed. For that, maestro passes the control flow to the
  63 code of each actor, that are written in either C++, C, Fortran or Python.
  64 The control flow then returns to the maestro when the actor
  65 blocks on its next blocking simcall. Note that the time it takes to
  66 execute the actor code has to be reported to the simulator using
  67 execution activities. SMPI programs are automatically benchmarked
  68 while these executions must be manually reported in S4U. The simulated
  69 time is discrete in SimGrid and only progresses between scheduling
  70 rounds, so all events occurring during a given scheduling round occur
  71 at the exact same simulated timestamp, even if the actors are usually
  72 executed sequentially on the real platform.
  73
  74 .. image:: img/design-scheduling-simulatedtime.svg
  75    :scale: 80%
  76    :align: center
  77
  78 To modify their environment, the actors issue either **immediate
  79 simcalls** that take no time in the simulation (e.g.: spawning another
  80 actor), or **blocking simcalls** that must wait for future events (e.g.:
  81 mutex locks require the mutex to be unlocked by its owner;
  82 communications wait for the network to provide enough communication
  83 performance to fulfill the demand). A given scheduling round is
  84 usually composed of several sub-scheduling rounds during which
  85 immediate simcalls are resolved. This ends when all actors are either
  86 terminated or within a blocking simcall. The simulation models are
  87 then used to compute the time at which the first next simcall
  88 terminates. The time is advanced to that point, and a new scheduling
  89 round begins with all actors that got unblocked at that timestamp.
  90
  91 Context switching between the actors and maestro is highly optimized
  92 for the sake of simulation performance. SimGrid provides several
  93 implementations of this mechanism, called **context factories**. These
  94 implementations fall into two categories: Preemptive contexts are
  95 based on standard system threads from the libstdc library.
  96 They are usually better supported by external
  97 debuggers and profiling tools, but less efficient. The most efficient
  98 factories use non-preemptive mechanisms, such as SysV's ucontexts,
  99 boost's context, or our own hand-tuned implementation, that is written
 100 in assembly language. This is possible because a given actor is never
 101 interrupted between consecutive simcalls in SimGrid.
 102
 103 .. image:: img/design-scheduling-wallclock.svg
 104    :scale: 80%
 105    :align: center
 106
 107 For the sake of performance, actors can be executed in parallel using several system threads which execute all actors in
 108 turn. But in our experience, this rarely leads to any performance improvement because most applications simulated on top of
 109 SimGrid are fine-grained: it's often not worth simulating actors in parallel because the amount of work of each actor is too
 110 small. This is because the users tend to abstract away any large computations to efficiently simulate the control flow of their
 111 application. In addition, parallel simulation puts unpleasant restrictions on the user code, that must be correctly isolated.
 112 For example, the existing SMPI implementation cannot be used in parallel yet.
 113
 114 .. image:: img/design-scheduling-parallel.svg
 115    :scale: 80%
 116    :align: center
 117
 118 Parsimonious model versatility
 119 ******************************
 120
 121 Another orthogonal crux of the SimGrid design is the parsimonious versatility in modeling. For that, we tend to unify all
 122 resource and activity kinds. As you have seen, we parallel the classical notion of **computational power** with the more
 123 original **communication power** and **I/O power**. Asynchronous executions are less common than the asynchronous communications
 124 that proliferate in MPI but they are still provided for sake of symmetry: they even prove useful to efficiently simulate thread
 125 pools. SimGrid also provides asynchronous mutex locks for symmetry. The notion of **pstate** was introduced to model the
 126 stepwise variation of computational speed depending on the DVFS, and was reused to model the bootup and shutdown phases of a
 127 CPU: the computational speed is 0 at these specific pstates. This pstate notion was extended to represent the fact that the
 128 bandwidth provided by a wifi link to a given station depends on its signal-noise ratio (SNR).
 129
 130 Further on this line, all provided resource models are very comparable internally. They :ref:`rely on linear inequation systems
 131 <models-lmm>`, stating for example that the sum of the computational power received by all computation activities located on a
 132 given CPU cannot overpass the computational power provided by this resource. This extends nicely to multi-resources activities
 133 such as communications using several links, and also to the so-called parallel tasks (abstract activities representing a
 134 parallel execution kernel consuming both the communication and computational power of a set of machines) or fluid I/O streams
 135 (abstract activities representing a data stream from disk to disk through the network). Specific coefficients are added to the
 136 linear system to reflect how the real resources are shared between concurrent usages. The resulting system is then solved using
 137 a max-min objective function that maximizes the minimum of all shares allocated to activities. Our experience shows that this
 138 approach can successfully be used for fast yet accurate simulations of complex phenomena, provided that the model's coefficients
 139 and constants are carefully :ref:`calibrated <models_calibration>`, i.e. tailored and instantiated to that phenomenon.
 140
 141 Model-checking
 142 **************
 143
 144 Even if it was not in the original goals of SimGrid, the framework now
 145 integrates a full-featured model-checker (dubbed MC or Mc SimGrid)
 146 that can exhaustively explore all execution paths that the application
 147 could experience. Conceptually, Mc SimGrid is built upon the ideas
 148 presented previously. Instead of using the resource models to compute
 149 the order simcall terminations, it explores every order that is
 150 causally possible. In a simulation entailing only three concurrent
 151 events (i.e., simcalls) A, B, and C, it will first explore the
 152 scenario where the activities order is ABC, then the ACB order, then
 153 BAC, then BCA, then CAB and finally CBA. Of course, the number of
 154 scenarios to explore grows exponentially with the number of simcalls
 155 in the simulation. Mc SimGrid leverages reduction techniques to avoid
 156 re-exploring equivalent traces.
 157
 158 In practice, Mc SimGrid can be used to verify classical `safety and
 159 liveness properties
 160 <https://en.wikipedia.org/wiki/Linear_time_property>`_, but also
 161 `communication determinism
 162 <https://hal.inria.fr/hal-01953167/document>`_, a property that allows
 163 more efficient solutions toward fault-tolerance. It can alleviate the
 164 state space explosion problem through `Dynamic Partial Ordering
 165 Reduction (DPOR)
 166 <https://en.wikipedia.org/wiki/Partial_order_reduction>`_ and `state
 167 equality <https://hal.inria.fr/hal-01900120/document>`_.
 168
 169 Mc SimGrid is more experimental than other parts of the framework, such as SMPI that can now be used to run many full-featured
 170 MPI codes out of the box, but it's constently improving.