SimGrid 4 Release Notes
=======================

This page gathers some release notes, to shed light on the recent and current development of SimGrid.
Version 3.12 was the last pure release of SimGrid 3 while all versions starting with v3.13 can be seen as usable pre-versions of SimGrid 4.
Ancient versions are not documented here (the project started in 1998), but in the ChangeLog file only.

Version 3.13 (Apr 27. 2016)
---------------------------

The Half Release, a.k.a. the Zealous Easter Trim:

.. rst-class:: compact-list

* Remove half of the lines of code: v3.12 was 286k lines while v3.13 is only 142k lines. |br|
  Experimental untested unused "features" removed, and several parts were rewritten from C to C++.

* Introducing v4 of the XML platform format (many long-due cleanups)
* MSG examples fully reorganized (in C and Java)
* The S4U interface is rising, toward SimGrid 4: |br| All host manipulations now done in S4U,
  SimDag was mostly rewritten on top of S4U but MSG & SimDag interfaces mostly unchanged.

Version 3.14 (Dec 25. 2016)
---------------------------

This version (aka, the Christmas Pi) mainly introduces internal reorganization on the way to the future SimGrid 4 version.
These changes should be transparent to the users of the MSG, SMPI and SimDag interfaces, even if some buggy features were reworked
while some new features were added.

Users are expected to switch to this new version at their own pace, as we do not have the manpower to do any bug fixing on the old releases.
This version was tested on several Linux distributions, Mac OSX, Windows (restricted to the Java bindings when not using the Ubuntu
subsystem of Win10), FreeBSD and NetBSD.

.. rst-class:: compact-list

* Documentation reorganized and improved
* S4U interface further rising, toward SimGrid 4: |br|
  Routing code rewritten for readability; Virtual Machines almost turned into a plugin; MSG, SimDag, MPI interfaces mostly unchanged.
* The model-checker now works on NetBSD too.

Version 3.14.159 (Dec 29. 2016)
-------------------------------

The "Christmas Pi (better approximation)" release fixes various small glitches from the 3.14 version.

Version 3.15 (March 22 2017)
----------------------------

The Spring Release: continuous integration servers become green.

.. rst-class:: compact-list

* S4U: progress, integrating more parts of SimDag; New examples.
* SMPI: Support MPI 2.2; Convert internals to C++ (TBC).
* Java: Massive memleaks and performance issues fixed.
* Plus the usual bug fixes, cleanups and documentation improvements.

Users are expected to switch to this new version at their own pace, as we do not have the manpower to do any bug fixing on the old releases.
This version was tested on several Linux distributions, Mac OSX, Windows (restricted to our Java bindings), FreeBSD and NetBSD.
None of our 600+ integration tests is known to fail on any of these.

Version 3.16 (June 22 2017)
---------------------------

The Blooming Spring Release: developments are budding.

.. rst-class:: compact-list

* S4U: Progress; Activity refcounting is now automatic.
* XML: <AS> can now be named <zone> as they should.
* SMPI: Further performance improvements; RMA support.
* Cloud: Multi-core VMs (do not overcommit them yet)
* (+ bug fixes, cleanups and documentation improvements)

SMPI performance was further improved, using various OS-level magic to speed up our virtualization of the MPI code to be run. This allowed
Tom to run a simulation of HPL involving 10^6 processes... Wow. The dlopen privatization schema should even allow you to run your ranks
in parallel (even if it's not well tested yet).

On the Cloud side, we implemented multi-core VMs, which naturally act as containers of processes.
S4U, the future interface of SimGrid 4.0, also progressed quite a bit.

The Storage is currently being cleaned up by Fred, and some API changes are to be expected. We are sorry but the existing API is so crappy that
nobody ever used it, I guess. If you need it, please speak up soon!

We renamed AS into NetZone in the XML files (but the old one still works so no rush to update your platform). Since our platforms are
hierarchical, it makes no sense to name our zones "Autonomous Systems". Some other documentation pages got updated and modified. If
you see a particularly old, misleading or otherwise ugly doc page, please tell us and we'll fix it. Even typo reports are welcome.

But most of the work on S4U is not directly visible to the user. We revamped the whole code of activities (comms, execs, mutex, etc) to
not mix the applicative logic (calling test() or wait()) with the object refcounting. Now, you can test your comm object as many times as
you want. This change was really intrusive in our internals, and we're not done with restabilizing every bit, but we're on it.

Still on the S4U front, we managed to remove a few more XBT modules. We prefer to use the std or boost libraries nowadays, and switching
away from the XBT modules reduces our maintenance burden. Be warned that XBT will not always remain included in SimGrid.

On the infrastructure side, we are trying to set up a regular build task for the main projects using SimGrid, to check that our changes
don't break them. The one for StarPU is close to working (even if not completely). If you want to have your own code tested regularly
against the SimGrid git to warn us about breakage that we introduce, please come to us. We can grant you the right to do the needed config
in our Jenkins instance.

v3.16 also contains the usual bug fixes, such as the jarfile that should now work on Mac OSX (this time for real :) or the Java bindings
that should now be clear of any memory leak.

In addition, the future has already started. We have ongoing changesets that were not ready for 3.16 but should be part of 3.17:

.. rst-class:: compact-list

- Energy modeling for the network too
- New reduction algorithm for the model-checker, based on event folding structures
- Multi-model simulations, to specify a differing networking model for each netzone.

Version 3.17 (Oct 8. 2017)
--------------------------

This version is dubbed the "Drained Leaks" release, because almost no known memleak remains, despite testing.

.. rst-class:: compact-list

* Many many internal cleanups (almost 700 commits since 3.16).
* The coverage of our tests is above 80%.
* All memleaks but one plugged; a dozen bugs fixed.
* XBT: Further replace XBT with std::* constructs.

Version 3.18 (Dec. 24 2017)
---------------------------

This is an important version for SimGrid: MSG is now deprecated, and new projects should use S4U instead.
There is still some work to do before SimGrid 4: S4U is not ready for SimDag users yet unfortunately. This will come for sure.

Main changes in the "Ho Ho Ho! SimGrid 4 beta is coming to town" release:

.. rst-class:: compact-list

* Convert almost all interesting MSG examples to S4U.
* New model: energy consumption due to the network.
* Major cleanups in the disk and storage subsystems.
* (+ further deprecate XBT, bug fixes and doc improvement)

SimGrid 4 *may* be there by the next solstice.

Version 3.19 (March 21. 2018)
-----------------------------

In total, this "Moscovitly-cold Spring" release brings more than 500 commits made by 7 individuals over the last 3 months.

.. rst-class:: compact-list

* SMPI: Allow to start new actors and ranks after simulation start.
* SMPI: Support ICC, better testing on classical proxy apps.
* Some kernel headers are now installed, allowing external plugins.
* (+ the classical bug fixes and doc improvement)

Version 3.19.1 (March 22. 2018)
-------------------------------

As you may know, we are currently refactoring SimGrid in depth.
Upcoming SimGrid4 will be really different from SimGrid3: modular, standard and extensible vs. layered, homegrown and rigid. C++ vs. C.

Our goal is to smooth this transition, with backward compatibility and automatic update paths, while still progressing toward SimGrid4.

SimGrid remains open during works: The last pure SimGrid3 release was v3.12 while all subsequent versions are usable alpha versions of
SimGrid4: Existing interfaces remain unchanged, but the new S4U interface is budding and the internals are deeply reorganized.

Since 2015, we have worked hard to reduce the changes to public APIs. When we need to rename a public library symbol in S4U, we let your compiler
issue an explicit warning when you use the deprecated function. These messages remain for four releases, i.e. for one full year,
before turning into an error. Starting with v3.15, you can also adapt to API changes with the SIMGRID_VERSION macro, which is defined to
31500 for v3.15, to 31901 for v3.19.1 and so on.

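As a rough illustration of this numbering, the two values quoted above are consistent with a ``major * 10000 + minor * 100 + patch`` encoding. That formula is only inferred from these two examples, and the helper names below are hypothetical (they are not part of SimGrid).

```python
def encode_version(major: int, minor: int, patch: int = 0) -> int:
    """Pack a version triple into one integer (encoding inferred from the text)."""
    return major * 10000 + minor * 100 + patch


def decode_version(code: int) -> tuple:
    """Inverse of encode_version: split the integer back into (major, minor, patch)."""
    major, rest = divmod(code, 10000)
    minor, patch = divmod(rest, 100)
    return (major, minor, patch)


print(encode_version(3, 15))     # 31500
print(encode_version(3, 19, 1))  # 31901
print(decode_version(31901))     # (3, 19, 1)
```

In C or C++ code, the same comparison would typically be done by the preprocessor, for instance by testing whether ``SIMGRID_VERSION`` is at least 31500 before using a newer symbol.
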
Starting with this v3.19.1, our commitment to reduce the changes to the public interfaces is extended from the API to the ABI: a program
using only MSG or SimDag and compiled against a given version of SimGrid can probably be used with a later version of SimGrid without
recompilation. We will do our best... but don't expect too much of it, that's a really difficult goal during such profound refactoring.

The difference between v3.19 and v3.19.1 is that the former was accidentally breaking the ABI of MSG, while the latter is restoring the

S4U and kernel APIs will still evolve until SimGrid4, with one-year deprecation warnings as currently. In fact, cleaning up these
interfaces and converting them to snake_case() is one release goal of v3.20. But don't worry, we are working to smooth this upgrade path.

Once the S4U interface stabilizes, we will provide C bindings on top of it, along with Java and Python ones. Maybe in 3.21 or 3.22.

All this is not contradictory with the fact that MSG as a whole is deprecated, because this deprecation only means that new projects
should go for S4U instead of MSG to benefit from the future. Despite this deprecation, old MSG projects should still be usable with no
change, if we manage to. This is a matter of scientific reproducibility to us.

Version 3.20 (June 25 2018)
---------------------------

We were rather productive this season, with a total of 837 commits made by 8 individuals over the last 3 months.

The most visible change is the S4U API sanitizing. We were using an awful mix of snake_case and CamelCase, and we now use snake_case
everywhere. We apologize for the inconvenience, but it's for the sake of sanity. Plus, we put portability wrappers in place: you don't have to
change your code until v3.24 if you can live with warnings. The MSG API was not changed, of course.

The robustness of SMPI continues to improve. It was rock stable, and you can now use it to move the world (if your lever is long enough).
We now use several full-scale projects as nightly integration tests: StarPU, BigDFT and also 45 Proxy Apps from various collections.
https://framagit.org/simgrid/SMPI-proxy-apps

Main changes in the "proxy snake_case()" release are:

.. rst-class:: compact-list

* Sanitize the public API. Compatibility wrappers in place for one year.
* More CI: ~45 Proxy Apps + BigDFT + StarPU now tested nightly
* MPI: Port the trace replay engine to C++, fix visualization
* (+ the classical bug fixes and doc improvement)

Version 3.21 (October 5. 2018)
------------------------------

This release introduces a few nice features, but the most visible is certainly the new documentation. We started to completely overhaul it.
The result is still somewhat in progress, but we feel that it's much better already. We added a complete tutorial on S4U, we started a
tutorial on SMPI (still incomplete), we slightly improved the MSG and Java doc, and greatly improved the S4U doc. The section on writing
platform files is not converted in the new doc and you'll have to refer to the 3.20 documentation for that (sorry -- we ran out of time).

Please give us feedback on this new doc. We want to make it as useful to you as possible, but it's very hard without (constructive) feedback

Another big change is that we are currently moving our development from github to framagit. We thought that framagit is a better place to
develop an Open Source project such as ours. Head now to https://simgrid.org. You can still use github if you prefer to use closed source code ;)

Main changes of The Restarting Documentation (TRD) release:

.. rst-class:: compact-list

* Start to overhaul the documentation, and move to Sphinx + RTD.
* Allow dynamic replay of MPI apps, controlled by S4U actors
* Rewrite the support for auto-restarted actors (was utterly broken)
* (+ the classical bug fixes and doc improvement)

Version 3.22 (April 2. 2019)
----------------------------

The Easter Christmas Release. It was expected for Christmas, but I was so late that I even managed to miss the spring deadline.
This started to be a running joke, so I decided to release it for April 1. But I'm even late for this... Sorry :)

.. rst-class:: compact-list

* Introducing the Python bindings (still beta)
* Doc: SMPI tutorial and platform description ported to RTD
* Many internal cleanups leading to some user-level speedups
* (+ the classical bug fixes and internal refactorings)

The most visible change is certainly the new Python bindings. They are still rather experimental, and their API may change a bit in future
releases, but you are already welcome to test them. Many examples are now also available in Python, and the missing ones are on their way.

These new bindings further deprecate the old MSG and Java interfaces, which are still provided (and will remain so for a few years at least
for the existing users). Their examples are now hidden in deprecated/. Please switch to S4U if you like C++ or to Python if not.

This new version also introduces a heavy load of internal cleanups. Fred converted more internals to real C++, with more classes and less
procedural cruft. Henri and the other Wrench developers reported many bugs around activity canceling and resource failures, and we fixed
quite a few of them, but many dark snakes remain in that lake. Fred and Martin converted more doc to the new system (the platform chapter
is not finished, but it's not worse than the old one either) while Augustin completed the tutorial for MPI applications. Augustin also
added several non-blocking collectives to SMPI, even if Martin finally decided to release right before he could complete the last ones
(sorry). We continued cutting on XBT, replacing many functions and modules by their standard counterparts in C++11 or in Boost. We are
now using Catch2 for our unit testing. These cleanups may speed up your simulations by something like 10%.

Version 3.23 (June 25. 2019)
----------------------------

Main change in the "Exotic Solstice" Release:

.. rst-class:: compact-list

* Support for Solaris and Haiku OSes. Just for fun :)
* SMPI: more of MPI3.1; some MPI/IO and async collectives.
* Python bindings can now be installed from pip.
* (+ many many bug fixes and internal refactorings)

Version 3.24 (October 10. 2019)
-------------------------------

This is the Clean Disk Release:

.. rst-class:: compact-list

* Introduce an experimental Wifi network model.
* Introduce <disk> (cleaner logic than <storage>).
* SMPI: Implement Errhandlers and some more MPI3.1 calls.
* (+ many bug fixes and internal refactorings)

Since June, we continued our new release schema: v3.23.2 got released at some point as an interim release for people wanting something
between stable releases (tested on many systems but coming at most once per quarter) and git version (almost always working but you never
know). We plan to do so more often in the future, maybe with one interim version per month. Between interim versions, we use an odd
version number: v3.23.1 then 3.23.3 until yesterday, and soon 3.24.1.

As a user, there is no urgency to upgrade, even if you should not wait more than 9 months to upgrade to another stable version: our policy is
to keep backward compatibility and nice upgrading patches for 3 stable versions. v3.24 removes symbols that got deprecated in v3.20, last
year. It deprecates things that will continue to work until v3.27.

Speaking of deprecation, we would like to hear from you if you are using the Java bindings under Windows without the WSL installed.
Maintaining these native bindings is rather tedious, and we are wondering whether having Java+WSL would be sufficient.

In any case, please remember that we like to hear success stories, i.e. reports of the nice things you did with SimGrid. Not only bug
reports are welcome :)

Version 3.25 (Feb 2. 2020)
--------------------------

This is the "Palindrome Day" release (today is 02 02 2020).

.. rst-class:: compact-list

* Improve the Python usability (stability and documentation). |br|
  A nasty synchronization bug (due to a bad handling of the GIL) was ironed out, so that no known bug remains in Python examples.
  The Python documentation is now integrated with the C++ one, along with the C bindings that were previously not documented.
  The API documentation is now split by theme in the hope to keep it readable.

* Further deprecate MSG: you now have to pass -Denable_msg=ON to cmake. |br|
  This is OFF by default (also disabling the Java API that is still based on MSG).
  The plan is to completely remove MSG by 2020Q4 or 2021Q1.

* SimDAG++: Automatic dependencies on S4U activities (experimental). |br|
  This implements some features of SimDAG within S4U, but not all of them: you cannot block an activity until it's scheduled on a resource
  and there is no heterogeneous wait_any() that would mix Exec/Comm/Io activities. See ``examples/s4u/{io,exec,comm}-dependent`` for what's already there.

Since last fall, we continued to push toward the future SimGrid4 release. This requires removing MSG and SimDAG once all users have
migrated to S4U. The two old interfaces are still here, but this release gives another gentle incentive toward the migration. You now
have to explicitly ask for MSG to be compiled in (and it may be removed by Q42020 or Q12021 along with the current Java bindings), and
this release proposes a nice S4U replacement for some parts of SimDAG.

Also, since the last release, we got no answer from potential users of the Java bindings on Windows without the WSL installed. We will probably
drop this architecture in the near future, then. Simplifying our settings is mandatory to continue to push SimGrid forward.

Version 3.26 (Dec 16. 2020)
---------------------------

To celebrate the easing of the lockdown in France, we decided to bring another version of SimGrid to the world.
This is the "Release" release. Indeed a small contribution to the event, but this release was long overdue anyway.

.. rst-class:: compact-list

* SMPI: improved support of the proxy apps (including those using petsc)
* WiFi: easier description in XML; energy plugin; more examples.
* ns-3: Many bug fixes, can use the wifi models too.
* (+ many bug fixes, documentation improvement and internal refactoring)

Version 3.27 (March 29. 2021)
-----------------------------

To celebrate the 1176th anniversary of the siege of Paris by Vikings in 845, we just released another version of SimGrid, the Ragnar Release.
Yeah, that's a stupid release name, but we already had 4 "spring releases" in the past, so we needed another name.

.. rst-class:: compact-list

* SMPI: can now report leaks and hint about the mallocs and kernels hindering simulation scalability.
* Doc: Several new sections in the user manual, and start documenting the internals.
* S4U: Direct comms from host to host, without mailboxes.

In some sense, these changes are just the tip of the iceberg, as we had much refactoring and many internal cleanups in this release cycle too. Actually, we have 3
main ongoing refactorings that should bring us closer to SimGrid4, which will eventually occur.

The first change is dubbed SimDAG++. We want to make it possible to use S4U in the same spirit as SimDAG: centralized scheduling of tasks with dependencies. We
need to allow the maestro thread (the one that currently only calls engine->run() in the main) to create asynchronous activities, chain them by declaring
dependencies, and run the simulation until some event of interest occurs.

The previous release introduced inter-activity dependencies in S4U, and this release introduces direct host-to-host communications (bypassing the mailboxes), but we
are not there yet: maestro cannot create asynchronous activities, and there is no way to run the simulation up to a certain point only.

The second ongoing refactoring is about the platform creation. Our goal is to provide a C++ API to create your platform from your code, without relying on
XML. There is a real possibility that this one will be part of the 3.28 release, in three months. We'll see.

And the third front is about modernizing the model checker implementation. The current state is very difficult to work with, and we hope that once it's
simplified, we will be able to implement more efficient state space reduction techniques, and also allow more synchronization mechanisms to be used in the
model checker (for now, our dpor algorithm cannot cope with mutexes).

In parallel to these refactorings, the work on SMPI stability and robustness peacefully continued. The list of MPI applications that can now work with
absolutely no change on top of SMPI really gets impressive... Check it yourself: https://framagit.org/simgrid/SMPI-proxy-apps

If you want to speak about it (or other SimGrid-related matter), please join us on Mattermost: https://framateam.org/simgrid/channels/town-square
Come! You don't even have to accept the cookies for that!

Version 3.28 (July 14. 2021)
----------------------------

To celebrate the birthday of Crown Princess Victoria, we just released another version of SimGrid, the "Victoriadagarna" release.

.. rst-class:: compact-list

* Programmatic platform description (only C++ for now).
* New plugin to simplify producer/consumer applications.
* MC: new tutorial and associated docker image.
* SMPI: improve error handling for incorrect advanced usages.
* Many internal cleanups and refactoring to prepare the future.

As usual, even the full changelog is only the tip of the iceberg, given the amount of changes in the backstage.

This release is the big one for the programmatic platform descriptions, which are now fully usable from C++. XML will not
disappear anytime soon, but it is unlikely that we continue developing it in the future. When starting a new project, you should
probably go for the programmatic platforms. Or you could wait for the next release, where we hope to introduce the Python bindings of the
programmatic platforms. A platform in Python and an application in C++ may provide a better separation of concerns (once this becomes possible).

On the Model-Checking front, the code base did not evolve a lot, but we now provide a brand new tutorial and docker image for those wanting
to start using this feature. We are still not done with the refactoring required to unlock the future of Mc SimGrid and still
consider that it's less robust than the rest of SimGrid. We're working on it, and you may even find it useful as is anyway.

On the SimDag++ front (integrating the SimDAG interface to S4U), some work occurred in the backstage, but we were too busy with the
programmatic platforms to make this happen in this release. Maybe next season?

On the SMPI front, the work was on improving the usability. SMPI is now better at hinting the problem in buggy and almost-correct
applications, and it can assist the user in abstracting parts of the application to improve the simulation performance. Check the SMPI
tutorial for details.

Finally, we pursued our quest for a better codebase by following the hints of SonarCloud and other static analyzers. This is what it takes
to fight the technical debt and ensure that you'll still enjoy SimGrid in a decade. Along the same line, we removed the symbols that had been
deprecated for 3 releases, as usual.

Version 3.29 (October 7. 2021)
------------------------------

To celebrate the "Ask a stupid question" release, we wish that every user ask one question about SimGrid.
On `Mattermost <https://framateam.org/simgrid/channels/town-square>`_,
`Stack Overflow <https://stackoverflow.com/questions/tagged/simgrid>`_,
or using the `issues tracker <https://framagit.org/simgrid/simgrid/-/issues>`_.

.. rst-class:: compact-list

* Python bindings for the platform creation API
* Introduce non-linear resource sharing, allowing decay models
* New documentation section on realistic I/O modeling
* (+ many bug fixes and internal refactoring)

This release finishes the work on programmatic platforms, which had been ongoing since 3.27. It is now possible to define a complete platform in either C++
or Python, and the XML approach is now deprecated. It will probably remain around for a long time, but no evolution is planned. New features will not
be ported to the XML parser (unless you provide a patch, of course).

This release also paves the way for new models, with the introduction of two new features to the model solver:

.. rst-class:: compact-list

* Non-linear resource sharing was introduced, making it possible to model resources whose performance heavily degrades with contention. This may be used in the
  future for Wi-Fi links, where the total amount of data exchanged in a cell drops when the number of stations reaches a threshold.
* Dynamic factors model variability in the speed of activities. This can be used to model an overhead (e.g., there is a 20-byte header in a 480-byte
  TCP packet, so the factor is 0.9583) but the novelty is that this factor can now easily be adjusted depending on the activity and resource
  characteristics. |br|
  This existed for network (e.g., the effective bandwidth depends on the message in SMPI piecewise-linear network model) but it is now more general
  (the factor may depend on the source and destination and thus account for different behaviors for intra-node communications and extra-node
  communications) and is available for CPUs (e.g., if you want to model an affinity as in the "Unrelated Machines" problem in scheduling) and disks
  (e.g., if you want to model a stochastic capacity) too. |br|
  The same mechanism is also available for the latency, which makes it easy to introduce complex variability patterns.

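The arithmetic behind the header example, and the idea of a factor that depends on the endpoints, can be sketched as follows. This is only an illustration: the function names, the host names and the intra-node rule are assumptions made for this sketch, and it does not use the SimGrid API.

```python
def payload_factor(packet_bytes: int, header_bytes: int) -> float:
    """Fraction of each packet that carries useful payload."""
    return (packet_bytes - header_bytes) / packet_bytes


def bandwidth_factor(src_host: str, dst_host: str) -> float:
    """Hypothetical dynamic factor: no header overhead inside a node,
    TCP header overhead between distinct nodes."""
    if src_host == dst_host:
        return 1.0
    return payload_factor(480, 20)


nominal_bandwidth = 1e9  # bytes per second, arbitrary value for the example

print(round(payload_factor(480, 20), 4))                      # 0.9583
print(nominal_bandwidth * bandwidth_factor("node-1", "node-1"))  # 1000000000.0
print(round(nominal_bandwidth * bandwidth_factor("node-1", "node-2")))  # 958333333
```

The first printed value is the 0.9583 factor quoted in the text: 460 payload bytes out of 480.
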

These new features are not used yet in the provided models, but this will probably change in future releases.

Version 3.30 (January 30. 2022)
-------------------------------

The Sunday Bloody Sunday release.

In May 2016, the future organization of the S4U activities was drafted on a Hawaiian whiteboard. We defined the life cycle of activities, their types,
and the way to combine them. All of this had been implemented since, but one piece was still missing: the capacity to express dependencies and vetoes
that can prevent an activity from starting. The underlying idea was to be able to manage application DAGs, a la SimDag, through the S4U API, and have
maestro handle the execution of such DAGs.

This release finishes this work, which is presented in a new set of examples (``examples/cpp/dag-*``). The direct consequences on the code base of this

* The SimDag API for the simulation of the scheduling of Directed Acyclic Graphs has been finally dropped. It was marked as deprecated for a couple
* The removal of SimDag led us to also remove the export to Jedule files that was tightly coupled to SimDag. The instrumentation of DAG simulation
  is still possible through the regular instrumentation API based on the Paje format.

On the bindings front, we dropped the Lua bindings to create new platforms, as the C++ and Python interfaces are much better for that purpose.
Also, the algorithm tutorial can now be taken in Python, for those of you allergic to C++.

Finally, on the SMPI front, we introduced a :ref:`new documentation section <models_calibration>` on calibrating the SMPI models from your
measurements and fixed some issues with the replay mechanism.

Version 3.31 (March 22. 2022)
-----------------------------

**On the model checking front**, the long awaited big bang finally occurred, greatly simplifying future evolution.

A formal verification with Mc SimGrid implies two processes: a verified application that is an almost regular SimGrid simulation, and a checker that
is an external process guiding the verified application to ensure that it explores every possible execution scenario. When formal verification was
initially introduced in SimGrid 15 years ago, both processes were intertwined in the same system process, but the mandated system tricks made it
impossible to use gdb or valgrind on that Frankenstein process. Having two heaps in one process is not usually supported.

The design was simplified in v3.12 (2015) by splitting the application and the checker in separate system processes. But both processes remained tightly
coupled: when the checker needed some information (such as the mailbox implied in a send operation, to compute whether this operation `commutes
with another one <https://en.wikipedia.org/wiki/Partial_order_reduction>`_), the checker was directly reading the memory of the other system process.
This was efficient and nice in C, but it prevented us from using C++ features such as opaque ``std::function`` data types. As such, it hindered the
ongoing SimDAG++ code reorganization toward SimGrid4, where all activity classes should be homogeneously written in modern C++.

This release introduces a new design, where the simcalls are given object-oriented ``Observers`` that can serialize the relevant information over the wire.
This information is used on the checker side to build ``Transition`` objects that represent the application's simcalls. The checker code is now much simpler, as the
formal logic is not spoiled with system-level tricks to retrieve the needed information.

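The split described above can be pictured with the toy sketch below. It is purely illustrative: the class names, the JSON wire format and the dependency rule are assumptions made for this example, and they do not reproduce the actual SimGrid classes or protocol.

```python
import json
from dataclasses import dataclass


@dataclass
class SendObserver:
    """Application side (hypothetical): describes a pending send simcall."""
    actor: int
    mailbox: int

    def serialize(self) -> str:
        # Only plain data crosses the process boundary, never pointers.
        return json.dumps({"kind": "send", "actor": self.actor, "mailbox": self.mailbox})


@dataclass
class Transition:
    """Checker side (hypothetical): rebuilt from the serialized description."""
    kind: str
    actor: int
    mailbox: int

    @classmethod
    def deserialize(cls, payload: str) -> "Transition":
        data = json.loads(payload)
        return cls(data["kind"], data["actor"], data["mailbox"])

    def depends(self, other: "Transition") -> bool:
        # Simplified rule for the sketch: operations of the same actor or on
        # the same mailbox are assumed not to commute.
        return self.actor == other.actor or self.mailbox == other.mailbox


t1 = Transition.deserialize(SendObserver(actor=1, mailbox=7).serialize())
t2 = Transition.deserialize(SendObserver(actor=2, mailbox=7).serialize())
t3 = Transition.deserialize(SendObserver(actor=3, mailbox=9).serialize())

print(t1.depends(t2))  # True: same mailbox
print(t1.depends(t3))  # False: different actor and mailbox
```

The point of the design is visible even in this toy version: the checker reasons on its own ``Transition`` objects and never needs to read the memory of the application process.
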

This cleaner design allowed us to finally implement the support for mutexes, semaphores and barriers in the model-checker (condition variables are still
missing). This enables in particular the verification of RMA primitives with Mc SimGrid, as their implementation in SMPI is based on mutexes and barriers.
Simix, a central element of the SimGrid 3 design, was also finally removed: the last bits are deprecated and will be removed in 3.35. We also replaced the
old, non-free ISP test suite by the one from the `MPI Bug Initiative <https://hal.archives-ouvertes.fr/hal-03474762>`_ (not all tests are activated yet).
This will eventually help improve the robustness of Mc SimGrid.

These changes unlock the future of Mc SimGrid. For the next releases, we plan to implement another exploration algorithm based on event unfoldings (using
`The Anh Pham's thesis <https://tel.archives-ouvertes.fr/tel-02462074>`_), the exploration of scenarios where the actors get killed and/or where
communications time out, and the addition of a `wrapper to pthreads <https://hal.inria.fr/hal-02449080>`_, opening the path to the verification of classical
multithreaded applications.

503 **On the model front,** we continued our quest for the modeling of parallel tasks (ptasks for short). Parallel tasks are intended to be an extension
504 of the max-min fairness model (that computes the sharing of communication flows or computation tasks) to tasks mixing resource kinds (e.g., an MPI
505 computational kernel with computations and communications, or a video stream with IO read, network transfer and decompression on the CPU). Just
506 specify the amount of computation for each involved host, the amount of data to transfer between each host pair, and you're set. The model will
507 identify bottleneck resources and fairly share them across activities within a ptask. From a user-level perspective, SimGrid handles ptasks just like
508 every other activity except that the usual SimGrid models (LV08 or SMPI) rely on an optimized algorithm that cannot handle ptasks. You must
509 activate :ref:`the L07 model <s4u_ex_ptasks>` on :ref:`the command line <options_model_select>`. This "model" has remained a sort of hack since its introduction 15 years ago, as
510 it has never been well defined. We never succeeded in unifying L07 and max-min based models: fairness is still to be defined in this context that mixes
511 flops and communicated bytes. The resulting activity rates are then specific to ptasks. Furthermore, unlike our network models, this model was not
512 thoroughly validated with respect to real experiments before `the thesis of Adrien Faure <https://tel.archives-ouvertes.fr/tel-03155702>`_ (and the
513 outcome was quite disappointing). Recent articles by Bonald and Roberts `properly define <https://hal.inria.fr/hal-01243985>`_ the allocation
514 objective we had in mind (under the name Bounded MaxMin Fairness -- BMF) and `study the convergence <https://hal.archives-ouvertes.fr/hal-01552739>`_
515 of the microscopic dynamic model to a macroscopic equilibrium, but this convergence could only be proved in rather simple cases. Even worse, there is
516 no known algorithm to efficiently compute a BMF!
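
For readers unfamiliar with the max-min fairness that ptasks try to extend, here is a minimal Python sketch of the textbook progressive-filling algorithm for single-kind flows. It is not SimGrid's LMM solver and it does not compute a BMF; the function name and the data layout are assumptions made for this illustration, and every flow is assumed to use at least one resource.

```python
def max_min_share(capacities, flows):
    """Textbook progressive filling (illustrative, not SimGrid's solver).

    capacities: {resource: capacity}
    flows:      {flow: set of resources it crosses} (each set non-empty)
    Returns {flow: allocated rate}.
    """
    remaining = dict(capacities)
    active = {flow: set(res) for flow, res in flows.items()}
    rates = {}
    while active:
        # Fair share that each still-used resource can offer to its users.
        shares = {}
        for res, cap in remaining.items():
            users = [f for f, used in active.items() if res in used]
            if users:
                shares[res] = cap / len(users)
        # The most constrained resource is the current bottleneck.
        bottleneck = min(shares, key=shares.get)
        share = shares[bottleneck]
        # Freeze every flow crossing the bottleneck at that share and
        # subtract its consumption from all the resources it uses.
        for flow in [f for f, used in active.items() if bottleneck in used]:
            rates[flow] = share
            for res in active[flow]:
                remaining[res] -= share
            del active[flow]
    return rates


rates = max_min_share(
    capacities={"A": 10.0, "B": 4.0},
    flows={"f1": {"A"}, "f2": {"A", "B"}, "f3": {"B"}},
)
print(rates)
```

In this example, link ``B`` is the bottleneck: ``f2`` and ``f3`` each get 2, and ``f1`` then takes the 8 left on ``A``. The difficulty discussed above is that a ptask consumes several resource kinds at once (flops and bytes), so "the share of a flow" is no longer a single number with an obvious fair value.
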
518 L07 should still be avoided as we have exhibited simple scenarios where its solution is unrelated to the BMF one (that is mathematically sound). This
519 release thus introduces a new BMF model to finally unify both classical and parallel tasks, but this is still ongoing work. The implemented
520 heuristic works very well for most SimGrid tests, but we have found some (not so prevalent) corner cases where our code fails to solve the sharing
521 problem in over 10 minutes... So this all should still be considered an ongoing research effort. We expect to have a better understanding of this issue
524 On a related topic, this release introduces :cpp:func:`simgrid::s4u::this_actor::thread_execute`, which allows creating a computation that comprises
525 several threads, and thus capable of utilizing more cores than a classical :cpp:func:`simgrid::s4u::this_actor::execute` action. The goal is to make
526 it straightforward to model multithreaded computational kernels, and it comes with an illustrating example. It can be seen as a simplified ptask, but
527 since it does not mix bytes and flops and has a homogeneous consumption over a single CPU, it perfectly fits with the classical SimGrid model.
529 This release also brings steady progress **on the bindings front**, introducing in particular Mutex, Barrier and Semaphore to your Python scripts.
531 Version 3.32 (October 3. 2022)
532 ------------------------------
534 The Wiedervereinigung release. Germany was reunited 32 years ago.
536 This release introduces tons of bug fixes overall, and many small usability improvements contributed by the community.
538 **On the bindings front**, we further completed the Python bindings: the whole C++ API of Comms is now accessible (and exemplified) in Python, while a
539 few missing functions have been added to Engine and Mailboxes. It is also possible to manipulate ptasks from Python.
541 The Python platform generation has also been improved. In particular, user errors should now raise an exception instead of killing the interpreter.
542 Various small improvements have been made to the graphicator tool so that you can now use Jupyter to generate your platforms semi-interactively.
544 **On the model checking front**, we did many refactoring operations behind the scenes (the deprecated ``mc::api`` namespace was for example emptied and removed),
545 but there are almost no user-level changes. The internal work is twofold.
547 First, we'd like to make optional all the complexity that liveness properties require to explore the application state (dwarf, libunwind, mmalloc,
548 etc) and instead only rely on fork to explore all the executions when liveness is not used. This would allow us to run the verified application under valgrind to
549 ease its debugging. Some progress was made towards that goal, but we are still rather far from it.
551 Second, we'd like to simplify the protocol between the model-checker and the application, to make it more robust and hopefully simplify the
552 model-checker code. After release v3.31, the model-checker can properly observe the simcall of a given actor through the protocol instead of reading
553 the application memory directly, but retrieving the list of actors still requires reading the remote memory, which in turn requires the aforementioned tricks on state
554 introspection that we are trying to remove. This goal is much harder to achieve than it may sound in the current code base, but we
555 note steady improvements in that direction.
557 In addition to these refactorings, this version introduces ``sthread``, a tool to intercept pthread operations at run time. The goal is to use it
558 together with the model-checker, but it's not working yet: we get a segfault during the initialization phase, and we failed to debug it so far. If
559 only we could use valgrind on the verified application, this would probably be much easier.
561 But we feel that it's probably better to not delay this release any further, as this tangled web will probably take time to get solved. So ``sthread``
562 is included in the source even if it's not usable in MC mode yet.
564 **On the interface front**, small API fixes and improvements have been done in S4U (in particular about virtual machines), while the support for MPI
565 IO has been improved in SMPI. We also hope that ``sthread`` will help simulating OpenMP applications at some point, but it's not usable for that either.
566 Hopefully in the next release.
568 Finally, this release mostly entails maintenance work **on the model front**: a bug was fixed when using ptasks on multicore hosts, and the legacy
569 stochastic generator of external load has been reintroduced.
571 Version 3.33 (never released)
572 -----------------------------
574 This version was overdue for more than 6 months, so it was skipped to not hinder our process of deprecating old code.
576 Version 3.34 (June 26. 2023)
577 ----------------------------
579 **On the maintenance front,** we removed the ancient MSG interface whose end-of-life was scheduled for 2020, the Java bindings
580 that were MSG-only, support for native builds on Windows (WSL is now required) and support for 32-bit platforms. Keeping SimGrid
581 alive while adding new features requires removing old, unused stuff. The very rare users impacted by these removals are urged to
582 move to the new API and systems.
584 We also conducted many internal refactorings to remove any occurrence of "surf" and "simix". SimGrid v3.12 used a layered design
585 where simix was providing synchronizations to actors, on top of surf which was computing the models. These features are now
586 provided in modules, not layers. Surf became the kernel::{lmm, resource, routing, timer, xml} modules while simix became
587 the kernel::{activity, actor, context} modules.
589 **On the model front,** we realized an idea that has been on the back of our minds for quite some time. The question
590 was: could we use something in the line of the ptask model, that mixes computations and network transfers in a single
591 fluid activity, to simulate a *fluid I/O stream activity* that would consume both disk and network resources? This
592 remained an open question for years, mainly because the implementation of the ptask does not rely on the LMM solver as
593 the other models do. The *fair bottleneck* solver is convenient but rests on less solid theoretical bases, and the
594 development of its replacement (the *bmf solver*) is still ongoing. However, this combination of I/Os and
595 communications seemed easier as these activities share the same unit (bytes).
597 After a few attempts, we opted for a simple, slightly imperfect, yet convenient way to implement such I/O streams at the
598 kernel level. It doesn't require a new model, just that the default HostModels implement a new function which creates a
599 classical NetworkAction, but adds some I/O-related constraints to it. A couple of little hacks here and there, and done! A single
600 activity mixing I/Os and communications can be created whose progress is limited by the resource (Disk or Link) of least
601 bandwidth value. As a result, a new :cpp:func:`Io::streamto()` function has been added to send data between arbitrary disks or
602 hosts. The user can specify a ``src_disk`` on a ``src_host`` and a ``dst_disk`` on a ``dst_host`` to stream data of a
603 given ``size``. Note that disks are optional, allowing users to simulate some kind of "disk-to-memory" or "memory-to-disk" I/O
604 streams. It's highly inspired by the existing :cpp:func:`Comm::sendto` that can be used to send data between arbitrary hosts.
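
The "limited by the resource of least bandwidth" rule can be illustrated with a back-of-the-envelope Python sketch. This is not the SimGrid API: the function and parameter names below are hypothetical, and the sketch deliberately ignores latency and any sharing with concurrent activities, which the real kernel does account for.

```python
def stream_duration(size, link_bw, src_disk_read_bw=None, dst_disk_write_bw=None):
    """Estimate the duration of one isolated I/O stream (illustrative only).

    size:               bytes to stream
    link_bw:            bandwidth of the network route, in bytes/s
    src_disk_read_bw:   read bandwidth of the source disk, or None if the
                        data comes from memory
    dst_disk_write_bw:  write bandwidth of the destination disk, or None if
                        the data lands in memory
    """
    # Collect the bandwidth of every resource the stream actually crosses.
    bandwidths = [link_bw]
    if src_disk_read_bw is not None:
        bandwidths.append(src_disk_read_bw)
    if dst_disk_write_bw is not None:
        bandwidths.append(dst_disk_write_bw)
    # The slowest resource bounds the progress of the whole stream.
    return size / min(bandwidths)


# Disk-to-disk: the destination disk (80 MB/s) is the bottleneck.
disk_to_disk = stream_duration(1e9, link_bw=1.25e8,
                               src_disk_read_bw=2e8, dst_disk_write_bw=8e7)
# Disk-to-memory: without a destination disk, the link becomes the bottleneck.
disk_to_memory = stream_duration(1e9, link_bw=1.25e8, src_disk_read_bw=2e8)
print(disk_to_disk, disk_to_memory)
```

Leaving a disk out, as in the second call, corresponds to the "disk-to-memory" and "memory-to-disk" streams mentioned above.
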
606 We also modified the Wi-Fi model so that the total capacity of a link depends on the number of flows on that link, according to
607 the result of some ns-3 experiments. This model can be more accurate for congested Wi-Fi links, but its calibration is more
608 demanding, as shown in the `example
609 <https://framagit.org/simgrid/simgrid/tree/master/teshsuite/models/wifi_usage_decay/wifi_usage_decay.cpp>`_ and in the `research
610 paper <https://hal.archives-ouvertes.fr/hal-03777726>`_.
612 We also worked on the usability of our models, by actually writing the long overdue documentation of our TCP models and by renaming
613 some options for clarity (old names are still accepted as aliases). A new function ``s4u::Engine::flatify_platform()`` dumps an
614 XML representation that is inefficient (all zones are flatified) but easier to read (routes are explicitly defined). You should
615 not use the output as a regular input file, but it will prove useful to double-check your platform.
617 **On the interface front**, some functions were deprecated and will be removed in 4 versions, while some old deprecated functions
618 were removed in this version, as usual.
620 Expressing your application as a DAG or a workflow is even more integrated than before. We added a new tutorial on simulating
621 DAGs and a DAG loader for workflows using the `wfcommons formalism <https://wfcommons.org/>`_. Starting an activity is now
622 properly delayed until after all its dependencies are fulfilled. We also added a notion of :ref:`Task <API_s4u_Tasks>`, a sort
623 of activity that can be fired several times. It's very useful to represent complex workflows. We added an ``on_this`` variant of
624 :ref:`every signal <s4u_API_signals>`, to react to the signals emitted by one object instance only. This is sometimes easier than
625 reacting to every signal of a class, and then filtering on the object you want. Activity signals (veto, suspend, resume,
626 completion) are now specialized by activity class. That is, callbacks registered in Exec::on_suspend_cb will not be fired for
629 Three new useful plugins were added: The :ref:`battery plugin<plugin_battery>` can be used to create batteries that get discharged
630 by the energy consumption of a given host, the :ref:`solar panel plugin <plugin_solar_panel>` can be used to create
631 solar panels whose energy production depends on the solar irradiance, and the :ref:`chiller plugin <plugin_chiller>` can be used to
632 create chillers and compensate the heat generated by hosts. These plugins could probably be better integrated
633 in the framework, but our goal is to include in SimGrid the building blocks upon which everybody would agree, while the model
634 elements that are more arguable are provided as plugins, in the hope that the users will carefully assess the plugins and adapt
635 them to their specific needs before usage. Here for example, there are several models of batteries (the one provided does not
636 take aging into account), and it would not be suited to every study.
638 It is now easy to mix S4U actors and SMPI applications, or even to start more than one MPI application in a given simulation
639 with the :ref:`SMPI_app_instance_start() <SMPI_mix_s4u>` function.
641 **On the model checking front**, this release brings a huge load of good improvements. First, we finished the long refactoring
642 so that the model-checker only reads the memory of the application for state equality (used for liveness checking) and for
643 :ref:`stateful checking <cfg=model-check/checkpoint>`. Instead, the network protocol is used to retrieve the information and the
644 application is simply forked to explore new execution branches. The code is now easier to read and to understand. Even better,
645 the verification of safety properties is now enabled by default on every platform since it does not depend on advanced OS
646 mechanisms anymore. You can even run the verified application in valgrind in that case. On the other hand, liveness checking
647 still needs to be enabled at compile time if you need it. To be honest, this part of the framework is not very well maintained nowadays.
648 We should introduce more testing of the liveness verification at some point to fix this situation.
650 Back to safety verification, we fixed a bug in the DPOR reduction which caused some failures to be missed by the
651 exploration, but this somewhat hinders the reduction quality (as we don't miss branches anymore). Some scenarios which could be
652 exhaustively explored earlier (with our buggy algorithm) are now too large for our (correct) exploration algorithm. But that's
653 not a problem because we implemented several mechanisms to improve the performance of the verification. First, we implemented
654 source sets in DPOR, to blacklist transitions that are redundant with previously explored ones. Then, we implemented several new
655 DPOR variants. SDPOR and ODPOR are very efficient algorithms described in the paper "Source Sets: A Foundation for Optimal
656 Dynamic Partial Order Reduction" by Abdulla et al. in 2017. We also have an experimental implementation of UDPOR, described in
657 the paper "Unfolding-based Partial Order Reduction" by Rodriguez et al. in 2015, but it's not completely functional yet. We hope
658 to finish it for the next release. And finally, we implemented a guiding mechanism trying to converge faster toward the bugs in
659 the reduced state space. We have some naive heuristics, and we hope to provide better ones in the next release.
661 We also extended the sthread module, which intercepts simple code that uses pthread mutexes and semaphores to simulate and
662 verify it. You do not even need to recompile your code, as it uses LD_PRELOAD to intercept the target functions. This module
663 is still rather young, but it could probably be useful already, e.g. to verify the code written by students in a class on UNIX
664 IPC and synchronization. Check `the examples <https://framagit.org/simgrid/simgrid/tree/master/examples/sthread>`_. In addition,
665 sthread can now also check concurrent accesses to a given collection, loosely inspired by `this paper
666 <https://www.microsoft.com/en-us/research/publication/efficient-and-scalable-thread-safety-violation-detection-finding-thousands-of-concurrency-bugs-during-testing>`_.
667 This feature is not very usable yet, as you have to manually annotate your code, but we hope to improve it in the near future.
671 **On the interface front**, we introduced a new MessageQueue abstraction and associated Mess simulated object. The behavior of a
672 MessageQueue is similar to that of a Mailbox, but intended for control messages that do not incur any simulated cost.
673 Information is automagically transported over thin air between producer and consumer. Internally, the implementation is very
674 similar to Mailboxes and Comms, only simpler. The motivation for this new abstraction came from a scalability issue observed in
675 the WRENCH framework, which is heavily based on control messages. When the simulated size of these messages is set to 0, it creates
676 very short lived network actions (i.e., lasting for only the route latency) that tend to overwhelm the LMM. Switching from Mailbox
677 to MessageQueue for such information exchange avoids this problem and greatly improves the scalability of WRENCH-based simulators.