Augustin Degomme [Tue, 8 Jun 2021 09:19:19 +0000 (09:19 +0000)]
Merge branch 'factor_in_actions' into 'master'
New implementation for bandwidth factors
See merge request simgrid/simgrid!64
Augustin Degomme [Tue, 8 Jun 2021 07:26:04 +0000 (09:26 +0200)]
reduce number of iterations to speedup rma test
Augustin Degomme [Mon, 7 Jun 2021 19:12:53 +0000 (21:12 +0200)]
protect type_creation routines against null pointers in output types.
Augustin Degomme [Mon, 7 Jun 2021 18:57:48 +0000 (20:57 +0200)]
catch if MPI_Win_fence was only called once (not enough) when MPI_Win_free is called.
Augustin Degomme [Mon, 7 Jun 2021 18:56:32 +0000 (20:56 +0200)]
better check for mpi_datatype_null
Augustin Degomme [Mon, 7 Jun 2021 15:12:45 +0000 (17:12 +0200)]
check that we are not using RMA-reserved MPI_Op in non-RMA calls.
Augustin Degomme [Mon, 7 Jun 2021 14:32:21 +0000 (16:32 +0200)]
get_accumulate: if MPI_NO_OP is specified, origin* inputs are irrelevant
+ activate test.
Bruno Donassolo [Mon, 7 Jun 2021 15:15:16 +0000 (17:15 +0200)]
Try to avoid another ifdef WIN32.
Use unique_ptr to manage handle. Thanks @agiersch
Bruno Donassolo [Mon, 7 Jun 2021 14:03:21 +0000 (16:03 +0200)]
As always, forgot windows build
Bruno Donassolo [Mon, 7 Jun 2021 10:49:10 +0000 (12:49 +0200)]
Move load_platf to EngineImpl
Keep the platform library open until the end of the simulation in case the
user is using some network callback defined in it.
Bruno Donassolo [Mon, 7 Jun 2021 08:08:30 +0000 (10:08 +0200)]
Please sonar
Arnaud Giersch [Sun, 6 Jun 2021 20:40:42 +0000 (22:40 +0200)]
Destroy dead actors after mc::replay() is completed (fix memory leak).
Arnaud Giersch [Sun, 6 Jun 2021 10:03:26 +0000 (12:03 +0200)]
Correctly remember buffer between persistent communications.
Fixes lots of Petsc tests, especially vec/is/sf/tests/ex14.c.
The buffer was lost after the first communication, and no more data could
be transferred effectively.
Arnaud Giersch [Fri, 4 Jun 2021 19:47:14 +0000 (21:47 +0200)]
Call cleanup_attr<Comm> before marking Comm as deleted.
The MPI_Comm may be used by the attr cleanup callbacks.
Arnaud Giersch [Fri, 4 Jun 2021 19:34:02 +0000 (21:34 +0200)]
Remove a global variable, and use a static to remember if smpi_main is running.
The value of 'running_with_smpi_main' is effectively used later, when the
simgrid::config callback is executed.
Arnaud Giersch [Fri, 4 Jun 2021 15:45:07 +0000 (17:45 +0200)]
Restore public smpi_init_options().
It was wrongly removed in commit
6a046487fb Make smpi_switch_data_segment check if a switch is needed, and return true when it occurs.
Bruno Donassolo [Fri, 4 Jun 2021 14:37:13 +0000 (16:37 +0200)]
Fix test on macOS
Bruno Donassolo [Fri, 4 Jun 2021 12:17:04 +0000 (14:17 +0200)]
Try to fix test on CI
Bruno Donassolo [Thu, 3 Jun 2021 13:59:50 +0000 (15:59 +0200)]
An example with SMPI and CPP platform
Shows how to use smpirun to execute an application with the platform described in C++.
Bruno Donassolo [Thu, 3 Jun 2021 13:59:27 +0000 (15:59 +0200)]
Minor fix usage
Bruno Donassolo [Thu, 3 Jun 2021 12:40:31 +0000 (14:40 +0200)]
Adjust test
Empty hostfiles are now checked inside the C++ code, not in smpirun
Bruno Donassolo [Thu, 3 Jun 2021 12:40:06 +0000 (14:40 +0200)]
Adjust test
Nodes chosen to run the test aren't the same anymore.
Bruno Donassolo [Thu, 3 Jun 2021 09:23:45 +0000 (11:23 +0200)]
Minor fix in test.
Just messages order has changed.
Bruno Donassolo [Wed, 2 Jun 2021 18:11:09 +0000 (20:11 +0200)]
Moving SMPI app deployment to C++ code
Enable the deployment of SMPI experiments with C++ platform description.
Move application deployment from smpirun to smpi_main function.
The smpirun script used to parse the platform XML to create an
application deployment. This isn't possible anymore since we may not
have a platform XML at all.
Move the necessary input to smpi_main through specific cfg variables:
- smpi/hostfile: host file
- smpi/replay: replay file
- smpi/np: number of processes
- smpi/map: mapping process/rank
These cfg variables aren't used by users; they are cached inside the smpirun script.
Bruno Donassolo [Fri, 23 Apr 2021 16:56:43 +0000 (18:56 +0200)]
New platform example: StarZone of StarZone
Re-implements the griffon.xml using the C++ interface.
Simplify the implementation of homogeneous clusters organized in
cabinets.
Add tests in teshsuite to use both files.
Bruno Donassolo [Fri, 16 Apr 2021 19:12:18 +0000 (21:12 +0200)]
Fix build mac/windows
Bruno Donassolo [Thu, 15 Apr 2021 12:42:50 +0000 (14:42 +0200)]
Change C++ platform example
Remove small_platform.cpp.
Add a more programmatic platform using the StarZone.
Bruno Donassolo [Wed, 14 Apr 2021 18:31:46 +0000 (20:31 +0200)]
Try to fix python build
Add dependency for dl library.
Bruno Donassolo [Mon, 15 Mar 2021 17:45:25 +0000 (18:45 +0100)]
Runs examples with C++ platform description
*Loading platform
- Generates library files for C++ platforms. They can be loaded by the
engine using the same load_platform method.
- The Engine::load_platform will verify if the extension is .so, it'll
open the file using dlopen and search for the load_platform symbol.
- The platform so must contain a load_platform function that will be
called by the engine to generate the platform properly.
*Implementing an example
- Added a CMakeLists.txt in examples/platform to generate the .so for
each example.
- Pass to the tesh files a new variable "libdir" containing the
directory where the libraries are located.
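The loading steps described above can be sketched roughly as follows. This is a minimal illustration and not the SimGrid implementation: the helper names (has_so_extension, open_platform_library), the DlCloser deleter and the no-argument PlatformEntry signature are assumptions made for the example. Only the ".so" extension check, the use of dlopen and the lookup of a load_platform symbol come from the commit message. Holding the handle in a unique_ptr means the library is closed automatically once the owner goes out of scope.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <memory>
#include <string>

// Deleter closing a dlopen() handle (hypothetical helper).
struct DlCloser {
  void operator()(void* handle) const
  {
    if (handle != nullptr)
      dlclose(handle);
  }
};
using LibHandle = std::unique_ptr<void, DlCloser>;

// Assumed entry-point signature; the real one may take arguments.
using PlatformEntry = void (*)();

// True when the path ends with ".so".
bool has_so_extension(const std::string& path)
{
  const std::string ext = ".so";
  return path.size() >= ext.size() && path.compare(path.size() - ext.size(), ext.size(), ext) == 0;
}

// Opens the library and looks up "load_platform".
// Returns an empty handle (and a null entry) on any failure.
LibHandle open_platform_library(const std::string& path, PlatformEntry& entry)
{
  entry = nullptr;
  if (not has_so_extension(path))
    return nullptr; // not a shared library: the caller would fall back to XML parsing

  LibHandle handle(dlopen(path.c_str(), RTLD_LAZY));
  if (not handle)
    return nullptr; // dlerror() would describe the problem

  entry = reinterpret_cast<PlatformEntry>(dlsym(handle.get(), "load_platform"));
  if (entry == nullptr)
    return nullptr; // symbol missing: the handle is closed by its destructor

  return handle; // keep this alive for as long as code from the library may be called
}
```

A caller would keep the returned handle for the whole simulation and invoke entry() once to build the platform. On older glibc versions this needs to be linked with -ldl.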
Arnaud Giersch [Fri, 4 Jun 2021 08:16:08 +0000 (10:16 +0200)]
[sonar] Replace redundant type with 'auto'.
Arnaud Giersch [Fri, 4 Jun 2021 07:40:39 +0000 (09:40 +0200)]
[sonar] Redundant parentheses.
Arnaud Giersch [Fri, 4 Jun 2021 07:38:35 +0000 (09:38 +0200)]
[sonar] Pointer-to-const.
Arnaud Giersch [Thu, 3 Jun 2021 15:15:07 +0000 (17:15 +0200)]
Fix build with enable_smpi=OFF.
Arnaud Giersch [Thu, 3 Jun 2021 14:57:51 +0000 (16:57 +0200)]
Fix include.
Arnaud Giersch [Thu, 3 Jun 2021 13:34:01 +0000 (15:34 +0200)]
Ensure correct ordering of the accumulate requests.
Arnaud Giersch [Thu, 3 Jun 2021 11:57:25 +0000 (13:57 +0200)]
Useless test; TODO--.
Arnaud Giersch [Thu, 3 Jun 2021 11:07:08 +0000 (13:07 +0200)]
Initialize mmap-privatized segments earlier (before main).
Sometimes we may want to initialize a global before MPI_Init.
Arnaud Giersch [Thu, 3 Jun 2021 09:28:56 +0000 (11:28 +0200)]
Make smpi_switch_data_segment check if a switch is needed, and return true when it occurs.
Kill global SMPI_switch_data_segment.
Arnaud Giersch [Thu, 3 Jun 2021 07:38:57 +0000 (09:38 +0200)]
Use existing function (also empties requests_ after waitall).
Arnaud Giersch [Thu, 3 Jun 2021 07:37:39 +0000 (09:37 +0200)]
Improve debug messages and avoid calling finish_comms twice when for myself.
Arnaud Giersch [Thu, 3 Jun 2021 07:23:48 +0000 (09:23 +0200)]
Initialize variable.
Arnaud Giersch [Wed, 2 Jun 2021 15:38:03 +0000 (17:38 +0200)]
Use existing functions to finish comms (and fix Win::flush).
Arnaud Giersch [Wed, 2 Jun 2021 15:10:47 +0000 (17:10 +0200)]
Review usage of rank/rank_/rank() in smpi_win.
Arnaud Giersch [Wed, 2 Jun 2021 14:37:24 +0000 (16:37 +0200)]
Little simplifications in loops.
Arnaud Giersch [Wed, 2 Jun 2021 12:55:46 +0000 (14:55 +0200)]
Some int -> bool conversions (+ use of existing macro).
Arnaud Giersch [Wed, 2 Jun 2021 11:49:30 +0000 (13:49 +0200)]
Call rank() only once.
Arnaud Giersch [Wed, 2 Jun 2021 10:23:02 +0000 (12:23 +0200)]
Ooops, fmt is second arg.
Arnaud Giersch [Wed, 2 Jun 2021 09:09:14 +0000 (11:09 +0200)]
Prefer emplace_back.
Arnaud Giersch [Wed, 2 Jun 2021 09:05:48 +0000 (11:05 +0200)]
Get rid of "%s" in second argument of function xbt_str_parse_*.
Arnaud Giersch [Wed, 2 Jun 2021 08:45:13 +0000 (10:45 +0200)]
XBT_ATTRIB_PRINTF for vprintf-like functions.
Arnaud Giersch [Wed, 2 Jun 2021 08:15:28 +0000 (10:15 +0200)]
Define class SmpiBenchGuard, and use RAII to handle smpi_bench_end()/smpi_bench_begin().
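The RAII idea behind such a guard can be sketched as below. This is an illustration under assumptions, not the actual SmpiBenchGuard: the class name BenchGuardSketch and the fake_bench_end()/fake_bench_begin() stand-ins are hypothetical and only record the order of calls. It assumes the guard calls the "end" function on construction and the "begin" function on destruction, so that every return path, including error paths, restores benchmarking.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of calls, for illustration only.
std::vector<std::string>& call_log()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-ins for smpi_bench_end() and smpi_bench_begin().
void fake_bench_end() { call_log().push_back("bench_end"); }
void fake_bench_begin() { call_log().push_back("bench_begin"); }

// Stops benchmarking on construction, resumes it on destruction.
class BenchGuardSketch {
public:
  BenchGuardSketch() { fake_bench_end(); }
  ~BenchGuardSketch() { fake_bench_begin(); }
  BenchGuardSketch(const BenchGuardSketch&) = delete;
  BenchGuardSketch& operator=(const BenchGuardSketch&) = delete;
};

// An MPI-like entry point with an early return on error.
int mpi_like_call(bool invalid_argument)
{
  BenchGuardSketch guard;
  if (invalid_argument)
    return -1; // error path: the destructor still calls fake_bench_begin()
  call_log().push_back("work");
  return 0;
}
```

With this shape, adding a new early return cannot forget the matching "begin" call, which is the class of bug fixed by hand in the error paths.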
Arnaud Giersch [Tue, 1 Jun 2021 20:45:17 +0000 (22:45 +0200)]
Add missing calls to smpi_bench_begin() on error paths.
Bruno Donassolo [Tue, 1 Jun 2021 15:48:47 +0000 (17:48 +0200)]
Cannot set split-duplex through s4u intf.
This makes sense only in XML where it properly creates the
link-up/link-down.
Bruno Donassolo [Tue, 1 Jun 2021 15:48:18 +0000 (17:48 +0200)]
Add fg#71 to changelog [ci-skip]
Bruno Donassolo [Tue, 1 Jun 2021 09:12:51 +0000 (11:12 +0200)]
Update ChangeLog
Bruno Donassolo [Mon, 31 May 2021 12:52:56 +0000 (14:52 +0200)]
Adjust timing of SMPI tests
For sure SMPI is the most impacted by the bandwidth factors changes.
It seems especially impacted when collective comms are involved.
The old version used 2 factors for SMPI comms:
1) network/bandwidth-factor was used to reduce the link capacity (e.g.
0.97*C for LV08)
2) smpi/bw-factor was used for each communication, limiting the flow
capacity.
Now, the code is simplified, each communication has only 1 bw-factor
that is applied after it's done, at the update remaining phase.
In most cases, only the bw-factor is applied now, and it is applied after
the comm is done, not before.
Bruno Donassolo [Mon, 31 May 2021 08:51:47 +0000 (10:51 +0200)]
Fix link-load test
In the old version, our link capacities were 0.97*C; now they are just C.
So, more bytes can be transmitted through the links.
Bruno Donassolo [Mon, 31 May 2021 08:41:40 +0000 (10:41 +0200)]
Fix timing of Vivaldi/two_peers.xml tests
In this platform file, the communication is bounded by the big latency
between nodes (as a consequence of the distance between vivaldi
coordinates).
Therefore, we now apply the bandwidth factor (0.97) on top of this
time, slightly delaying the communications.
Bruno Donassolo [Mon, 31 May 2021 08:18:41 +0000 (10:18 +0200)]
Fix timing of Wi-Fi tests.
Their timings were calculated assuming that no factor was applied in
Wi-Fi communications.
This isn't the case anymore, since by default, we would apply the 0.97
factor from LV08 to these communications.
Setting CM02 as base network model since it doesn't apply any bandwidth
factor.
Bruno Donassolo [Fri, 28 May 2021 10:08:26 +0000 (12:08 +0200)]
Adjust dynamic network-factors test.
Improve test, including no crosstraffic config.
Bruno Donassolo [Thu, 27 May 2021 14:57:57 +0000 (16:57 +0200)]
New implementation for bandwidth factors
Bandwidth factors are now implemented at the Action level, reducing the
speed that an action advances (e.g. the number of bytes transmitted).
In the past, the factor took place at the maxmin system, limiting the
amount of resources a communication could use.
For example, a bw factor of 0.97 (default for LV08 network model) was
reflected by reducing the link capacity to 0.97*C. So, a 100MB/s link had
97MB/s capacity in the maxmin system.
Now, a communication alone using this link may use the full 100MB/s, but
after one second, it'll have transmitted only 97MB of data.
NOTE: This change may impact the timing of your experiments.
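The difference between the two schemes can be illustrated with a small arithmetic sketch. This is a simplified single-flow model, not SimGrid code: the function names are hypothetical, and "bound" is an assumed per-flow rate cap standing in for cases such as latency-bounded communications. For a lone flow limited only by the link, both schemes progress at 0.97*C. They differ when the flow is limited by something other than the link, because the factor is now applied on top of that limit.

```cpp
#include <algorithm>
#include <cassert>

// Old scheme (simplified): the factor shrinks the link capacity seen by the
// sharing system; a flow capped below that capacity is not slowed further.
double old_effective_rate(double capacity, double bound, double factor)
{
  assert(capacity > 0 && bound > 0 && factor > 0);
  return std::min(bound, factor * capacity);
}

// New scheme (simplified): sharing uses the full capacity, then the factor
// scales how fast the action actually advances.
double new_effective_rate(double capacity, double bound, double factor)
{
  assert(capacity > 0 && bound > 0 && factor > 0);
  return factor * std::min(bound, capacity);
}

// Time needed to move `bytes` at a constant effective rate.
double transfer_time(double bytes, double rate)
{
  assert(rate > 0);
  return bytes / rate;
}
```

With a 100MB/s link and a factor of 0.97, a flow with no tighter cap advances at about 97MB/s in both schemes. With a 50MB/s cap, the old scheme gives 50MB/s while the new one gives 48.5MB/s, which is the kind of small delay the test timing adjustments account for.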
Arnaud Giersch [Tue, 1 Jun 2021 13:51:01 +0000 (15:51 +0200)]
Missing include.
Arnaud Giersch [Tue, 1 Jun 2021 13:30:44 +0000 (15:30 +0200)]
Coding style: no global "using namespace".
Arnaud Giersch [Tue, 1 Jun 2021 13:26:10 +0000 (15:26 +0200)]
Prefer std algorithms.
Arnaud Giersch [Tue, 1 Jun 2021 12:53:37 +0000 (14:53 +0200)]
Invert tests to reduce depth of nesting.
Arnaud Giersch [Tue, 1 Jun 2021 12:50:07 +0000 (14:50 +0200)]
Useless tests for emptiness.
Arnaud Giersch [Tue, 1 Jun 2021 10:59:34 +0000 (12:59 +0200)]
Add attribute(printf) to xbt::string_printf.
Arnaud Giersch [Tue, 1 Jun 2021 07:38:57 +0000 (09:38 +0200)]
Pointer-to-const for Sonar.
Arnaud Giersch [Mon, 31 May 2021 21:20:24 +0000 (23:20 +0200)]
Set shared variable *before* mutex unlock.
Arnaud Giersch [Mon, 31 May 2021 21:01:42 +0000 (23:01 +0200)]
Die on unwanted function calls.
Arnaud Giersch [Mon, 31 May 2021 20:53:55 +0000 (22:53 +0200)]
Parameter 'assert' is a bit field.
Bruno Donassolo [Mon, 31 May 2021 17:22:28 +0000 (19:22 +0200)]
Fixes in UTs
Arnaud Giersch [Mon, 31 May 2021 13:40:40 +0000 (15:40 +0200)]
Handle case where different groups are given to MPI_Win_start and MPI_Win_post on the same process.
Arnaud Giersch [Mon, 31 May 2021 13:14:10 +0000 (15:14 +0200)]
Handle duplicated datatypes within predefined MPI_Op.
Martin Quinson [Mon, 31 May 2021 10:33:06 +0000 (12:33 +0200)]
Docker: try to get apt update to run despite the cache
Arnaud Giersch [Mon, 31 May 2021 09:32:23 +0000 (11:32 +0200)]
A few more 'const'.
Arnaud Giersch [Mon, 31 May 2021 09:28:24 +0000 (11:28 +0200)]
Replace redundant type with 'auto'.
Arnaud Giersch [Mon, 31 May 2021 09:24:54 +0000 (11:24 +0200)]
Remove useless temporary shadowing outer variable.
Arnaud Giersch [Mon, 31 May 2021 08:22:02 +0000 (10:22 +0200)]
Add some 'const' qualifiers.
This started with NetPoint::get_englobing_zone() and propagated quickly...
Arnaud Giersch [Mon, 31 May 2021 07:44:22 +0000 (09:44 +0200)]
Unused exception parameter 'e'.
Arnaud Giersch [Mon, 31 May 2021 07:40:15 +0000 (09:40 +0200)]
Explicitly delete the copy constructor and copy assignment operator (enforce rule-of-five).
Arnaud Giersch [Sat, 29 May 2021 12:25:53 +0000 (14:25 +0200)]
No valgrind leak check for issue71.
Millian Poquet [Fri, 28 May 2021 21:03:01 +0000 (23:03 +0200)]
xbt_replay: rethrow exception instead of xbt_die
Bruno Donassolo [Fri, 28 May 2021 19:16:55 +0000 (21:16 +0200)]
Remove sg_bandwidth_factor from disks
No need for bw_factor here
Lucas M. Schnorr [Fri, 28 May 2021 19:10:50 +0000 (16:10 -0300)]
remove a space to force update
Bruno Donassolo [Fri, 28 May 2021 17:49:45 +0000 (19:49 +0200)]
Issue#71: add check in add_route for gw_src/gw_dst
When adding a NetZoneRoute, check if gw_src and gw_dst belong to their
respective netzones.
We need to recursively search for the netpoint inside the netzone since
the netpoint can be from one of its children.
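The recursive check can be sketched as follows. This is a minimal illustration on a hypothetical ZoneSketch structure, not the SimGrid NetZone classes: netpoints are represented by plain names, and the function names are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a netzone: its own netpoints plus child zones.
struct ZoneSketch {
  std::string name;
  std::vector<std::string> netpoints;
  std::vector<ZoneSketch> children;
};

// True if the netpoint belongs to this zone or to any of its descendants.
bool contains_netpoint(const ZoneSketch& zone, const std::string& netpoint)
{
  if (std::find(zone.netpoints.begin(), zone.netpoints.end(), netpoint) != zone.netpoints.end())
    return true;
  return std::any_of(zone.children.begin(), zone.children.end(),
                     [&netpoint](const ZoneSketch& child) { return contains_netpoint(child, netpoint); });
}

// Check performed before accepting a zone-to-zone route.
bool gateways_are_valid(const ZoneSketch& src, const std::string& gw_src, const ZoneSketch& dst,
                        const std::string& gw_dst)
{
  return contains_netpoint(src, gw_src) && contains_netpoint(dst, gw_dst);
}
```

A gateway declared in a nested cabinet is accepted for its enclosing zone, whereas a gateway that lives in an unrelated zone is rejected.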
Arnaud Giersch [Fri, 28 May 2021 15:16:30 +0000 (17:16 +0200)]
Fix refcount for Datatype_contents.
Error seen with Petsc test: vec_is_sf_tutorials-ex3_basic_dupped.
Arnaud Giersch [Fri, 28 May 2021 09:52:30 +0000 (11:52 +0200)]
Allow null ranks for MPI_Group_incl when n == 0.
Lucas M. Schnorr [Fri, 28 May 2021 14:48:12 +0000 (11:48 -0300)]
update draw_gantt.R script to work with pajengr
todo
- need to test with a recent version of S4U trace
Lucas M. Schnorr [Fri, 28 May 2021 14:41:29 +0000 (11:41 -0300)]
update pajengr installation procedure in S4U tutorial
Lucas M. Schnorr [Fri, 28 May 2021 14:36:59 +0000 (11:36 -0300)]
replace r-cran-dplyr by r-cran-devtools
Lucas M. Schnorr [Fri, 28 May 2021 14:35:35 +0000 (11:35 -0300)]
update tuto-smpi Dockerfile to contain r-cran-tidyverse (replacing included R packages)
Lucas M. Schnorr [Fri, 28 May 2021 14:26:40 +0000 (11:26 -0300)]
add additional necessary packages for pajengr installation
Lucas M. Schnorr [Fri, 28 May 2021 13:56:14 +0000 (10:56 -0300)]
update pajengr R code and figure
details:
- note the addition of library(tidyverse)
Augustin Degomme [Thu, 27 May 2021 14:55:48 +0000 (16:55 +0200)]
use new option in random test.
Augustin Degomme [Thu, 27 May 2021 14:47:48 +0000 (16:47 +0200)]
document new option
Augustin Degomme [Thu, 27 May 2021 13:40:44 +0000 (15:40 +0200)]
Add flag to provide an optional barrier in MPI_Finalize.
This is meant to help codes that aggressively clean up memory at finalization while other processes still have to use it.
For example when attributes are attached to a local communicator, in SMPI this communicator may be cleaned up by another process in the end, and the attribute has to still be valid at this point.
This was an issue with PETSC, for example.
Arnaud Giersch [Thu, 27 May 2021 09:04:25 +0000 (11:04 +0200)]
Use std::string for xbt_parse_units.
Combine parameters 'entity_kind' and 'name'.
Also rename surf_parse_* to xbt_parse_*.
Arnaud Giersch [Thu, 27 May 2021 08:35:46 +0000 (10:35 +0200)]
Deprecate SIMIX_get_clock().
Use simgrid_get_clock() or Engine::get_clock().
Arnaud Giersch [Thu, 27 May 2021 07:52:59 +0000 (09:52 +0200)]
Keyval should always exist.
Also please sonar by reducing depth for nested blocks.