Martin Quinson [Sat, 25 Mar 2023 11:17:28 +0000 (12:17 +0100)]
Invert another logic error: we need ptrace when we need mem_info, not the opposite
Martin Quinson [Sat, 25 Mar 2023 10:24:06 +0000 (11:24 +0100)]
This test is always false, as we asserted so just above
Martin Quinson [Sat, 25 Mar 2023 10:18:58 +0000 (11:18 +0100)]
Do not ask for memory info when restarting in refork mode
Martin Quinson [Fri, 24 Mar 2023 22:33:21 +0000 (23:33 +0100)]
Use a portable name for SIGABRT
Martin Quinson [Fri, 24 Mar 2023 22:32:59 +0000 (23:32 +0100)]
That test seems to pass nowadays
Martin Quinson [Fri, 24 Mar 2023 22:10:43 +0000 (23:10 +0100)]
Try to use the same test file for non-linux now that we don't use ptrace
Martin Quinson [Fri, 24 Mar 2023 22:06:32 +0000 (23:06 +0100)]
MC: disable personality() as it fails on CI and is not mandatory
Martin Quinson [Fri, 24 Mar 2023 21:06:47 +0000 (22:06 +0100)]
Revalidate tesh files now that safety checking is based on reforks
Martin Quinson [Fri, 24 Mar 2023 21:06:04 +0000 (22:06 +0100)]
Fix the refork feature by not ptracing App so that it dies properly
Martin Quinson [Fri, 24 Mar 2023 21:00:34 +0000 (22:00 +0100)]
More explicit error message
Martin Quinson [Thu, 23 Mar 2023 23:00:53 +0000 (00:00 +0100)]
Fix another sonar warning
Arnaud Giersch [Fri, 24 Mar 2023 13:54:41 +0000 (14:54 +0100)]
Delete redundant blank lines at the start of code blocks (CodeFactor).
mlaurent [Fri, 24 Mar 2023 15:23:47 +0000 (16:23 +0100)]
Add copy constructor to state, so we can backtrack different ways
Arnaud Giersch [Fri, 24 Mar 2023 13:02:57 +0000 (14:02 +0100)]
Strengthen debug messages on channel send/recv.
Arnaud Giersch [Thu, 23 Mar 2023 11:10:25 +0000 (12:10 +0100)]
Simplify member initialization.
Arnaud Giersch [Mon, 20 Mar 2023 10:31:24 +0000 (11:31 +0100)]
Fix test: program needs exactly 2 processes.
Arnaud Giersch [Thu, 16 Mar 2023 16:28:51 +0000 (17:28 +0100)]
Reduce scope for variable.
Arnaud Giersch [Mon, 20 Feb 2023 18:24:23 +0000 (19:24 +0100)]
Simplify loop.
Arnaud Giersch [Mon, 20 Feb 2023 18:19:19 +0000 (19:19 +0100)]
Determine n_transitions on receiving side (and remove it from the message).
Arnaud Giersch [Mon, 20 Feb 2023 18:15:17 +0000 (19:15 +0100)]
There's no need to compute the total transition count anymore.
Arnaud Giersch [Mon, 20 Feb 2023 18:07:25 +0000 (19:07 +0100)]
Remove superfluous test, and reduce depth of nested statements.
Arnaud Giersch [Mon, 20 Feb 2023 18:07:07 +0000 (19:07 +0100)]
Merge loops and get rid of the "probes" temporary vector.
mlaurent [Fri, 24 Mar 2023 09:16:12 +0000 (10:16 +0100)]
Merge branch 'master' of https://framagit.org/simgrid/simgrid
Martin Quinson [Thu, 23 Mar 2023 21:38:38 +0000 (22:38 +0100)]
Don't use handle_waitpid after we killed the App, as this function may report this as an error
Martin Quinson [Thu, 23 Mar 2023 20:58:46 +0000 (21:58 +0100)]
Actually, read()=0 is not an issue in the AppSide
it simply means that the checker closed the socket on its side, so we
should quit ASAP without complaining.
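
A minimal sketch of that rule, assuming a plain POSIX file descriptor. The
names `ReadOutcome` and `read_from_checker` are hypothetical and are not
SimGrid's actual API:

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical classification of one read attempt on the App side.
enum class ReadOutcome { Data, PeerClosed, Error };

// Read once from the descriptor connected to the checker.
// read(2) returning 0 means end-of-file: the peer closed its end,
// which is a normal reason to quit, not an error.
ReadOutcome read_from_checker(int fd, char* buffer, std::size_t size, ssize_t& received)
{
  do {
    received = read(fd, buffer, size);
  } while (received == -1 && errno == EINTR); // retry if interrupted by a signal

  if (received > 0)
    return ReadOutcome::Data;
  if (received == 0)
    return ReadOutcome::PeerClosed; // quit quietly
  return ReadOutcome::Error;        // genuine failure, worth reporting
}
```

The caller would exit without complaining on `PeerClosed` and only log on `Error`.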
mlaurent [Thu, 23 Mar 2023 08:58:49 +0000 (09:58 +0100)]
Merge branch 'master' of https://framagit.org/simgrid/simgrid
mlaurent [Thu, 23 Mar 2023 08:50:33 +0000 (09:50 +0100)]
try to fix stack handling
Martin Quinson [Wed, 22 Mar 2023 22:04:28 +0000 (23:04 +0100)]
Fix some easy sonar smells
const-ness, try_emplace, attribute noreturn, ...
The most important one is the TransitionObjectAccess one, where a
field in the subclass was hiding a field of the same name in the
superclass. Maybe the bug I was experiencing in that area was related.
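
For illustration, this kind of field hiding can be reproduced in a few lines.
The class and field names below are hypothetical, not the actual
TransitionObjectAccess code:

```cpp
#include <string>

// Hypothetical base class: its own code reads the base field.
struct BaseTransition {
  int object_id = -1;
  std::string describe() const { return "object " + std::to_string(object_id); }
};

// Buggy variant: redeclares a field with the same name, hiding the base one.
struct ShadowedAccess : BaseTransition {
  int object_id = -1;                                      // hides BaseTransition::object_id
  explicit ShadowedAccess(int id) { object_id = id; }      // only sets the subclass field
};

// Fixed variant: no redeclaration, so there is a single field.
struct FixedAccess : BaseTransition {
  explicit FixedAccess(int id) { object_id = id; }         // sets the inherited field
};
```

With the shadowed field, `describe()` still sees the base value, which is the
sort of silent mismatch that can look like an unrelated bug.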
Martin Quinson [Wed, 22 Mar 2023 21:35:16 +0000 (22:35 +0100)]
Fix two sonar bugs
- don't throw from the destructor
- don't slice objects (downcast objects instead of downcasting
references). I suspect that this one is a false positive but the
tests still pass this way so let's go.
Martin Quinson [Wed, 22 Mar 2023 20:57:01 +0000 (21:57 +0100)]
Change an example to take its platform file from the command line if provided
Martin Quinson [Wed, 22 Mar 2023 20:43:42 +0000 (20:43 +0000)]
Merge branch 'udpor-phase6' into 'master'
Phase 6 of UDPOR Integration: Add `K`-partial alternatives computation + clean up phase
See merge request simgrid/simgrid!139
Martin Quinson [Wed, 22 Mar 2023 08:17:12 +0000 (09:17 +0100)]
jenkins::Flags if you run the tests, be verbose on errors [no-ci]
Martin Quinson [Tue, 21 Mar 2023 20:48:55 +0000 (21:48 +0100)]
Make sure that the dtor of CheckerSide actually kills the application and waits for it
Martin Quinson [Tue, 21 Mar 2023 20:06:51 +0000 (21:06 +0100)]
Fix the liveness tests when the reforks are compiled in but not activated
mlaurent [Tue, 21 Mar 2023 10:55:16 +0000 (11:55 +0100)]
fix order of execute_next calls
Martin Quinson [Mon, 20 Mar 2023 22:44:59 +0000 (23:44 +0100)]
Manually handle the memory associated to the libevent events
The previous version with std::unique_ptr resulted in segfault errors
reported by valgrind, and I failed to get it right.
Martin Quinson [Mon, 20 Mar 2023 22:27:39 +0000 (23:27 +0100)]
Do not initialize the App's memory introspection if it's not needed
Reforks are still not activated in this code, as the DFS constructor
pretends that it needs memory introspection when it does not. The
version activating reforks is currently commented here, if you want to
play with it.
Things seem more or less working with this change. Known issues:
- liveness checking is killed by an out-of-bounds access to a vector
while handling the property automaton. This is the case even when
reforks are not activated, making this change improper for the
master branch.
- The checker is not very good at killing the application in refork
mode, and many processes remain around until after they are
abandoned by their checker.
I'm not sure whether they only consume memory or whether they
also burn the CPU in an active loop. In both cases, this is ...
suboptimal.
This point is OK when not activating reforks.
- valgrind reports some sort of double free on the libevent's events.
I fail to get the std::unique_ptr thing right. See next commit.
Martin Quinson [Mon, 20 Mar 2023 16:09:17 +0000 (17:09 +0100)]
Put everything in position to re-fork the verified App
If you pass "need_memory_introspection = false" to the Exploration
constructor, then the application is re-forked systematically instead
of taking snapshots that are then restored.
But it's still in progress, in the sense that the memory is still
introspected even if we don't need it. The network protocol still
needs to be changed so that the memory info is requested only if
"need_memory_introspection = true" and not otherwise.
For the time being, using reforks is very memory intensive for some
reason, and my computer is brought to its knees when running the tests,
until the OOM killer saves me by cleaning stuff.
mlaurent [Mon, 20 Mar 2023 15:52:48 +0000 (16:52 +0100)]
Merge branch 'master' of https://framagit.org/simgrid/simgrid
mlaurent [Mon, 20 Mar 2023 15:52:22 +0000 (16:52 +0100)]
add wait guide and modification for the heuristic computation
Maxwell Pirtle [Mon, 20 Mar 2023 08:41:58 +0000 (09:41 +0100)]
Address minor comments in MR review
Maxwell Pirtle [Mon, 20 Mar 2023 08:38:12 +0000 (09:38 +0100)]
Remove empty Comb.cpp
Maxwell Pirtle [Mon, 20 Mar 2023 08:33:34 +0000 (09:33 +0100)]
Add documentation for Comb data structure
The Comb data structure lacked documentation
prior to this commit. Since it plays a pretty
important role in the computation of `K`-partial
alternatives, it was given a more extensive
explanation to ensure that it was used properly.
Martin Quinson [Sun, 19 Mar 2023 20:15:10 +0000 (21:15 +0100)]
cosmetics
Martin Quinson [Sun, 19 Mar 2023 20:07:00 +0000 (21:07 +0100)]
Move more of the CheckerSide creation logic to the object constructor
Martin Quinson [Sun, 19 Mar 2023 18:45:39 +0000 (19:45 +0100)]
MC: disable Address Space Layout Randomization in the application
This will allow re-forking the application on restore without
invalidating all the metadata we accumulated in the previous
exploration traces.
Martin Quinson [Sun, 19 Mar 2023 15:18:05 +0000 (16:18 +0100)]
Defer the creation of the RemoteProcessMemory to when we have enough information
Martin Quinson [Sun, 19 Mar 2023 14:57:19 +0000 (15:57 +0100)]
Better split of responsibilities between CheckerSide and RemoteProcessMemory
Martin Quinson [Sun, 19 Mar 2023 14:20:23 +0000 (15:20 +0100)]
Move methods not related to Memory out of RemoteProcessMemory
Now that ModelChecker is gone, it's time to move to the next step of cleanup.
The goal is that CheckerSide is in charge of the interaction with the
application process and RemoteProcessMemory is in charge of its memory.
Right now, RemoteProcessMemory does a bit more, as it stores the pid
and whether or not the application process is running.
This is bad because we want to make RemoteProcessMemory optional, only
used when we need to introspect the application memory (liveness
checking, non-progression checking, etc), so that we can run the app
in valgrind when we don't need to introspect its memory (safety
checking without non-progression checking).
I know I just moved these chunks of code from ModelChecker to
RemoteProcessMemory to now move it further, and I'm sorry for the
noise, but this code drives me nuts and I need to clean it step by step.
Martin Quinson [Sun, 19 Mar 2023 14:01:44 +0000 (15:01 +0100)]
Finally kill the now empty ModelChecker class
Martin Quinson [Sun, 19 Mar 2023 13:50:37 +0000 (14:50 +0100)]
Move the memory handling of RemoteProcessMemory singleton from ModelChecker to CheckerSide
Martin Quinson [Sun, 19 Mar 2023 13:29:05 +0000 (14:29 +0100)]
Move handle_message from ModelChecker to RemoteProcessMemory
Martin Quinson [Sun, 19 Mar 2023 12:54:23 +0000 (13:54 +0100)]
Move handle_waitpid from ModelChecker to RemoteProcessMemory
Martin Quinson [Sun, 19 Mar 2023 11:46:57 +0000 (12:46 +0100)]
Make a global singleton of Exploration, to kill ModelChecker
Having global singletons is far from optimal, but it's a bit like the
EngineImpl singleton in the model-checker process.
This will allow killing the ModelChecker class, whose responsibilities
were split between RemoteApp and Exploration.
Martin Quinson [Sun, 19 Mar 2023 11:15:58 +0000 (12:15 +0100)]
Gosh, how many calls to that global were there?
Martin Quinson [Sun, 19 Mar 2023 11:09:07 +0000 (12:09 +0100)]
Kill a now unused class in mc
Martin Quinson [Sun, 19 Mar 2023 09:58:20 +0000 (10:58 +0100)]
Remove some more usage of mc_model_checker in Region and snapshotting logic
Martin Quinson [Sun, 19 Mar 2023 09:10:34 +0000 (10:10 +0100)]
Another use of mc_model_checker disappears. In Snapshot.equals.
I guess that this could be done in a better way, as noted in the
comment.
Martin Quinson [Sun, 19 Mar 2023 08:52:48 +0000 (09:52 +0100)]
another mc_model_checker call location disappears
I postponed this one a lot because it's impacting non-MC code, but in
the end it went smoothly
Martin Quinson [Sun, 19 Mar 2023 08:13:06 +0000 (09:13 +0100)]
Fix make distcheck
mlaurent [Sat, 18 Mar 2023 22:29:33 +0000 (23:29 +0100)]
Merge branch 'master' of https://framagit.org/simgrid/simgrid
mlaurent [Sat, 18 Mar 2023 22:29:03 +0000 (23:29 +0100)]
Bases for wait distance guide
Martin Quinson [Sat, 18 Mar 2023 21:49:00 +0000 (22:49 +0100)]
Fix MC+clang builds
Martin Quinson [Sat, 18 Mar 2023 20:49:57 +0000 (21:49 +0100)]
Reduce a bit the adherence of handle_waitpid to ModelChecker
Martin Quinson [Sat, 18 Mar 2023 21:21:14 +0000 (21:21 +0000)]
Merge branch 'master' into 'master'
First step for guided state
See merge request simgrid/simgrid!141
mlaurent [Sat, 18 Mar 2023 14:58:43 +0000 (15:58 +0100)]
Move DPOR and sleep set algorithm from backtrack to run procedure
mlaurent [Sat, 18 Mar 2023 13:48:00 +0000 (14:48 +0100)]
Merge branch 'master' of https://framagit.org/simgrid/simgrid
mlaurent [Sat, 18 Mar 2023 13:46:53 +0000 (14:46 +0100)]
Replace todo direct access with consider methods; guided or not
mlaurent [Sat, 18 Mar 2023 13:05:13 +0000 (14:05 +0100)]
BasicGuide handle next_transition if asked to
Martin Quinson [Sat, 18 Mar 2023 11:21:30 +0000 (12:21 +0100)]
Merge CheckerSide::start() into the constructor
Martin Quinson [Sat, 18 Mar 2023 11:05:09 +0000 (12:05 +0100)]
Better split of responsibilities between CheckerSide and RemoteApp
Define in CheckerSide the callbacks that are used in there, instead of
defining them in the RemoteApp and passing them along to the CheckerSide.
Let's be optimistic: this code is every day a bit less messy.
Martin Quinson [Sat, 18 Mar 2023 10:58:55 +0000 (11:58 +0100)]
Move the checker_side_ from the ModelChecker to the RemoteApp
with an axe.
Martin Quinson [Sat, 18 Mar 2023 10:25:53 +0000 (11:25 +0100)]
One usage of mc_model_checker less
I had to reduce the const-ness of the RemoteProcessMemory variable in
Snapshot::operator==() because getting the heap modifies the
RemoteMemory object. Sorry sonar.
mlaurent [Sat, 18 Mar 2023 10:14:35 +0000 (11:14 +0100)]
Add GuidedState abstract class; move ActorState management
Martin Quinson [Fri, 17 Mar 2023 22:02:27 +0000 (23:02 +0100)]
Simplify Channel::receive by handling non-blocking recv separately
Martin Quinson [Fri, 17 Mar 2023 21:23:09 +0000 (22:23 +0100)]
A few calls to mc_model_checker less by passing more parameters
Martin Quinson [Fri, 17 Mar 2023 21:13:40 +0000 (22:13 +0100)]
Move handle_simcall from ModelChecker to RemoteApp
Martin Quinson [Fri, 17 Mar 2023 21:42:38 +0000 (21:42 +0000)]
Merge branch 'master' into 'master'
Add reference to parent state
See merge request simgrid/simgrid!140
mlaurent [Fri, 17 Mar 2023 15:55:22 +0000 (16:55 +0100)]
Add reference to parent state: only use this creation in DFSexplorer
Maxwell Pirtle [Fri, 17 Mar 2023 13:03:35 +0000 (14:03 +0100)]
Fix test in k-partial alternatives step five
The fifth step in the K-partial alternatives
actually has three possible alternatives that
can be selected instead of only one. UDPOR
would still pick e7 next regardless, but more
than one outcome is possible.
Arnaud Giersch [Thu, 16 Mar 2023 10:58:11 +0000 (11:58 +0100)]
Missing include.
Arnaud Giersch [Thu, 16 Mar 2023 10:46:41 +0000 (11:46 +0100)]
Decrease required version for nlohmann_json; add to jenkins/project_description.sh.
Version 3.7.0 is available in debian/buster-backports.
Maxwell Pirtle [Thu, 16 Mar 2023 10:36:57 +0000 (11:36 +0100)]
Remove unused code in Comb.cpp + fix MANIFEST.in
Maxwell Pirtle [Thu, 16 Mar 2023 09:56:36 +0000 (10:56 +0100)]
Add full example for K-partial alternatives
Arnaud Giersch [Wed, 15 Mar 2023 14:25:52 +0000 (15:25 +0100)]
Useless guards.
Arnaud Giersch [Wed, 15 Mar 2023 14:17:52 +0000 (15:17 +0100)]
Apply "smpi/buffering" when MC_record_replay_is_active too.
Martin Quinson [Wed, 15 Mar 2023 22:54:52 +0000 (23:54 +0100)]
Sanitize how we know the current MC mode
This can be either NONE, AppSide, CheckerSide or Replay.
Also, further reduce how often we use the mc_model_checker singleton
Martin Quinson [Wed, 15 Mar 2023 22:23:03 +0000 (23:23 +0100)]
Make it compile with all warnings enabled
Martin Quinson [Wed, 15 Mar 2023 21:55:45 +0000 (22:55 +0100)]
Document a future cleanup to do when we bump cmake version
Maxwell Pirtle [Wed, 15 Mar 2023 14:56:53 +0000 (15:56 +0100)]
Move alternative computation to Configuration for testing
Maxwell Pirtle [Wed, 15 Mar 2023 08:33:31 +0000 (09:33 +0100)]
Add semantic equivalence to UnfoldingEvent
Two UnfoldingEvents are considered to be equivalent
if they have the same associated action (same actor,
type, and times_considered) and the same immediate history.
Semantic equivalence is required since any given
event may appear in the extension set of several configurations
that UDPOR decides to explore, and thus we must be able
to determine if a newly-computed event has already been
seen before
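
The equivalence described above can be sketched as follows. The `Action` and
`Event` types are simplified, hypothetical stand-ins for the real classes, and
the immediate history is compared by pointer identity, which assumes that the
causes themselves have already been deduplicated:

```cpp
#include <set>
#include <string>

// Hypothetical, simplified action: who acts, what it does, and how many
// times this actor was already considered.
struct Action {
  long actor;
  std::string type;
  int times_considered;

  bool operator==(const Action& other) const
  {
    return actor == other.actor && type == other.type && times_considered == other.times_considered;
  }
};

// Hypothetical, simplified event: an action plus its immediate causes.
struct Event {
  Action action;
  std::set<const Event*> immediate_causes;
};

// Two events are equivalent when they carry the same action and have
// exactly the same set of immediate causes.
bool semantically_equivalent(const Event& a, const Event& b)
{
  return a.action == b.action && a.immediate_causes == b.immediate_causes;
}
```

A newly computed event can then be checked against the events already known
before being added, so that the same event is not stored twice.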
Maxwell Pirtle [Wed, 15 Mar 2023 08:05:11 +0000 (09:05 +0100)]
Add comments in K-partial alternatives computation
The computation of k-partial alternatives was added in
a prior commit. This commit implements the function
`Configuration::is_compatible_with(const History&)`
which is used during the computation of K-partial alternatives
to determine which events go with which spikes.
Note: The implementation that currently exists for
K-partial alternatives is a first go at an implementation
of the algorithm. There are clear spots within the computation
where performance may be improved with some more clever ideas.
For now, we're working towards a proof-of-concept: we can
optimize more heavily later
Arnaud Giersch [Tue, 14 Mar 2023 15:51:08 +0000 (16:51 +0100)]
Remove comments about non-existent support for smpi/privatization in MC.
[ci-skip]
Arnaud Giersch [Tue, 14 Mar 2023 15:34:28 +0000 (16:34 +0100)]
Really check the privatization option in the MCed SMPI app.
Maxwell Pirtle [Tue, 14 Mar 2023 09:37:13 +0000 (10:37 +0100)]
Add first go at implementation of K-partial alternatives
The algorithm for computing K-partial alternatives was
added in this commit. It doesn't look super clean
at this point, nor is the code as efficient as it
could be, but it certainly is a first go at an
implementation for K-partial alternatives. Subsequent
commits will attempt to clean up the code and will
implement a version of `Configuration::is_compatible_with()`
which currently remains unimplemented
Arnaud Giersch [Tue, 14 Mar 2023 08:24:44 +0000 (09:24 +0100)]
Inform if JSON lib is found.
Martin Quinson [Mon, 13 Mar 2023 23:43:27 +0000 (00:43 +0100)]
Test for JSON before using it
Martin Quinson [Mon, 13 Mar 2023 23:43:04 +0000 (00:43 +0100)]
Cosmetics
Martin Quinson [Mon, 13 Mar 2023 21:47:12 +0000 (22:47 +0100)]
fix make distcheck
Fred Suter [Mon, 13 Mar 2023 21:12:47 +0000 (21:12 +0000)]
Merge branch 'master' into 'master'
Add wfformat json DAG loader and DAG doc
See merge request simgrid/simgrid!137