X-Git-Url: http://bilbo.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/f43c766932553c0f32c9e681ea5b7032003c29a7..20984b0bb3a1e3e5e213963d9182b1c15baba23c:/doc/doxygen/introduction.doc

diff --git a/doc/doxygen/introduction.doc b/doc/doxygen/introduction.doc
index de2fd4a6a6..0851aff91b 100644
--- a/doc/doxygen/introduction.doc
+++ b/doc/doxygen/introduction.doc
@@ -1,10 +1,6 @@
 /*! @page introduction Introduction to SimGrid
 
-This page does not really exist yet. In the meanwhile, please refer
-to the tutorials on the project web page, looking for the "SimGrid
-101" tutorial.
-
-SimGrid is a toolkit
+[SimGrid](http://simgrid.gforge.inria.fr/) is a toolkit
 that provides core functionalities for the simulation of distributed
 applications in heterogeneous distributed environments.
 
@@ -13,4 +9,477 @@
 distributed and parallel application scheduling on distributed computing
 platforms ranging from simple network of workstations to Computational
 Grids.
-*/
\ No newline at end of file
+# Scenario
+The goal of this practical session is to illustrate various uses of
+the MSG interface. To this end we will use the following simple setting:
+
+> Assume we have a (possibly large) bunch of (possibly large) data files
+> to process, which originally reside on a server (a.k.a. master). For
+> the sake of simplicity, we assume all input files require the same
+> amount of computation. We assume the server can be helped by a
+> (possibly large) set of worker machines. What is the best way to
+> organize the computations?
+
+Although this looks like a very simple setting, it raises several
+interesting questions:
+
+- Which algorithm should the master use to send workload?
+
+  The most obvious algorithm would be to send tasks to workers in a
+  round-robin fashion. This is the initial code we provide you.
+
+  A less obvious but probably more efficient one would be to set up
+  a request mechanism where clients first ask for tasks, which allows
+  the server to decide which request to answer and possibly to send
+  the tasks to the fastest machines. Maybe you can think of a
+  smarter mechanism...
+
+- How many tasks should a client ask for?
+
+  Indeed, if we set up a request mechanism and workers only send a
+  request when they have no more tasks to process, they are likely
+  to be poorly utilized since they will have to wait for the master
+  to consider their request and for the input data to be
+  transferred. A client should thus probably request a pool of tasks,
+  but if it requests too many tasks, this is likely to lead to poor
+  load balancing...
+
+- How does the quality of such an algorithm depend on the platform
+  characteristics? On the task characteristics?
+
+  Whenever the input communication time is very small compared to
+  processing time and workers are homogeneous, it is likely that the
+  round-robin algorithm performs very well. Would it still hold true
+  when transfer time is not negligible and the platform is, say,
+  a volunteer computing system?
+
+- The network topology interconnecting the master and the workers
+  may be quite complicated. How does such a topology impact the
+  previous result?
+
+  When data transfers are the bottleneck, it is likely that a good
+  modeling of the platform becomes essential, in which case you may
+  want to be able to account for complex platform topologies.
+
+- Do the algorithms depend on a perfect knowledge of this
+  topology?
+
+  Should we still use a flat master-worker deployment or should we
+  use a
+
+- How sensitive is such an algorithm to external workload variation?
+
+  What if bandwidth, latency and power can vary with no warning?
+  Shouldn't you study whether your algorithm is sensitive to such
+  load variations?
+
+- Although an algorithm may be more efficient than another, how
+  does it interfere with other applications?
+
+  As you can see, this very simple setting may need to evolve way
+  beyond what you initially imagined.
+
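The first question above (round-robin versus a request mechanism) can be explored with a back-of-the-envelope model before writing any MSG code. The sketch below is plain Python, not SimGrid code. It assumes zero communication cost and constant worker speeds, and every name in it (`round_robin_makespan`, `on_demand_makespan`, the `speeds` list) is hypothetical and chosen only for illustration.

```python
import heapq


def round_robin_makespan(n_tasks, speeds, task_flops):
    """Static dispatch: task i is sent to worker i % len(speeds)."""
    counts = [0] * len(speeds)
    for i in range(n_tasks):
        counts[i % len(speeds)] += 1
    # The run ends when the most loaded (relative to its speed) worker finishes.
    return max(c * task_flops / s for c, s in zip(counts, speeds))


def on_demand_makespan(n_tasks, speeds, task_flops):
    """Request-based dispatch: each task goes to the worker that is idle first."""
    # Min-heap of (time at which the worker becomes idle, worker index).
    idle_at = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(idle_at)
    makespan = 0.0
    for _ in range(n_tasks):
        now, w = heapq.heappop(idle_at)
        finish = now + task_flops / speeds[w]
        makespan = max(makespan, finish)
        heapq.heappush(idle_at, (finish, w))
    return makespan


if __name__ == "__main__":
    speeds = [1.0, 1.0, 4.0]  # hypothetical: two slow workers, one fast worker
    print("round-robin:", round_robin_makespan(12, speeds, 1.0))
    print("on-demand:  ", on_demand_makespan(12, speeds, 1.0))
```

With these made-up speeds, round-robin finishes in 4.0 time units and the on-demand variant in 2.0, while on homogeneous workers both give the same makespan. The model deliberately ignores transfer times, topology, and load variation, which are exactly the effects raised by the remaining questions.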
+  > Premature optimization is the root of all evil. -- D.E. Knuth
+
+  Furthermore, writing your own simulator is much harder than you
+  may imagine. This is why you should rely on an established and
+  flexible one.
+
+The following figure is a screenshot of [triva][fn:1] visualizing a [SimGrid
+simulation][fn:2] of two master-worker applications (one in light gray and
+the other in dark gray) running concurrently and showing resource
+usage over a long period of time.
+
+![Test](./sc3-description.png)
+
+# Prerequisites
+
+## Tutorials
+
+A lot of information on how to install and use SimGrid is
+available in the [online documentation][fn:4] and in the tutorials:
+
+- http://simgrid.gforge.inria.fr/tutorials/simgrid-use-101.pdf
+- http://simgrid.gforge.inria.fr/tutorials/simgrid-tracing-101.pdf
+- http://simgrid.gforge.inria.fr/tutorials/simgrid-platf-101.pdf
+
+## Installing SimGrid
+
+    sudo apt-get install libsimgrid-dev
+
+This tutorial requires SimGrid 3.8 or later, so you may need to get
+the [Debian packages](http://packages.debian.org/libsimgrid-dev).
+
+# Recommended Steps
+
+## Installing Viva
+
+This [software][fn:1] will be useful to make fancy graph or treemap
+visualizations and get a better understanding of simulations. You
+will first need to install pajeng:
+
+~~~~{.sh}
+sudo apt-get install git cmake build-essential libqt4-dev libboost-dev freeglut3-dev
+git clone https://github.com/schnorr/pajeng.git
+cd pajeng && mkdir -p build && cd build && cmake ../ -DCMAKE_INSTALL_PREFIX=$HOME && make -j install
+cd ../../
+~~~~
+
+Then you can install viva.
+
+~~~~{.sh}
+sudo apt-get install libboost-dev libconfig++-dev libconfig8-dev libgtk2.0-dev freeglut3-dev
+git clone https://github.com/schnorr/viva.git
+cd viva && mkdir -p build_graph && cd build_graph && cmake ../ -DTUPI_LIBRARY=ON -DVIVA=ON -DCMAKE_INSTALL_PREFIX=$HOME && make -j install
+cd ../../
+~~~~
+
+## Installing Paje
+
+This [software][fn:5] provides a Gantt-chart visualization.
+
+~~~~{.sh}
+sudo apt-get install paje.app
+~~~~
+
+## Installing Vite
+
+This software provides a [Gantt-chart visualization][fn:6].
+
+~~~~{.sh}
+sudo apt-get install vite
+~~~~
+
+# Let's Get Started
+## Setting up and Compiling
+
+The corresponding archive with all source files and platform files
+can be obtained [here](http://simgrid.gforge.inria.fr/tutorials/msg-tuto/msg-tuto.tgz).
+
+~~~~{.sh}
+tar zxf msg-tuto.tgz
+cd msg-tuto/src
+make
+~~~~
+
+As you can see, there is already a nice Makefile that compiles
+everything for you. Now the tiny example has been compiled and it
+can be easily run as follows:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml 2>&1
+~~~~
+
+If you create a single self-contained C file named foo.c, the
+corresponding program will be simply compiled and linked with
+SimGrid by typing:
+
+~~~~{.sh}
+make foo
+~~~~
+
+For a more "fancy" output, you can try:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml 2>&1 | simgrid-colorizer
+~~~~
+
+For a really fancy output, you should use [viva/triva][fn:1]:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml --cfg=tracing:yes \
+    --cfg=tracing/uncategorized:yes --cfg=viva/uncategorized:uncat.plist
+LANG=C ; viva simgrid.trace uncat.plist
+~~~~
+
+For a more classical Gantt-Chart visualization, you can produce a
+[Paje][fn:5] trace:
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml --cfg=tracing:yes \
+    --cfg=tracing/msg/process:yes
+LANG=C ; Paje simgrid.trace
+~~~~
+
+Alternatively, you can use [vite][fn:6].
+
+~~~~{.sh}
+./masterworker0 platforms/platform.xml deployment0.xml --cfg=tracing:yes \
+    --cfg=tracing/msg/process:yes --cfg=tracing/basic:yes
+vite simgrid.trace
+~~~~
+
+## Getting Rid of Workers in the Deployment File
+
+In the previous example, the deployment file `deployment0.xml`
+is tightly connected to the platform file `platform.xml` and a
+worker process is launched on each host:
+
+~~~~{.xml}
+
+
+