Proofreading by Ingrid

author couturie <couturie@carcariass.(none)>

Sun, 16 Jan 2011 09:02:41 +0000 (10:02 +0100)

committer couturie <couturie@carcariass.(none)>

Sun, 16 Jan 2011 09:02:41 +0000 (10:02 +0100)
diff --git a/gpc2011.tex b/gpc2011.tex

index fc4e16a..8c7d299 100644 (file)
--- a/gpc2011.tex
+++ b/gpc2011.tex
@@ -93,9 +93,9 @@ dose distribution during treatment planning, a compromise must be
  found between the precision and the speed of calculation. Current
  techniques, using analytic methods, models and databases, are rapid
  but lack precision. Enhanced precision can be achieved by using
-calculation codes based, for example, on Monte Carlo methods. The main
-drawback of these methods is their computation times which can be
-rapidly huge. In \cite{NIMB2008} the authors proposed a novel approach, called
+calculation codes based, for example, on the Monte Carlo methods. The main
+drawback of these methods is their computation times which can 
+rapidly become huge. In \cite{NIMB2008} the authors proposed a new approach, called
  Neurad, using neural networks. This approach is based on the
  collaboration of computation codes and multi-layer neural networks
  used as universal approximators. It provides a fast and accurate
@@ -103,8 +103,8 @@ evaluation of radiation doses in any given environment for given
  irradiation parameters. As the learning step is often very time
  consuming, in \cite{AES2009} the authors proposed a parallel
  algorithm that makes it possible to decompose the learning domain into
-subdomains. The decomposition has the advantage to significantly
-reduce the complexity of the target functions to approximate.
+subdomains. The decomposition has the advantage of significantly
+reducing the complexity of the target functions to approximate.
  
  Now, as there exist several classes of distributed/parallel
  architectures (supercomputers, clusters, global computing\dots{}) we
@@ -112,17 +112,17 @@ have to choose the best suited one for the parallel Neurad
  application. The volunteer (or global) computing model seems to be an
  interesting approach. Here, the computing power is obtained by
  aggregating unused (or volunteer) public resources connected to the
-Internet. For our case, we can imagine for example, that a part of the
+Internet. In our case, we can imagine, for example, that a part of the
  architecture will be composed of some of the different computers of
-the hospital. This approach presents the advantage to be clearly
+the hospital. This approach presents the advantage of being clearly
  cheaper than a more dedicated approach like the use of supercomputers
  or clusters. Furthermore and as we will see in the remainder, the
-studied parallel algorithm fits well this computation model.
+studied parallel algorithm corresponds very well to this computation model.
  
  The aim of this paper is to propose and evaluate a gridification of
  the Neurad application (more precisely, of the most time consuming
  part, the learning step) using a volunteer computing approach. For
-this, we focus on the XtremWeb-CH environment\cite{xwch}. We choose
+this, we focus on the XtremWeb-CH environment\cite{xwch}. We chose
  this environment because it tackles the centralized aspect of other
  global computing environments such as XtremWeb\cite{xtremweb} or
  Seti\cite{seti}. It tends to a peer-to-peer approach by distributing
@@ -152,13 +152,13 @@ The \emph{Neurad}~\cite{Neurad} project presented in this paper takes place in a
  multi-disciplinary project, involving medical physicists and computer scientists
  whose goal is to enhance the  treatment planning of cancerous tumors by external
  radiotherapy.   In our  previous works~\cite{RADIO09,ICANN10,NIMB2008},  we have
-proposed  an  original approach  to  solve  scientific  problems whose  accurate
+proposed  an  original approach  to  solving  scientific  problems whose  accurate
  modeling and/or  analytical description are  difficult. That method is  based on
  the collaboration of  computational codes and neural networks  used as universal
  interpolator. Thanks to that method,  the \emph{Neurad} software provides a fast
  and accurate  evaluation of radiation  doses in any given  environment (possibly
  inhomogeneous) for  given irradiation  parameters. We have  shown in  a previous
-work (\cite{AES2009}) the interest to use a distributed algorithm for the neural
+work (\cite{AES2009}) the benefit of using a distributed algorithm for the neural
  network learning. We  use a classical RPROP~\footnote{Resilient backpropagation}
  algorithm with a  HPU~\footnote{High order processing units} topology  to do the
  training of our neural network.
@@ -188,7 +188,7 @@ criterion is that the result must be obtained in an homogeneous environment.
  % \end{figure}
  
  The secondary stage  of the {\it{Neurad}} project is the  learning step and this
-is  the most time  consuming step.  This step  is performed  off-line but  it is
+is  the most time  consuming step.  This step  is performed  offline but  it is
  important to  reduce the time used for  the learning process to  keep a workable
  tool. Indeed, if the learning time is  too long (for the moment, this time could
  reach one week for a limited domain), this process should not be launched at any
@@ -196,8 +196,7 @@ time,  but only  when a  major modification  occurs in  the environment,  like a
  change  of  context for  instance.  However, it  is  interesting  to update  the
  knowledge of the neural network, by  using the learning process, when the domain
  evolves (evolution in material used for  the prosthesis or evolution on the beam
-(size, shape or energy)). The learning time is related to the volume of data who
-could be  very important in  a real  medical context.  A  work has been  done to
+(size, shape or energy)). The learning time is related to the volume of data, which could be  very large in  a real  medical context.  Some work has been  done to
  reduce this  learning time with the  parallelization of the  learning process by
  using a partitioning method of the global dataset. The goal of this method is to
  train  many neural networks  on sub-domains  of the  global dataset.  After this
@@ -216,7 +215,7 @@ response for the global domain of study.
  % j'ai relu mais pas vu le probleme 
   
  However, performing the learning on  sub-domains constituting a partition of the
-initial domain is  not satisfying according to the quality  of the results. This
+initial domain may  not be satisfactory, depending on  the required quality  of the results. This
  comes from the fact that the accuracy of the approximation performed by a neural
  network is not constant over the learned domain. Thus, it is necessary to use an
  overlapping  of   the  sub-domains.  The   overall  principle  is   depicted  in
@@ -224,13 +223,10 @@ Figure~\ref{fig:overlap}.  In this  way,  each sub-network  has an  exploitation
  domain  smaller than its  training domain  and the  differences observed  at the
  borders  are  no  longer  relevant.   Nonetheless,  in  order  to  preserve  the
  performance  of the parallel  algorithm, it  is important  to carefully  set the
-overlapping  ratio $\alpha$.  It  must be  large  enough to  avoid the  border's
+overlapping  ratio $\alpha$.  It  must be both  large  enough to  avoid  border
  errors,  and as  small  as  possible to  limit  the size  increase  of the  data
  subsets~\cite{AES2009}.
  
-%(Qu'en est-il pour nos test ?).
-% Ce paramètre a deja été etudié dans un précédent papier, il a donc choisi d'être fixe
-% pour ces tests-ci.
  
  
  \section{The XtremWeb-CH environment}
@@ -249,14 +245,13 @@ density. This part is out of the scope of this paper.
  %Multiple ``views'' can be
  %superposed in order to obtain a more accurate learning. 
  
-The second step of the application, and the most time consuming, is
-the learning itself. This is the one which has been parallelized,
-using the XWCH environment. As exposed in the section 2, the
-parallelization relies on a partitioning of the global
-dataset. Following this partitioning all learning tasks are executed
-in parallel independently with their own local data part, with no
-communication, following the fork/join model. Clearly, this
-computation fits well with the model of the chosen middleware.
+The second step of the application, and the most time consuming, is the learning
+in  itself.  This  is the  one  which  has  been  parallelized, using  the  XWCH
+environment.  As  explained  in  Section  2,  the  parallelization  relies  on  a
+partitioning  of the global  dataset. Following  this partitioning  all learning
+tasks are  independently executed  in parallel with  their own local  data part,
+with no communication, following  the fork/join model. Clearly, this computation
+fits well with the model of the chosen middleware.
  
  \begin{figure}[ht]
    \centering
@@ -296,8 +291,8 @@ data and on a real volunteer computing testbed.
  \subsubsection{Experimental conditions}
  \label{sec:neurad_cond}
  
-The size of the input data is about 2.4Gb. In order to avoid that
-noise appears and disturbs the learning process, these data can be
+The size of the input data is about 2.4Gb. To prevent noise from
+appearing and disturbing the learning process, these data can be
  divided into, at most, 25 parts. This generates input data parts of
  about 15Mb (in a compressed format). The output data, which are
  retrieved after the process, are about 30Kb for each part. We used two
@@ -310,22 +305,22 @@ distinct deployments of XWCH:
    in Belfort, France.
  
  \item The second deployment, called ``local XWCH'' is a local
-  deployment where both coordinator, warehouses and workers were in
-  the same local cluster.  
+  deployment where the coordinator, warehouses and workers were all in
+  the same local cluster.
  
  \end{enumerate}
-For both deployments, le local cluster is a campus cluster and during
+For both deployments, the local cluster is a campus cluster and during
  the day these machines were used by students of the Computer Science
  Department of the IUT of Belfort.  Unfortunately, the data
  decomposition limitation does not allow us to use more than 25
  computers (XWCH workers).
  
-In order to evaluate the overhead induced by the use of the platform
-we have furthermore compared the execution of the Neurad application
-with and without the XWCH platform. For the latter case, we mean that the
-testbed consists only in workers deployed with their respective data
-by the use of shell scripts. No specific middleware was used and the
-workers were in the same local cluster.
+In order  to evaluate the overhead  induced by the  use of the platform  we have
+furthermore compared  the execution of  the Neurad application with  and without
+the XWCH platform. For  the latter case, we  mean that  the
+testbed consists only of workers deployed  with their respective data by the use
+of shell  scripts. No specific middleware was  used and the workers  were in the
+same local cluster.
  
  Finally, five computation precisions were used: $10^{-1}$, $0.75 \times 10^{-1}$,
  $0.50 \times 10^{-1}$, $0.25 \times 10^{-1}$, and $10^{-2}$.
@@ -338,7 +333,7 @@ $0.50e^{-1}$, $0.25e^{-1}$, and $1e^{-2}$.
  Table \ref{tab:neurad_res} presents the execution times of the Neurad
  application on 25 machines with XWCH (local and distributed
  deployment) and without XWCH. These results correspond to the measures
-of the same steps for both kinds of execution, i.e. sending of local
+of the same steps for both kinds of execution, i.e. the sending of local
  data and the executable, the learning process, and retrieving the
  results. Results represent the average time of $5$ executions.
  
@@ -380,16 +375,16 @@ results. Results represent the average time of $5$ executions.
  As we can see, in the case of a local deployment the overhead induced
  by the use of the XWCH platform is about $7\%$. It is clearly a low
  overhead. Now, for the distributed deployment, the overhead is about
-$34\%$. Regarding to the benefits of the platform, it is a very
+$34\%$. Given the benefits of the platform, it is a very
  acceptable overhead which can be explained by the following points.
  
  First, we point out that the conditions of executions are not really
-identical between with and without XWCH contexts. For this last one,
-though the same steps were done, all transfer processes are inside a
+identical between the contexts with and without XWCH. For the latter,
+though the same steps were carried out, all transfer processes are inside a
  local cluster with a high bandwidth and a low latency. Whereas when
  using XWCH, all transfer processes (between datawarehouses, workers,
  and the coordinator) used a wide area network with a smaller
-bandwidth. In addition, in executions without XWCH, all the machines
+bandwidth. In addition, in the executions without XWCH, all the machines
  started the computation immediately, whereas when using the XWCH
  platform, a latency is introduced by the fact that a computation
  starts on a machine, only when this one requests a task.
@@ -408,14 +403,14 @@ application, the Neurad application. This radiotherapy application
  tries to optimize the irradiated dose distribution within a
  patient. Based on a multi-layer neural network, this application
  presents a very time consuming step, i.e. the learning step. Due to the
-computing characteristics of this step, we choose to parallelize it
+computing characteristics of this step, we have chosen to parallelize it
  using the XtremWeb-CH volunteer computing environment. Obtained
  experimental results show good speed-ups and underline that overheads
  induced by XWCH are very acceptable, making it a good candidate
  for deploying parallel applications over a volunteer computing environment.
  
-Our future works include the testing of the application on a more
-large scale testbed. This implies, the choice of a data input set
+Our future work includes testing the application on a
+larger-scale testbed. This implies the choice of an input data set
  allowing a finer decomposition. Unfortunately, this choice of input
  data is not trivial and relies on a large number of parameters.
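
The overlapping decomposition and the fork/join execution described in the patched text can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not code from the Neurad or XWCH sources: the function names are hypothetical, the domain is reduced to a one-dimensional range of sample indices, the overlap is taken as $\alpha$ times the sub-domain length on each side, and the learning step is replaced by a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def overlapping_partition(n_samples, n_parts, alpha):
    """Split indices [0, n_samples) into n_parts contiguous exploitation
    ranges, then widen each range by alpha * (range length) on both sides
    to obtain its training range, clipped to the global domain.
    (Assumption: alpha is relative to the sub-domain length.)"""
    if n_parts < 1 or n_samples < n_parts:
        raise ValueError("need at least one sample per part")
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    bounds = [i * n_samples // n_parts for i in range(n_parts + 1)]
    parts = []
    for lo, hi in zip(bounds, bounds[1:]):
        margin = int(round(alpha * (hi - lo)))
        train = (max(0, lo - margin), min(n_samples, hi + margin))
        parts.append({"exploit": (lo, hi), "train": train})
    return parts


def train_subnetwork(part):
    """Placeholder for the learning of one sub-network: it only reports
    how many samples it would have been trained on."""
    lo, hi = part["train"]
    return {"exploit": part["exploit"], "n_train": hi - lo}


def fork_join_learning(parts, max_workers=4):
    """Fork: one independent task per data part, with no communication
    between tasks. Join: collect all results, in submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_subnetwork, parts))


if __name__ == "__main__":
    for result in fork_join_learning(overlapping_partition(100, 4, 0.2)):
        print(result)
```

With these assumptions, a larger $\alpha$ makes every training range longer than its exploitation range, which is the size increase of the data subsets that the text asks to keep small. A thread pool only stands in for the workers here; on a volunteer platform each task would run on a separate machine.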
  
diff --git a/xwch.tex b/xwch.tex

index bdc94e8..ab215cb 100644 (file)
--- a/xwch.tex
+++ b/xwch.tex
@@ -1,21 +1,21 @@
  %------------------------------------
  % The XtremWeb-CH environment
  %------------------------------------
-XtremWeb-CH (XWCH) is a volunteer computing inspired, large-scale
-computing platform for distributed applications. It consists of three
+XtremWeb-CH (XWCH) is a large-scale, volunteer-computing-inspired
+computing platform for distributed applications. It comprises three
  components: one coordinator, a set of workers and at least one
  warehouse. Client programs use these components.
  
  The coordinator is the main component of the XWCH platform. It
  controls user access and schedules jobs to workers. It provides a web
  interface for managing jobs and users, and a set of web
-services. These are user service and worker/warehouse services
+services. These are user services and worker/warehouse services
  implemented using WSDL (Web Service Description Language)
  \cite{WebServ2002}, that simplifies client development for languages
  that support it (and most popular programming languages do).
  
  A worker is a Java daemon that runs on the user machine. Assumed to be
-volatile, the workers report periodically themselves to the
+volatile, the workers periodically report themselves to the
  coordinator, accept jobs, retrieve input, compute jobs, and store the
  results of the computation on warehouses. If the coordinator does not
  receive a signal from a worker, it will simply remove it from the
@@ -41,7 +41,7 @@ Job submission is done by a client program which is written using a
  flexible API, available for Java and C/C++ programs. The client
  program runs on a “client node” and calls the user services to submit
  jobs (Figure \ref{xwch}, (1)). The main flexibility provided by the use of this
-architecture is to control and generate dynamically jobs especially
+architecture is the ability to control and dynamically generate jobs, especially
  when their number cannot be known in advance. Communications between
  the coordinator and the workers are always initiated by the workers
  following a pull model (Figure \ref{xwch}, (2)):
@@ -66,7 +66,7 @@ typical 32-bit GNU/Linux computer are:
   
  Experiments presented in \cite{ccgridpaper} show that the
  performance of XWCH is comparable with Condor \cite{Condor1988},
-another non-intrusive computing system that has similar functionality
+another non-intrusive computing system that has similar functionalities
  but is somewhat more difficult to install.
  
  The main characteristics of the new version of XWCH, compared to
author	couturie <couturie@carcariass.(none)>
	Sun, 16 Jan 2011 09:02:41 +0000 (10:02 +0100)
committer	couturie <couturie@carcariass.(none)>
	Sun, 16 Jan 2011 09:02:41 +0000 (10:02 +0100)
gpc2011.tex		patch \| blob \| history
xwch.tex		patch \| blob \| history