\documentclass{llncs}
%\usepackage{latex8}
%\usepackage{times}
%\documentclass[a4paper,11pt]{article}
%\usepackage{fullpage}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{graphicx,subfigure,graphics}
\usepackage{epsfig}
%\usepackage[usenames]{color}
%\usepackage{latexsym,stmaryrd}
%\usepackage{amsfonts,amssymb}
\usepackage{verbatim,theorem,moreverb}
%\usepackage{float,floatflt}
\usepackage{boxedminipage}
\usepackage{url}
%\usepackage{psfig}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{algorithm}
\usepackage{algorithmic}
%\usepackage{floatfig}
%\usepackage{picins}



\def\sfixme#1{\fbox{\textbf{FIXME: }#1}}

\newcommand{\fixme}[1]{%
  \begin{center}
    \begin{boxedminipage}{.8\linewidth}
      \textsl{{\bf #1}}
    \end{boxedminipage}
  \end{center}
}
\newcommand{\FIXME}[1]{\marginpar[\null\hspace{2cm} FIXME]{FIXME} \fixme{#1}}

%\psfigurepath{.:fig:IMAGES}
\graphicspath{{.}{fig/}{IMAGES/}}

%\initfloatingfigs

\begin{document}

\title{Gridification of a Radiotherapy Dose Computation Application with the XtremWeb-CH Environment}

\author{Nabil Abdennhader\inst{1} \and Raphaël Couturier\inst{1} \and
  Julien Henriet\inst{2} \and David Laiymani\inst{1} \and Sébastien Miquée\inst{1}
  \and Marc Sauget\inst{2}}

\institute{Laboratoire d'Informatique de l'universit\'{e}
  de Franche-Comt\'{e} \\
  IUT Belfort-Montbéliard, Rue Engel Gros, 90016 Belfort - France \\
\email{raphael.couturier, david.laiymani, sebastien.miquee@univ-fcomte.fr}
\and
 FEMTO-ST, ENISYS/IRMA, F-25210 Montb\'{e}liard, France\\
}
%\email{\texttt{[laiymani]@lifc.univ-fcomte.fr}}}


\maketitle

\begin{abstract}

\end{abstract}

%-------------INTRODUCTION--------------------
\section{Introduction}

Distributed architectures are becoming mandatory for solving many
large scientific problems. Radiotherapy dose computation is a case in
point. The main goal of external beam radiotherapy is to treat tumors
while minimizing the exposure of healthy tissue. Dosimetric planning
must therefore be carried out to optimize the dose distribution within
the patient. To determine the most accurate dose distribution during
treatment planning, a compromise must be found between the precision
and the speed of the calculation. Current techniques, which rely on
analytic methods, models and databases, are fast but lack
precision. Better precision can be achieved with calculation codes
based, for example, on Monte Carlo methods. In [] the authors proposed
a novel approach using neural networks. This approach is based on the
collaboration of computation codes and multi-layer neural networks
used as universal approximators. It provides a fast and accurate
evaluation of radiation doses in any given environment for given
irradiation parameters. As the learning step is often very time
consuming, the authors of \cite{bcvsv08:ip} proposed a parallel
algorithm that decomposes the learning domain into subdomains. This
decomposition has the advantage of significantly reducing the
complexity of the target functions to approximate.

Since several classes of distributed and parallel architectures exist
(supercomputers, clusters, global computing, etc.), we have to choose
the one best suited to the parallel Neurad application. The Global
(or Volunteer) Computing model seems to be an interesting
approach. Here, the computing power is obtained by aggregating unused
(or volunteer) public resources connected to the Internet. In our
case, we can imagine, for example, that part of the architecture is
composed of some of the hospital's computers. This approach has the
advantage of being clearly cheaper than a more dedicated approach such
as the use of supercomputers or clusters.

The aim of this paper is to propose and evaluate a gridification of
the Neurad application (more precisely, of its most time-consuming
part, the learning step) using a Global Computing approach. For this,
we focus on the XtremWeb-CH environment []. We chose this environment
because it addresses the centralized nature of other global computing
environments such as XtremWeb [] or Seti []. It moves toward a
peer-to-peer approach by distributing some components of the
architecture. For instance, the computing nodes are allowed to
communicate directly. Experiments were conducted on a real Global
Computing testbed. The results are very encouraging: they exhibit an
interesting speed-up and show that the overhead induced by the use of
XtremWeb-CH is acceptable.

The paper is organized as follows. In Section 2 we present the Neurad
application, and in particular its most time-consuming part, i.e. the
learning step. Section 3 details the XtremWeb-CH environment, while
Section 4 describes the gridification of the Neurad
application. Experimental results are presented in Section 5, and we
end in Section 6 with some concluding remarks and perspectives.

\section{The Neurad application}

\begin{figure}[http]
  \centering
  \includegraphics[width=0.7\columnwidth]{figures/neurad.pdf}
  \caption{The Neurad project}
  \label{f_neurad}
\end{figure}

The \emph{Neurad}~\cite{Neurad} project presented in this paper is
part of a multi-disciplinary effort involving medical physicists and
computer scientists, whose goal is to enhance the treatment planning
of cancerous tumors by external radiotherapy. In our previous
works~\cite{RADIO09,ICANN10,NIMB2008}, we proposed an original
approach to solve scientific problems whose accurate modeling and/or
analytical description are difficult. That method is based on the
collaboration of computational codes and neural networks used as
universal interpolators. Thanks to that method, the \emph{Neurad}
software provides a fast and accurate evaluation of radiation doses in
any given environment (possibly inhomogeneous) for given irradiation
parameters. In a previous work~\cite{AES2009}, we showed the benefit
of using a distributed algorithm for the neural network learning. We
use a classical RPROP algorithm with an HPU topology to train our
neural network.
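As a reminder, and only as a sketch, the RPROP update can be written
as follows in its simplest form (without weight backtracking). The
exact variant and constants used in \emph{Neurad} are not specified
here, and the notation is an assumption made for illustration: $E$ is
the training error, $g_{ij}^{(t)} = \partial E / \partial w_{ij}$ is
its gradient with respect to weight $w_{ij}$ at epoch $t$, and
$\Delta_{ij}$ is the step size attached to that weight. Only the sign
of the gradient is used:
\begin{equation*}
  \Delta_{ij}^{(t)} =
  \begin{cases}
    \min\bigl(\eta^{+}\,\Delta_{ij}^{(t-1)},\ \Delta_{\max}\bigr)
      & \text{if } g_{ij}^{(t-1)}\, g_{ij}^{(t)} > 0,\\[2pt]
    \max\bigl(\eta^{-}\,\Delta_{ij}^{(t-1)},\ \Delta_{\min}\bigr)
      & \text{if } g_{ij}^{(t-1)}\, g_{ij}^{(t)} < 0,\\[2pt]
    \Delta_{ij}^{(t-1)} & \text{otherwise,}
  \end{cases}
\end{equation*}
\begin{equation*}
  w_{ij}^{(t+1)} = w_{ij}^{(t)}
    - \operatorname{sign}\bigl(g_{ij}^{(t)}\bigr)\,\Delta_{ij}^{(t)},
\end{equation*}
with $0 < \eta^{-} < 1 < \eta^{+}$ (values such as $\eta^{-} = 0.5$
and $\eta^{+} = 1.2$ are commonly used). The step size grows while the
gradient keeps its sign and shrinks when the sign changes, which
indicates that a minimum has been stepped over.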

Figure~\ref{f_neurad} presents the \emph{Neurad} scheme. Three parts
are clearly independent: the initial data production, the learning
process and the dose deposit evaluation. The first step, the data
production, is outside the scope of the \emph{Neurad} project. There
are many ways to obtain data about radiotherapy treatments, such as
measurement or simulation. The only essential criterion is that the
results must be obtained in a homogeneous environment. We have chosen
to use only Monte Carlo simulation because this kind of tool is the
reference in the radiotherapy domain. Data obtained with a Monte
Carlo simulator have the following advantages: accuracy, abundance,
quantified error and regularly spaced measurement points. There are
also some drawbacks, the most important being statistical noise, which
makes post-processing of the data necessary. Figure~\ref{f_tray}
presents the general behavior of a dose deposit in water.


\begin{figure}[http]
  \centering
  \includegraphics[width=0.7\columnwidth]{figures/testC.pdf}
  \caption{Dose deposit of a 24~mm wide photon beam in water (normalized values).}
  \label{f_tray}
\end{figure}

The second stage of the \emph{Neurad} project is the learning step,
which is the most time-consuming one. This step is performed off-line,
but it is important to reduce the learning time to keep the tool
workable. Indeed, if the learning time is too long (at present, it can
reach one week for a limited domain), this process cannot be launched
at any time, but only when a major modification occurs in the
environment, such as a change of context. However, it is useful to
update the knowledge of the neural network, by running the learning
process, whenever the domain evolves (a change in the material used
for the prosthesis or in the beam size, shape or energy). The learning
time depends on the volume of data, which can be very large in a real
medical context. Work has been done to reduce this learning time by
parallelizing the learning process using a partitioning of the global
dataset. The goal of this method is to train many neural networks on
sub-domains of the global dataset. After this training, these neural
networks are used together to obtain a response for the global domain
of study.
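The partitioning scheme described above can be summarized by the
sketch in Algorithm~\ref{alg:decomp-sketch}. It is only an
illustration under assumed notation, not the actual \emph{Neurad}
code: $D$ denotes the global dataset, $\Omega$ the global domain,
$\Omega_i$ the $i$-th sub-domain, $D_i$ the corresponding data subset
and $\mathcal{N}_i$ the network trained on it. Selecting a network by
the sub-domain that contains the query point is likewise an assumption
about how the networks are used together.

\begin{algorithm}[h]
  \caption{Sketch of learning by domain decomposition (assumed notation)}
  \label{alg:decomp-sketch}
  \begin{algorithmic}[1]
    \REQUIRE global dataset $D$ over the domain $\Omega$, number of sub-domains $k$
    \STATE split $\Omega$ into sub-domains $\Omega_1, \dots, \Omega_k$
    \FOR{$i = 1$ to $k$ (independent tasks, one per computing node)}
      \STATE $D_i \leftarrow \{(x, y) \in D : x \in \Omega_i\}$
      \STATE train the network $\mathcal{N}_i$ on $D_i$
    \ENDFOR
    \STATE \textbf{evaluation:} for a query point $x$, return
      $\mathcal{N}_i(x)$ for the index $i$ such that $x \in \Omega_i$
  \end{algorithmic}
\end{algorithm}

Since the $k$ training tasks share no data, they can run independently
and the learning time is bounded by that of the slowest sub-domain
rather than by the size of the whole dataset.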


\begin{figure}[h]
  \centering
  \includegraphics[width=0.5\columnwidth]{figures/overlap.pdf}
  \caption{Overlapping for a sub-network  in a two-dimensional domain with ratio
    $\alpha$.}
  \label{fig:overlap}
\end{figure}


However, performing the learning on sub-domains that form a strict
partition of the initial domain does not give results of satisfactory
quality. This comes from the fact that the accuracy of the
approximation performed by a neural network is not constant over the
learned domain. Thus, it is necessary to make the sub-domains
overlap. The overall principle is depicted in
Figure~\ref{fig:overlap}. In this way, each sub-network has an
exploitation domain smaller than its training domain, and the
differences observed at the borders are no longer relevant.
Nonetheless, in order to preserve the performance of the parallel
algorithm, it is important to set the overlapping ratio $\alpha$
carefully. It must be large enough to avoid the border errors, and as
small as possible to limit the size increase of the data subsets.
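To make this trade-off concrete, consider the following rough
estimate. It rests on assumptions made for illustration only, since
$\alpha$ is not formally defined here: a rectangular exploitation
sub-domain of size $w_x \times w_y$ is enlarged by $\alpha w_x$ and
$\alpha w_y$ on each side to form the training sub-domain, and the
data points are uniformly distributed. Writing
$n_{\mathrm{train}}$ and $n_{\mathrm{exploit}}$ for the number of
points in the two sub-domains, we get
\begin{equation*}
  \frac{n_{\mathrm{train}}}{n_{\mathrm{exploit}}}
  \approx \frac{(1 + 2\alpha)\, w_x \,(1 + 2\alpha)\, w_y}{w_x\, w_y}
  = (1 + 2\alpha)^2 .
\end{equation*}
Under these assumptions, $\alpha = 0.1$ would increase each data
subset by a factor of $1.44$ and $\alpha = 0.25$ by a factor of
$2.25$. Sub-domains located on the boundary of the global domain grow
less, since they cannot extend beyond it.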



\section{The XtremWeb-CH environment}
\input{xwch.tex}

\section{Experimental results}
\section{Conclusion and future work}



\bibliographystyle{plain}
\bibliography{biblio}



\end{document}