15

[hdrcouchot.git] / chaosANN.tex
diff --git a/chaosANN.tex b/chaosANN.tex

index 22d5258134aa457f6a65b4edd2cc0e38a53d6ba8..239251ecbcdda8b5ea03ba12ee8361656930cb4c 100644 (file)
--- a/chaosANN.tex
+++ b/chaosANN.tex
@@ -158,7 +158,7 @@ de vérifier si le graphe d'itérations
  $\textsc{giu}(f)$ est fortement connexe.
  
  
-\section{Un réseau de neurones peut-il approximer un 
+\section{Un réseau de neurones peut-il approximer
  des itération unaires chaotiques?}
  
  Cette section s'intéresse à étudier le comportement d'un réseau de neurones 
@@ -176,7 +176,7 @@ de de fonction à quatre éléments.
  \subsection{Construction du réseau} 
  \label{section:translation}
  
-On considère par exemple les deux fonctions $f$ and $g$ de0 $\Bool^4$
+On considère par exemple les deux fonctions $f$ and $g$ de $\Bool^4$
  dans $\Bool^4$ définies par:
  
  \begin{eqnarray*}
@@ -268,290 +268,162 @@ ont montré qu'un MLP pouvait apprendre la dynamique du circuit de Chua.
  Ce réseau avec rétropropagation est composé de  deux couches 
  et entrainé à l'aide d'une  propagation arrière Bayesienne.
  
-In  these experiments  we consider  MLPs  having one  hidden layer  of
-sigmoidal  neurons  and  output   neurons  with  a  linear  activation
-function.     They    are    trained    using    the    Limited-memory
-Broyden-Fletcher-Goldfarb-Shanno quasi-newton algorithm in combination
-with the Wolfe linear search.  The training process is performed until
-a maximum number of epochs  is reached.  To prevent overfitting and to
-estimate the  generalization performance we use  holdout validation by
-splitting the  data set into  learning, validation, and  test subsets.
-These subsets  are obtained through  random selection such  that their
-respective size represents 65\%, 10\%, and 25\% of the whole data set.
-
-Several  neural  networks  are  trained  for  both  iterations  coding
-schemes.   In  both  cases   iterations  have  the  following  layout:
-configurations of  four components and  strategies with at  most three
-terms. Thus, for  the first coding scheme a data  set pair is composed
-of 6~inputs and 5~outputs, while for the second one it is respectively
-3~inputs and 2~outputs. As noticed at the end of the previous section,
-this  leads to  data sets  that  consist of  2304~pairs. The  networks
-differ  in the  size of  the hidden  layer and  the maximum  number of
-training epochs.  We remember that  to evaluate the ability  of neural
-networks to  predict a  chaotic behavior for  each coding  scheme, the
-trainings of two data sets, one of them describing chaotic iterations,
-are compared.
-
-Thereafter we give,  for the different learning setups  and data sets,
-the mean prediction success rate obtained for each output. Such a rate
-represents the percentage of  input-output pairs belonging to the test
-subset  for  which  the   corresponding  output  value  was  correctly
-predicted.   These values are  computed considering  10~trainings with
-random  subsets  construction,   weights  and  biases  initialization.
-Firstly, neural networks having  10 and 25~hidden neurons are trained,
-with   a  maximum   number  of   epochs  that   takes  its   value  in
-$\{125,250,500\}$  (see Tables~\ref{tab1} and  \ref{tab2}).  Secondly,
-we refine the second coding scheme by splitting the output vector such
-that   each  output   is  learned   by  a   specific   neural  network
-(Table~\ref{tab3}). In  this last  case, we increase  the size  of the
-hidden layer up to 40~neurons and we consider larger number of epochs.
+Le choix de l'achitecture du réseau ainsi que de la méthode d'apprentissage 
+ont été détaillé dans~\cite{bcgs12:ij}.
+En pratique, nous avons considéré des configurations de
+quatre éléments booléens 
+et une stratégie fixe de longueur 3.
+Pour le premier codage, nous avons ainsi 6~entrées et 5~sorties
+tandis que pour le second, uniquement  3 entrées et 2 sorties.
+Cela engendre ainsi 2304~combinaisons possibles comme détaillé à la 
+section précédente.
+
+
  
  \begin{table}[htbp!]
-\caption{Prediction success rates for configurations expressed as boolean vectors.}
-\label{tab1}
  \centering {\small
  \begin{tabular}{|c|c||c|c|c|}
  \hline 
-\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\
+\multicolumn{5}{|c|}{Topologie du réseau: 6~entrées, 5~sorties, 1~couche cachée} \\
  \hline
  \hline
-\multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
+\multicolumn{2}{|c||}{Neurones cachés} & \multicolumn{3}{c|}{10 neurones} \\
  \cline{3-5} 
  \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\ 
  \hline
-\multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 90.92\% & 91.75\% & 91.82\% \\ 
-& Output~(2) & 69.32\% & 78.46\% & 82.15\% \\
-& Output~(3) & 68.47\% & 78.49\% & 82.22\% \\
-& Output~(4) & 91.53\% & 92.37\% & 93.4\% \\
+\multirow{6}{*}{\rotatebox{90}{Chaotique $g$ }}&Entrée~(1) & 90.92\% & 91.75\% & 91.82\% \\ 
+& Entrée~(2) & 69.32\% & 78.46\% & 82.15\% \\
+& Entrée~(3) & 68.47\% & 78.49\% & 82.22\% \\
+& Entrée~(4) & 91.53\% & 92.37\% & 93.4\% \\
  & Config. & 36.10\% & 51.35\% & 56.85\% \\
-& Strategy~(5) & 1.91\% & 3.38\% & 2.43\% \\
+& Stratégie~(5) & 1.91\% & 3.38\% & 2.43\% \\
  \hline
-\multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.64\% & 98.10\% & 98.20\% \\
-& Output~(2) & 95.15\% & 95.39\% & 95.46\% \\
-& Output~(3) & 100\% & 100\% & 100\% \\
-& Output~(4) & 97.47\% & 97.90\% & 97.99\% \\
+\multirow{6}{*}{\rotatebox{90}{Non-chaotic $f$}}&Entrée~(1) & 97.64\% & 98.10\% & 98.20\% \\
+& Entrée~(2) & 95.15\% & 95.39\% & 95.46\% \\
+& Entrée~(3) & 100\% & 100\% & 100\% \\
+& Entrée~(4) & 97.47\% & 97.90\% & 97.99\% \\
  & Config. & 90.52\% & 91.59\% & 91.73\% \\
-& Strategy~(5) & 3.41\% & 3.40\% & 3.47\% \\
+& Stratégie~(5) & 3.41\% & 3.40\% & 3.47\% \\
  \hline
  \hline
-\multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{25 neurons} \\ %& \multicolumn{3}{|c|}{40 neurons} \\
-\cline{3-5} 
+\multicolumn{2}{|c||}{Neurones cachés} & \multicolumn{3}{c|}{25 neurones} \\
+\cline{3-5} \\%& \multicolumn{3}{|c|}{40 neurons} \\
  \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\ %& 125 & 250 & 500 \\ 
  \hline
-\multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 91.65\% & 92.69\% & 93.93\% \\ %& 91.94\% & 92.89\% & 94.00\% \\
-& Output~(2) & 72.06\% & 88.46\% & 90.5\% \\ %& 74.97\% & 89.83\% & 91.14\% \\
-& Output~(3) & 79.19\% & 89.83\% & 91.59\% \\ %& 76.69\% & 89.58\% & 91.84\% \\
-& Output~(4) & 91.61\% & 92.34\% & 93.47\% \\% & 82.77\% & 92.93\% & 93.48\% \\
+\multirow{6}{*}{\rotatebox{90}{Chaotique $g$}}&Entrée~(1) & 91.65\% & 92.69\% & 93.93\% \\ %& 91.94\% & 92.89\% & 94.00\% \\
+& Entrée~(2) & 72.06\% & 88.46\% & 90.5\% \\ %& 74.97\% & 89.83\% & 91.14\% \\
+& Entrée~(3) & 79.19\% & 89.83\% & 91.59\% \\ %& 76.69\% & 89.58\% & 91.84\% \\
+& Entrée~(4) & 91.61\% & 92.34\% & 93.47\% \\% & 82.77\% & 92.93\% & 93.48\% \\
  & Config. & 48.82\% & 67.80\% & 70.97\% \\%& 49.46\% & 68.94\% & 71.11\% \\
-& Strategy~(5) & 2.62\% & 3.43\% & 3.78\% \\% & 3.10\% & 3.10\% & 3.03\% \\
+& Stratégie~(5) & 2.62\% & 3.43\% & 3.78\% \\% & 3.10\% & 3.10\% & 3.03\% \\
  \hline
-\multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.87\% & 97.99\% & 98.03\% \\ %& 98.16\% \\
-& Output~(2) & 95.46\% & 95.84\% & 96.75\% \\ % & 97.4\% \\
-& Output~(3) & 100\% & 100\% & 100\% \\%& 100\% \\
-& Output~(4) & 97.77\% & 97.82\% & 98.06\% \\%& 98.31\% \\
+\multirow{6}{*}{\rotatebox{90}{Non-chaotique $f$}}&Entrée~(1) & 97.87\% & 97.99\% & 98.03\% \\ %& 98.16\% \\
+& Entrée~(2) & 95.46\% & 95.84\% & 96.75\% \\ % & 97.4\% \\
+& Entrée~(3) & 100\% & 100\% & 100\% \\%& 100\% \\
+& Entrée~(4) & 97.77\% & 97.82\% & 98.06\% \\%& 98.31\% \\
  & Config. & 91.36\% & 91.99\% & 93.03\% \\%& 93.98\% \\
-& Strategy~(5) & 3.37\% & 3.44\% & 3.29\% \\%& 3.23\% \\
+& Stratégie~(5) & 3.37\% & 3.44\% & 3.29\% \\%& 3.23\% \\
  \hline
  \end{tabular}
  }
+\caption{Taux de prédiction lorsque les configurations sont exprimées comme un vecteur booléen.}
+\label{tab1}
  \end{table}
+Le tableau~\ref{tab1} synthétise les résultats obtenus avec le premier 
+codage. Sans surprise, la précision de la prédiction croit 
+avec l'Epoch et le nombre de neurones sur la couche cachée.
+Dans tous les cas, les résultats sont plus précis dans le cas non 
+chaotique que dans l'autre. Enfin, le réseau ne parvient jamais à
+apprendre le comportement de la stratégie.
  
-Table~\ref{tab1}  presents the  rates  obtained for  the first  coding
-scheme.   For  the chaotic  data,  it can  be  seen  that as  expected
-configuration  prediction becomes  better  when the  number of  hidden
-neurons and maximum  epochs increases: an improvement by  a factor two
-is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for
-25~neurons  and  500~epochs). We  also  notice  that  the learning  of
-outputs~(2)   and~(3)  is   more  difficult.    Conversely,   for  the
-non-chaotic  case the  simplest training  setup is  enough  to predict
-configurations.  For all these  feedforward network topologies and all
-outputs the  obtained results for the non-chaotic  case outperform the
-chaotic  ones. Finally,  the rates  for the  strategies show  that the
-different feedforward networks are unable to learn them.
-
-For  the  second  coding   scheme  (\textit{i.e.},  with  Gray  Codes)
-Table~\ref{tab2} shows  that any network learns about  five times more
-non-chaotic  configurations than  chaotic  ones.  As  in the  previous
-scheme,       the      strategies      cannot       be      predicted.
-Figures~\ref{Fig:chaotic_predictions}                              and
-\ref{Fig:non-chaotic_predictions} present the predictions given by two
-feedforward multilayer  perceptrons that were  respectively trained to
-learn chaotic  and non-chaotic data,  using the second  coding scheme.
-Each figure  shows for  each sample of  the test  subset (577~samples,
-representing 25\%  of the 2304~samples) the  configuration that should
-have been predicted and the one given by the multilayer perceptron. It
-can be  seen that for  the chaotic data  the predictions are  far away
-from the  expected configurations.  Obviously,  the better predictions
-for the non-chaotic data reflect their regularity.
-
-Let us now compare the  two coding schemes. Firstly, the second scheme
-disturbs  the   learning  process.   In   fact  in  this   scheme  the
-configuration is always expressed as  a natural number, whereas in the
-first one  the number  of inputs follows  the increase of  the Boolean
-vectors coding configurations. In this latter case, the coding gives a
-finer information on configuration evolution.
  \begin{table}[b]
-\caption{Prediction success rates for configurations expressed with Gray code}
-\label{tab2}
  \centering
  \begin{tabular}{|c|c||c|c|c|}
  \hline 
-\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\
+\multicolumn{5}{|c|}{Topologie du réseau: 3~entrées, 2~sorties, 1~couche cachée} \\
  \hline
  \hline
-& Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
+& Neurones cachés & \multicolumn{3}{c|}{10 neurones} \\
  \cline{2-5}
  & Epochs & 125 & 250 & 500 \\ %& 1000 
  \hline
-\multirow{2}{*}{Chaotic}& Config.~(1) & 13.29\% & 13.55\% & 13.08\% \\ %& 12.5\%
-& Strategy~(2) & 0.50\% & 0.52\% & 1.32\% \\ %& 1.42\%
+\multirow{2}{*}{Chaotique $g$}& Config.~(1) & 13.29\% & 13.55\% & 13.08\% \\ %& 12.5\%
+& Stratégie~(2) & 0.50\% & 0.52\% & 1.32\% \\ %& 1.42\%
  \hline
-\multirow{2}{*}{Non-Chaotic}&Config.~(1) & 77.12\% & 74.00\% & 72.60\% \\ %& 75.81\% 
-& Strategy~(2) & 0.42\% & 0.80\% & 1.16\% \\ %& 1.42\% 
+\multirow{2}{*}{Non-Chaotique $f$}&Config.~(1) & 77.12\% & 74.00\% & 72.60\% \\ %& 75.81\% 
+& Stratégie~(2) & 0.42\% & 0.80\% & 1.16\% \\ %& 1.42\% 
  \hline
  \hline
-& Hidden neurons & \multicolumn{3}{c|}{25 neurons} \\
+& Neurones cachés& \multicolumn{3}{c|}{25 neurones} \\
  \cline{2-5}
  & Epochs & 125 & 250 & 500 \\ %& 1000 
  \hline
-\multirow{2}{*}{Chaotic}& Config.~(1) & 12.27\% & 13.15\% & 13.05\% \\ %& 15.44\%
-& Strategy~(2) & 0.71\% & 0.66\% & 0.88\% \\ %& 1.73\%
+\multirow{2}{*}{Chaotique $g$ }& Config.~(1) & 12.27\% & 13.15\% & 13.05\% \\ %& 15.44\%
+& Stratégie~(2) & 0.71\% & 0.66\% & 0.88\% \\ %& 1.73\%
  \hline
-\multirow{2}{*}{Non-Chaotic}&Config.~(1) & 73.60\% & 74.70\% & 75.89\% \\ %& 68.32\%
-& Strategy~(2) & 0.64\% & 0.97\% & 1.23\% \\ %& 1.80\%
+\multirow{2}{*}{Non-Chaotique $f$}&Config.~(1) & 73.60\% & 74.70\% & 75.89\% \\ %& 68.32\%
+& Stratégie~(2) & 0.64\% & 0.97\% & 1.23\% \\ %& 1.80\%
  \hline
  \end{tabular}
+\caption{Taux de prédiction lorsque les configurations sont exprimées 
+à l'aide de codes de Gray.}
+\label{tab2}
  \end{table}
  
-\begin{figure}
-  \centering
-  \includegraphics[scale=0.5]{images/chaotic_trace2}
-  \caption {Second coding scheme - Predictions obtained for a chaotic test subset.}
-  \label{Fig:chaotic_predictions}
-\end{figure}
  
-\begin{figure}
-  \centering
-  \includegraphics[scale=0.5]{images/non-chaotic_trace2} 
-  \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
-  \label{Fig:non-chaotic_predictions}
+
+Les résultats concernant le second codage  (\textit{i.e.},  avec les codes
+de   Gray) sont synthétisés dans le tableau~\ref{tab2}. On constate 
+que le réseau apprend cinq fois mieux les comportement non chaotiques
+que ceux qui le sont. Ceci est est illustré au travers des 
+figures~\ref{Fig:chaotic_predictions} et~\ref{Fig:non-chaotic_predictions}.
+De plus, comme dans le codage précédent, les stratégies ne peuvent pas être 
+prédites.  
+On constate que ce second codage réduit certe le nombre de sorties, mais est 
+largement moins performant que le premier.
+On peut expliquer ceci par le fait
+que ce second codage garantit que deux entiers successifs correspondent 
+à deux configurations voisines, \textit{ie.e}, qui ne diffèrent que d'un 
+élément.
+La réciproque n'est cependant pas établie et deux configurations voisines
+peuvent être traduitent par des entiers très éloignés et ainsi difficils 
+àapprendre. 
+
+
+\begin{figure}[ht]
+  \begin{center}
+    \subfigure[Fonction chaotique $g$]{
+      \begin{minipage}{0.48\textwidth}
+        \begin{center}
+          \includegraphics[scale=0.37]{images/chaotic_trace2}
+        \end{center}
+      \end{minipage}
+      \label{Fig:chaotic_predictions}
+    }
+    \subfigure[Fonction non-chaotique $f$]{
+      \begin{minipage}{0.48\textwidth}
+        \begin{center}
+          \includegraphics[scale=0.37]{images/non-chaotic_trace2} 
+        \end{center}
+      \end{minipage}
+      \label{Fig:non-chaotic_predictions}
+    }
+  \end{center}
+  \caption {Prédiction lorsque les configurations sont exprimées 
+à l'aide de codes de Gray.}
  \end{figure}
  
-Unfortunately, in  practical applications the number  of components is
-usually  unknown.   Hence, the  first  coding  scheme  cannot be  used
-systematically.   Therefore, we  provide  a refinement  of the  second
-scheme: each  output is learned  by a different  ANN. Table~\ref{tab3}
-presents the  results for  this approach.  In  any case,  whatever the
-considered feedforward  network topologies, the  maximum epoch number,
-and the kind of iterations, the configuration success rate is slightly
-improved.   Moreover, the  strategies predictions  rates  reach almost
-12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%.  Despite of
-this improvement,  a long term prediction of  chaotic iterations still
-appear to be an open issue.
-
-\begin{table}
-\caption{Prediction success rates for split outputs.}
-\label{tab3}
-\centering
-\begin{tabular}{|c||c|c|c|}
-\hline 
-\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\
-\hline
-\hline
-Epochs & 125 & 250 & 500 \\ 
-\hline
-\hline
-Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
-\hline
-10~neurons & 12.39\% & 14.06\% & 14.32\% \\
-25~neurons & 13.00\% & 14.28\% & 14.58\% \\
-40~neurons & 11.58\% & 13.47\% & 14.23\% \\
-\hline
-\hline
-Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
-\cline{2-4}
-%Epochs & 125 & 250 & 500 \\ 
-\hline
-10~neurons & 76.01\% & 74.04\% & 78.16\% \\
-25~neurons & 76.60\% & 72.13\% & 75.96\% \\
-40~neurons & 76.34\% & 75.63\% & 77.50\% \\
-\hline
-\hline
-Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
-\cline{2-4}
-%Epochs & 125 & 250 & 500 \\ 
-\hline
-10~neurons & 0.76\% & 0.97\% & 1.21\% \\
-25~neurons & 1.09\% & 0.73\% & 1.79\% \\
-40~neurons & 0.90\% & 1.02\% & 2.15\% \\
-\hline
-\multicolumn{4}{c}{} \\
-\hline
-Epochs & 1000 & 2500 & 5000 \\ 
-\hline
-\hline
-Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
-\hline
-10~neurons & 14.51\% & 15.22\% & 15.22\% \\
-25~neurons & 16.95\% & 17.57\% & 18.46\% \\
-40~neurons & 17.73\% & 20.75\% & 22.62\% \\
-\hline
-\hline
-Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
-\cline{2-4}
-%Epochs & 1000 & 2500 & 5000 \\ 
-\hline
-10~neurons & 78.98\% & 80.02\% & 79.97\% \\
-25~neurons & 79.19\% & 81.59\% & 81.53\% \\
-40~neurons & 79.64\% & 81.37\% & 81.37\% \\
-\hline
-\hline
-Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
-\cline{2-4}
-%Epochs & 1000 & 2500 & 5000 \\ 
-\hline
-10~neurons & 3.47\% & 9.98\% & 11.66\% \\
-25~neurons & 3.92\% & 8.63\% & 10.09\% \\
-40~neurons & 3.29\% & 7.19\% & 7.18\% \\
-\hline
-\end{tabular}
-\end{table}
  
  \section{Conclusion}
-
-In  this paper,  we have  established an  equivalence  between chaotic
-iterations,  according to  the Devaney's  definition of  chaos,  and a
-class  of multilayer  perceptron  neural networks.   Firstly, we  have
-described how to build a neural network that can be trained to learn a
-given chaotic map function. Secondly,  we found a condition that allow
-to check whether  the iterations induced by a  function are chaotic or
-not, and thus  if a chaotic map is obtained.  Thanks to this condition
-our  approach is not  limited to  a particular  function. In  the dual
-case, we show that checking if a neural network is chaotic consists in
-verifying  a property  on an  associated  graph, called  the graph  of
-iterations.   These results  are valid  for recurrent  neural networks
-with a  particular architecture.  However,  we believe that  a similar
-work can be done for  other neural network architectures.  Finally, we
-have  discovered at  least one  family of  problems with  a reasonable
-size, such  that artificial neural  networks should not be  applied in
-the  presence  of chaos,  due  to  their  inability to  learn  chaotic
-behaviors in this  context.  Such a consideration is  not reduced to a
-theoretical detail:  this family of discrete  iterations is concretely
-implemented  in a  new steganographic  method  \cite{guyeux10ter}.  As
-steganographic   detectors  embed  tools   like  neural   networks  to
-distinguish between  original and stego contents, our  studies tend to
-prove that such  detectors might be unable to  tackle with chaos-based
-information  hiding  schemes.
-
-In  future  work we  intend  to  enlarge  the comparison  between  the
-learning   of  truly   chaotic  and   non-chaotic   behaviors.   Other
-computational intelligence tools such  as support vector machines will
-be investigated  too, to  discover which tools  are the  most relevant
-when facing a truly chaotic phenomenon.  A comparison between learning
-rate  success  and  prediction  quality will  be  realized.   Concrete
-consequences in biology, physics, and computer science security fields
-will then be stated.
-
+Dans ce chapitre, nous avons établi une simlilitude entre les itérations 
+chaotiques et une famille  de perceptrons multicouches.
+Nous avons d'abord montré comment  construire un réseau de neurones 
+ayant un comportement chaotique.
+Nous avons présenté ensuite comment vérifier si un réseau de neurones 
+établi était chaotique.
+Nous avons enfin montré en pratique qu'il est difficile pour un 
+réseau de neurones d'apprendre le comportement global d'itérations
+chaotiques.
  % \appendix{}
  
  % \begin{Def} \label{def2}